In this talk, Professor Ding gives an inspiring overview of the present state and future of data science. She points out that, with the advancement of technology, the production of and access to large-scale data sets have fundamentally changed how people think and live. These changes also bring many valuable opportunities and challenges for data scientists.
Researchers from different fields tend to focus on problems at different levels. Methodologists from mathematics and physics tend to look at things at the macro level, trying to extract rules or build general models that explain complicated, dynamic, stochastic and messy data patterns. Other researchers concentrate on the micro level: in recommender systems, for example, they investigate each individual's behaviors and preferences. Data scientists, by contrast, prefer to tackle problems at the meso level, acting as intermediaries that connect the micro and macro levels.
Specifically, Prof. Ding presents recent work and results from her research group, mainly on bibliometrics (or the science of science), drug-protein target interactions, and innovation diffusion. Bibliometrics is the statistical analysis of publications such as papers, articles, books and news. For example, given a corpus of academic publications, the group can mine knowledge about collaboration, citation, and shifts in research topics over time. Her research group is also interested in predicting drug-protein interactions based on semantic network analysis. The semantic network integrates chemical, pharmacological, genomic, biological, functional, and biomedical information, so its nodes and edges are extremely heterogeneous. Because of this heterogeneity, they examine meta-path-based topological patterns to predict potential drug-protein links. Details can be found in their recent publication entitled “Predicting drug protein target interactions using meta-path based semantic network analysis”. In addition, Prof. Ding talks about their work on innovation diffusion. They mine a whole collection of publications from the past decade to examine how the LDA algorithm has diffused across diverse domains. They find that the early adopters of an innovation are responsible for its final scale of spread. This gives researchers good insights into how to better spread their own ideas and work.
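To make the meta-path idea mentioned above more concrete, here is a minimal sketch in Python. It is not the method from the cited publication: the toy graph, the node names, the relation names (`binds`, `similar_to`) and the helper functions are all hypothetical and chosen only for illustration. It shows the general technique of counting path instances that follow a fixed sequence of relation types in a heterogeneous network, where a higher count for a drug-protein pair can serve as one feature suggesting a potential link.

```python
from collections import defaultdict

# Toy heterogeneous graph; every node and relation here is hypothetical.
# Each edge is (source, relation, target).
EDGES = [
    ("drugA", "binds", "protein1"),
    ("drugB", "binds", "protein1"),
    ("drugB", "binds", "protein2"),
    ("drugC", "binds", "protein2"),
    ("drugA", "similar_to", "drugB"),
]


def build_index(edges):
    """Map (node, relation) to its set of neighbours.

    The reverse direction of each edge is indexed under "inv_" + relation,
    so a meta-path can also walk an edge backwards.
    """
    index = defaultdict(set)
    for src, rel, dst in edges:
        index[(src, rel)].add(dst)
        index[(dst, "inv_" + rel)].add(src)
    return index


def count_meta_path(index, start, relations):
    """Count path instances from `start` that follow `relations` in order.

    Returns a dict {end_node: number_of_path_instances}.
    """
    frontier = {start: 1}
    for rel in relations:
        next_frontier = defaultdict(int)
        for node, n_paths in frontier.items():
            for neighbour in index.get((node, rel), ()):
                next_frontier[neighbour] += n_paths
        frontier = next_frontier
    return dict(frontier)


index = build_index(EDGES)

# Meta-path: drug -similar_to-> drug -binds-> protein
via_similar_drug = count_meta_path(index, "drugA", ["similar_to", "binds"])

# Meta-path: drug -binds-> protein <-binds- drug -binds-> protein
via_shared_target = count_meta_path(index, "drugA", ["binds", "inv_binds", "binds"])

print(via_similar_drug)
print(via_shared_target)
```

In this toy graph, `protein2` is reachable from `drugA` along both meta-paths even though no direct `binds` edge exists between them, which is the kind of signal a link predictor could use. A real system would combine counts from many meta-paths and learn how much weight each one deserves.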
Details About Talk
Title: Data-Driven Science of Science
Speaker: Ying Ding
Bio: Dr. Ying Ding is an Associate Professor at the School of Informatics and Computing and is currently associate director of the data science online program at Indiana University. She has been involved in various NIH-, NSF- and European Union-funded projects. She has published 190+ papers in journals, conferences, and workshops, and has served as a program committee member for 180+ international conferences. Her current research interests include data-driven knowledge discovery, the Semantic Web, knowledge graphs, scientific collaboration, and the application of Web technology.