Sunday, November 6, 2016

Tackling Anomaly Mining Problems in the Wild with Networks and Beyond

Summary

In this talk, Professor Akoglu discusses anomaly mining, a critical research topic with significant applications across broad domains such as security, online commerce, finance, city computing, and medicine. She points out three major challenges in anomaly mining: (1) the Definition of an anomaly, (2) how to Detect anomalies, and (3) how to offer reasonable Descriptions of the detected anomalies, which she calls the three ‘D’s. Compared to anomaly detection, anomaly mining is therefore a more general research topic, since it also covers describing and explaining anomalies.

Prof. Akoglu’s research focuses on building new models and methods for anomaly mining in the real world, and on addressing issues that arise from the characteristics of the data: high speed, large scale, high dimensionality, sparsity, and interpretability. She presents two recent and representative works. The first is about finding anomalous neighborhoods in social networks. The paper, “Scalable Anomaly Ranking of Attributed Neighborhoods”, was published at SIAM SDM 2016. Specifically, given an attributed network, the authors propose a new quantity called “normality” to measure how normal or how anomalous a neighborhood is. It takes into account not only topological information but also the nodes’ attributes, in order to quantify internal consistency and external separability.
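To make the idea of internal consistency and external separability concrete, here is a minimal sketch. It is not the normality measure from the paper; it is a toy score I made up for illustration, and the names `toy_normality`, `attrs` and `neighborhood` are assumptions. It rewards edges inside a neighborhood whose endpoints have similar attributes and penalizes boundary edges whose endpoints are similar.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product, used here as a stand-in attribute similarity."""
    return sum(x * y for x, y in zip(a, b))


def toy_normality(
    edges: List[Edge],
    attrs: Dict[str, List[float]],
    neighborhood: Set[str],
) -> float:
    """Toy score (an assumption, not the paper's formula).

    internal: attribute similarity summed over edges inside the neighborhood
    boundary: attribute similarity summed over edges crossing its border
    A high score means a consistent inside that is well separated outside.
    """
    internal = 0.0
    boundary = 0.0
    for u, v in edges:
        sim = dot(attrs[u], attrs[v])
        u_in, v_in = u in neighborhood, v in neighborhood
        if u_in and v_in:
            internal += sim
        elif u_in or v_in:
            boundary += sim
    return internal - boundary


# Hypothetical path graph a-b-c-d; only d has a different attribute vector.
edges = [("a", "b"), ("b", "c"), ("c", "d")]
attrs = {"a": [1, 0], "b": [1, 0], "c": [1, 0], "d": [0, 1]}

score_ab = toy_normality(edges, attrs, {"a", "b"})
score_abc = toy_normality(edges, attrs, {"a", "b", "c"})
```

In this toy example, the neighborhood {a, b, c} scores higher than {a, b}, because cutting the graph between b and c separates two nodes that look alike.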

The second work is about spotting suspicious host-level activity from system logs. The paper, entitled “Fast Memory-efficient Anomaly Detection in Streaming Heterogeneous Graphs”, won the best research paper runner-up award at ACM SIGKDD 2016. The problem is: given a stream of heterogeneous graphs with different types of nodes and edges, how can anomalous graphs be spotted in a fast, online, and memory-efficient way? The authors propose a clustering-based algorithm called “StreamSpot”, which introduces a similarity function for heterogeneous graphs that compares the relative frequencies of their local substructures, represented as short strings. StreamSpot achieves higher than 95% accuracy with a small delay. Drawing on these two examples, she concludes that anomaly mining tasks are application-dependent: different applications pose different challenges and call for different methods.
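The similarity idea can be sketched roughly as follows. This is a simplified illustration, not the StreamSpot implementation: I assume each node is turned into a short string made of its own type plus the types of its outgoing edges and their destination nodes (one hop only), and that two graphs are compared by the cosine similarity of their string counts. The function names and the edge tuple layout are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, List, Tuple

# (source id, source type, edge type, destination id, destination type)
TypedEdge = Tuple[Hashable, str, str, Hashable, str]


def shingle_counts(edges: List[TypedEdge]) -> Counter:
    """Count one-hop substructure strings, one string per node."""
    node_type = {}
    outgoing = defaultdict(list)
    for src, src_type, edge_type, dst, dst_type in edges:
        node_type[src] = src_type
        node_type[dst] = dst_type
        outgoing[src].append(edge_type + dst_type)
    # Sorting makes the string independent of the order edges arrive in.
    return Counter(
        node_type[n] + "".join(sorted(outgoing[n])) for n in node_type
    )


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    shared = a.keys() & b.keys()
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical logs: P = process, F = file, S = socket; r/w/c = read/write/connect.
g1 = [(1, "P", "r", 2, "F"), (1, "P", "w", 3, "F")]
g2 = [(7, "P", "w", 9, "F"), (7, "P", "r", 8, "F")]  # same shape, other ids
g3 = [(1, "P", "c", 2, "S")]

sim_same = cosine_similarity(shingle_counts(g1), shingle_counts(g2))
sim_diff = cosine_similarity(shingle_counts(g1), shingle_counts(g3))
```

Here g1 and g2 have the same structure and so get a similarity of about 1, while g3 shares no substructure string with g1 and gets 0. A clustering step could then flag graphs that are far from every cluster, which is the part this sketch leaves out.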

Other information:

Speaker: Leman Akoglu

Bio: Leman Akoglu is an assistant professor of Information Systems at the Heinz College of Carnegie Mellon University. From 2012 to 2016, she was an assistant professor at Stony Brook University, prior to which she received her Ph.D. from the Computer Science Department at Carnegie Mellon University. At Heinz, she directs the Data Analytics Techniques Algorithms (DATA) Lab. Her research interests are in data mining and machine learning, with a focus on algorithmic problems arising in graph mining, pattern discovery, social and information networks, and especially anomaly mining: outlier, fraud, and event detection. More details can be found at her homepage.
