
Thursday, December 15, 2016

New online ecology of adversarial aggregates: ISIS and beyond.

Using daily granularity, Johnson et al. collect large-scale longitudinal records of the online activity of pro-ISIS aggregates (pages) on VKontakte, a social network based in Russia. First, they propose an approach that predicts the onset time of real-world events by exploiting the proliferation of aggregates ahead of such events. Second, they build a fragmentation-coalescence model to capture the shark-fin pattern that aggregate sizes show over time. Finally, they analyze the adaptations that aggregates make to escape anti-ISIS entities. This work helps to forecast when conditions become favorable for future attacks, and it needs only digital data collected from social media rather than records of real-world events.
Because social networks are fast and global, extremist organizations like ISIS have started to use them as platforms to spread information and recruit followers. Prior studies attempt to reveal the relationship between general online buzz (such as mentions of ISIS or protests) and real-world events. Unfortunately, such individual behaviors are insufficient to predict sudden attacks or to clarify any long-term buildup ahead of those events. Two issues therefore remain unresolved: how support for extremist organizations evolves in social networks, and how to forecast a sudden attack.
They choose VKontakte, a Russia-based analog of Facebook, to collect longitudinal records of pro-ISIS activities and narratives. Instead of collecting casual online buzz, they pay close attention to self-organized pro-ISIS aggregates, which are analogous to Facebook pages. Those aggregates are under heavy pressure from predatory entities such as police cybergroups and individual hackers.
They manually build a list of pro-ISIS aggregates. Each day, they update the list by identifying all relevant narratives through common hashtags in multiple languages and then tracing those narratives back to the aggregates behind them. Only aggregates with a strong allegiance to ISIS are included in the list. The inclusion criterion is that the group explicitly expresses its support for ISIS, and the judgment is made by experts. Once aggregates supporting ISIS are found, an additional search through their followers and through the aggregates they link to is performed on the same day. This iterative process terminates when the search leads only to aggregates that are already on the list. In addition to maintaining the list, they develop web-scraping software to collect further information, such as followers, posts and comments in aggregates, and a Boolean variable indicating whether each aggregate is alive or has been shut down.
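The iterative search described above can be sketched as a breadth-first expansion that stops once no new aggregates turn up. This is only an illustration, not the authors' software: `linked_aggregates` (returns the aggregates reachable from a given one via followers and links) and `is_included` (stands in for the expert judgment on explicit ISIS support) are hypothetical callables.

```python
from collections import deque

def expand_aggregate_list(seeds, linked_aggregates, is_included):
    """Grow the aggregate list until the search only returns known aggregates.

    seeds: aggregates found via hashtag search (hypothetical input)
    linked_aggregates: hypothetical callable, aggregate -> iterable of neighbours
    is_included: hypothetical callable standing in for the expert judgment
    """
    found = set()
    queue = deque(s for s in seeds if is_included(s))
    while queue:
        aggregate = queue.popleft()
        if aggregate in found:
            continue  # already on the list, nothing new to explore
        found.add(aggregate)
        for neighbour in linked_aggregates(aggregate):
            if neighbour not in found and is_included(neighbour):
                queue.append(neighbour)
    return found
```

For example, with a toy link table `{"a": ["b", "c"], "b": ["a", "d"], "c": [], "d": ["a"]}` and a judgment that rejects `"c"`, starting from `"a"` yields the set `a`, `b`, `d`.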
Methods & Results:
Proliferation of aggregates before real-world events.
They fit the online creation of aggregates with the well-known Moore’s Law of development, $T_n = T_1 n^{-b}$, in which $T_n$ is the time interval between the appearance of the $n$-th and $(n+1)$-th aggregates, $T_1$ is the time interval between the first two, and $b$ is the escalation parameter. They find that $b$ is positive and diverges over time ($b$ is a function of time $t$), indicating an escalating online proliferation of aggregates. Next, they use the inverse algebraic form $b \propto (T_c - t)^{-1}$ to estimate the divergence of $b$, with $T_c$ matching the actual onset of attacks. Fits to real data suggest that the divergence of the escalation parameter for aggregate proliferation coincides with the real-world onset at time $T_c$. By contrast, neither time-series analysis of online buzz nor prior on-street events provides any long-range predictive power. The method cannot predict attacks conducted by a few individuals.
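As a rough illustration of how an escalation parameter might be estimated, the sketch below assumes the development curve has the power-law form T_n = T_1 * n^(-b) and fits it by ordinary least squares in log-log space. The function name and the fitting choice are assumptions for illustration; the paper's own fitting procedure may differ.

```python
import math

def estimate_escalation(intervals):
    """Fit log(T_n) = log(T_1) - b * log(n) by least squares.

    intervals: positive time gaps between successive aggregate creations,
               intervals[0] being the gap between the first two aggregates.
    Returns (b, T_1) under the assumed power-law form.
    """
    xs = [math.log(n) for n in range(1, len(intervals) + 1)]
    ys = [math.log(t) for t in intervals]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)
```

Re-running such a fit on a sliding window of recent intervals would give b as a function of time, which is the quantity whose divergence is tracked.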
Coalescence-fragmentation model for online aggregate ecology.
The sizes of aggregates exhibit distinctive shark-fin shapes: expansion followed by an abrupt drop (due to the shutdown of aggregates). They take two factors, coalescence and fragmentation, into account and construct a model that captures this system-level shark-fin feature. Fragmentation is the abrupt breakup of an aggregate caused by predators. Coalescence means that individual followers sporadically link into existing aggregates (growth of one aggregate) and that existing aggregates sporadically link to each other (combination of two aggregates). Mathematical analysis reveals that the aggregate size $s$ in steady state follows a power-law distribution $s^{-\alpha}$, with a model exponent $\alpha$ similar to the empirical value for the ISIS data.
They also show that predatory agencies can thwart the development of large aggregates by breaking up smaller ones. They modify the model by introducing a time delay that represents the time needed for an aggregate to be noticed, analyzed and finally shut down. Simulation results show that breaking up smaller aggregates is more efficient than hunting large ones.
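A toy simulation can make the coalescence-fragmentation idea concrete. The sketch below is a generic model of this family, not the authors' exact model: at each step an aggregate is picked with probability proportional to its size, and it either fragments into single followers (with an assumed probability `frag_prob`) or merges with a second size-weighted pick. All parameter names and default values are illustrative assumptions.

```python
import random

def simulate_aggregates(n_agents=200, steps=5000, frag_prob=0.05, seed=0):
    """Return the list of aggregate sizes after a number of update steps."""
    rng = random.Random(seed)
    sizes = [1] * n_agents  # every follower starts as an aggregate of size 1
    for _ in range(steps):
        # Larger aggregates are more likely to be picked.
        i = rng.choices(range(len(sizes)), weights=sizes)[0]
        if rng.random() < frag_prob:
            # Fragmentation: the aggregate is shut down, members scatter.
            members = sizes.pop(i)
            sizes.extend([1] * members)
        elif len(sizes) > 1:
            # Coalescence: merge with a second size-weighted pick.
            j = rng.choices(range(len(sizes)), weights=sizes)[0]
            if j != i:
                merged = sizes[i] + sizes[j]
                for k in sorted((i, j), reverse=True):
                    sizes.pop(k)  # remove the higher index first
                sizes.append(merged)
    return sizes
```

Tracking the size of a single aggregate over the steps would show slow growth punctuated by sudden collapse, and a histogram of the returned sizes over many runs can be compared against a power law.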
Evolutionary adaptations.

They find that aggregates show various evolutionary adaptations to protect themselves from predatory entities, such as changing names, becoming invisible and reincarnating.

Sequential Ensemble Learning for Outlier Detection: A Bias-Variance Perspective


In this paper, Rayana et al. develop a new sequential ensemble method for outlier detection called Cumulative Agreement Rates Ensemble (CARE). Ensemble methods have long been widely used in classification problems to gain the aggregated strength of many base models. Only recently has some work [1,2,3] explored ensemble methods for anomaly detection.
Inspired by Aggarwal and Sathe’s work [2], they treat outlier detection as a binary classification task with unknown labels, in which the inliers are the majority class and the outliers are the minority class. The detection error can be decomposed into bias and variance, but existing outlier ensembles follow a parallel framework that combines the outcomes of multiple base detectors and reduces only variance. They claim that their sequential ensemble reduces bias as well as variance.
Specifically, the main steps of the CARE outlier ensemble are as follows:
·      Firstly, they create multiple feature-bagging outlier detectors as base detectors of the ensemble. Two versions of CARE are constructed using (a) distance-based approach AveKNN and (b) density-based approach LOF.
·      In each iteration, there are two aggregation stages to reduce variance: (a) in the first stage, they combine all the results from base detectors by weighted aggregation instead of binary selection of base detectors. To obtain the weights, they first estimate errors through the unsupervised Agreement Rates [4], and then assign weights inversely proportional to the corresponding errors; (b) in the second stage, they aggregate the results of the current iteration with all previous iterations to compute outlierness scores.
·      Before going into the next iteration, they remove some outliers based on the outlierness scores obtained from the last iteration, in order to reduce bias. In particular, they draw a subsample from the original data using Filtered Variable Probability Sampling (FVPS) and use it as the input for the next iteration.
·      Steps 2-3 are repeated until the stopping condition is satisfied.
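The first aggregation stage in the steps above can be sketched as follows. This is a minimal illustration, not the CARE implementation: the per-detector error estimates are taken as given (the Agreement Rates estimator itself is not reproduced here), and the function names and the small `eps` guard are assumptions.

```python
def error_weights(estimated_errors, eps=1e-12):
    """Weights inversely proportional to each detector's estimated error."""
    inverse = [1.0 / (err + eps) for err in estimated_errors]
    total = sum(inverse)
    return [w / total for w in inverse]  # normalised to sum to 1

def weighted_scores(detector_scores, estimated_errors):
    """Combine base-detector scores into one outlierness score per point.

    detector_scores: one list of scores per base detector, all same length
    estimated_errors: one estimated error per base detector (assumed given)
    """
    weights = error_weights(estimated_errors)
    n_points = len(detector_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, detector_scores))
        for i in range(n_points)
    ]
```

With two detectors whose estimated errors are 0.1 and 0.3, the first receives roughly three times the weight of the second, so every detector still contributes instead of being selected or dropped outright.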

The main advantage of CARE is that its filtered sampling reduces both bias and variance more than any other procedure. Results show that CARE beats the AveKNN-based and LOF-based baselines across many datasets, yet it is harder for CARE to outperform all state-of-the-art ensembles on all datasets.

Social networks under stress

Romero et al. analyze how an organization’s social network changes its structure and communications in response to external shocks. The authors study the complete instant-message (IM) communication history among the employees of a hedge fund and between employees and outside contacts. Using this information, they define structural, affective and cognitive properties of the social network. They find that, when faced with external shocks in the form of extreme price changes, the network tends to turtle up instead of opening up, favoring strong ties, high clustering and communication among company insiders. They also show that network structure is a better predictor than stock prices of important behavioral patterns, including affective and cognitive communications, the local optimality of transactions, and the sudden execution of transactions in new stocks. This work reveals a strong relationship between network structure and collective behavior within a social network.
Much research has addressed static network structure, communities, influential nodes and spreading dynamics in stable networks. However, little work has focused on how a network responds to external shocks by changing its structure or the communicative properties among its nodes. Such shock-related questions are critical to understanding a networked system’s capability to deal with uncertainty, or even with extreme events.
1.     Network structure features. The subgraph G(s,d) is constructed from the internal IMs that mention stock s on day d. A larger subgraph G+(s,d), which contains G(s,d), is built from both internal and border IMs.
a.     Clustering coefficient
b.     Strength of ties
They make use of the historical communication records prior to day d. For each node x, they rank the nodes y that x contacted before day d in descending order. They use the fraction of edges in G(s,d) that connect the most frequently contacted nodes to evaluate whether G(s,d) favors strong ties or weak ties.
c.     Percentage of border edges
They define openness O(s,d) as the fraction of edges in G+(s,d) that are border edges.
Note that the measures can be normalized relative to comparable quantities in the data.
2.     Shock is defined as D(s,d) = [b(s,d)-a(s,d)]/a(s,d), where a(s,d) and b(s,d) are the opening and closing prices of stock s on day d. An x-shock means that a stock’s price change on day d is higher than x while its price change was lower than x on each of the previous three days.
3.     OLS regression. In the regression, they disaggregate the analysis on a stock-by-stock and an industry-by-industry basis, and they also include a fixed-effect variable for the day of the week.
4.     Linguistic Inquiry and Word Count (LIWC) dictionary. They employ LIWC to identify words in IMs that reflect affective and cognitive information of traders. Affect includes positive emotion, negative emotion, anxiety, anger and sadness; Cognition contains insight, causation, discrepancy, tentative, certainty, inhibition, inclusive, and exclusive.
5.     Prediction task. They use binary classifiers to predict whether (s,d) confirms a LIWC category, with network structure features and stock price changes as predictors. The pair (s,d) confirms a LIWC category C if the words from C are used at a higher rate on day d than the typical rate for stock s.
6.     A buy transaction is locally suboptimal if the price of a stock on that transaction is higher than the maximal price on the following day. A sell transaction is locally suboptimal if the price on that transaction is lower than the minimal price on the following day.
7.     Prediction task one. They predict whether a transaction t(s,d) is locally optimal using network features and price changes as predictors.

8.     Prediction task two. They predict new transactions of stocks that have not been traded for a given period of time prior to the transaction day.
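The shock and local-optimality definitions in the list above translate directly into a few small functions. This is a sketch under stated assumptions: the function names are made up for illustration, the price change is compared to x as a signed value exactly as worded above (the source does not say whether the magnitude is used), and "the following day" is represented by a plain list of that day's prices.

```python
def price_change(open_price, close_price):
    """D(s,d) = [b(s,d) - a(s,d)] / a(s,d), with a = open and b = close."""
    return (close_price - open_price) / open_price

def is_x_shock(changes, day, x, lookback=3):
    """True if the change on `day` exceeds x while each of the previous
    `lookback` days stayed below x. Days without enough history return False."""
    if day < lookback:
        return False
    previous = changes[day - lookback:day]
    return changes[day] > x and all(c < x for c in previous)

def buy_is_locally_suboptimal(price_paid, next_day_prices):
    """A buy is locally suboptimal if it cost more than the next day's maximum."""
    return price_paid > max(next_day_prices)

def sell_is_locally_suboptimal(price_received, next_day_prices):
    """A sell is locally suboptimal if it earned less than the next day's minimum."""
    return price_received < min(next_day_prices)
```

For instance, a stock that moves by 1%, -1% and 0.5% on three consecutive days and then jumps by 10% registers a 5%-shock on the fourth day.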

Thursday, December 8, 2016

A/B Testing at Scale


In this talk, Mr. Dmitriev gave an introduction to controlled experiments, shared four real experimental examples and discussed five challenges encountered at Microsoft.

An A/B test is the simplest controlled experiment. A set of users is randomly divided into two groups: a control group and a treatment group. Users in the control group continue to use the existing system, while users in the treatment group receive the new features. By analyzing and comparing the outcomes of the two groups, researchers decide whether there is a statistically significant difference between them.
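To make the comparison step concrete, here is a minimal sketch of one common way to test for a difference between the two groups: a two-sided two-proportion z-test on conversion counts. The talk did not specify which test is used at Microsoft, so the choice of test, the function name and the inputs are all assumptions.

```python
import math

def two_proportion_z_test(conv_control, n_control, conv_treatment, n_treatment):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_control = conv_control / n_control
    p_treatment = conv_treatment / n_treatment
    # Pooled rate under the null hypothesis of no difference.
    pooled = (conv_control + conv_treatment) / (n_control + n_treatment)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treatment))
    z = (p_treatment - p_control) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

With 100 conversions out of 1000 control users and 130 out of 1000 treatment users, the p-value falls below the conventional 0.05 threshold, so the difference would be called statistically significant at that level.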

A/B tests play a crucial role in today’s evolving product development process. Traditional software development is divided into separate steps: design, development, test and release. However, more and more attention is being paid to a customer-driven development cycle: build the system, run experiments, measure user behavior and learn from the feedback. This raises a question: why are A/B tests necessary? Mr. Dmitriev claimed that it is because people are poor at assessing the value of their own ideas. What we believe often turns out not to be true.

Mr. Dmitriev shared some interesting experiments run on Microsoft platforms such as Office, Windows 10, Bing, Skype, OneNote, MSN, Exchange, Visual Studio and internet. Based on the experiments at Microsoft, 1/3 of ideas were positive and statistically significant, 1/3 were flat, and the remaining 1/3 were negative and statistically significant.

Finally, Mr. Dmitriev talked about five challenges in A/B testing: trustworthiness, protecting users, the overall evaluation criterion (OEC), violations of classical assumptions, and how to analyze results. More interesting talks and publications in this domain can be found at the Experimentation Platform (ExP):

About the talk:
Speaker: Pavel Dmitriev, Principal Data Scientist, Microsoft
Date: Dec 8, 2016

Saturday, November 19, 2016

Automatically Extracting Topical Components for a Response-to-Text Writing Assessment


In this talk, Rahimi introduces her recent work on automatically extracting topical components from source materials. Such source-based extraction can replace the manual effort of experts and makes the automatic assessment process more convenient.

Rahimi starts from the application end. Many approaches to automatic essay scoring have been proposed, such as bag of words, semantic similarity, content vector analysis and cosine similarity. However, many of them do not take source materials into consideration. Rahimi points out that, unlike this prior work, their research relies heavily on source materials and lies in the domain of response-to-text writing assessment. Given source materials, how can students’ essays be evaluated automatically? Rahimi and her colleagues approach this problem by localizing pieces of evidence in students’ essays that match the source materials. Instead of having experts extract that evidence manually, they aim to offer an automatic way to find it.

To be specific, they use natural language processing techniques to automatically extract a comprehensive list of topics from the source materials. The list consists of topic words as well as specific expressions (N-grams) that students should include in their essays, together defined as “topic components”. Table 1 illustrates the topic components.

Table 1. Automatically extracted topic words and N-gram expressions for each topic. They are extracted by the proposed data-driven LDA-enabled model.
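Once topic components are available, locating evidence in an essay can be sketched as simple matching of topic words and N-grams against the essay text. The sketch below is only an illustration with exact token matching, not the scoring model from the talk; the function names and the example topics are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def find_evidence(essay, topic_components):
    """Return, per topic, the topic words and N-grams that occur in the essay.

    topic_components: dict mapping a topic name to a list of expressions
                      (single words or multi-word N-grams); hypothetical data.
    """
    tokens = tokenize(essay)
    hits = {}
    for topic, expressions in topic_components.items():
        matched = []
        for expression in expressions:
            target = tokenize(expression)
            n = len(target)
            # Slide a window of n tokens over the essay and compare.
            if n and any(tokens[i:i + n] == target
                         for i in range(len(tokens) - n + 1)):
                matched.append(expression)
        hits[topic] = matched
    return hits
```

For a hypothetical component list such as `{"health": ["malaria", "bed nets"], "school": ["school fees", "lunch"]}`, an essay mentioning bed nets, malaria and school fees would yield two matches for the first topic and one for the second.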

To evaluate the automatic extraction of topic components, Rahimi compares their method with manual results and with competing baselines. The results are shown in Table 2. The proposed method is very promising and outperforms all other models. However, compared with the manual upper bound, there is still considerable room for improvement.
Table 2. Performance of models using automatically extracted topical components, baseline models, and manual upper-bound. 
About the talk:

Talk URL:
Speaker: Zahra Rahimi
Date: Nov 18, 2016