Thursday, December 8, 2016

A/B Testing at Scale


In this talk, Mr. Dmitriev gave an introduction to controlled experiments, shared four real experimental examples with us and discussed five challenges his team has encountered at Microsoft.

An A/B test is the simplest form of controlled experiment. A set of users is randomly divided into two groups: a control group and a treatment group. Users in the control group continue to use the existing system, while users in the treatment group receive the new feature. By comparing the outcomes of the two groups, researchers determine whether there is a statistically significant difference between them.
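The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used at Microsoft: the function names `split_users` and `two_proportion_z_test` are hypothetical, the metric is assumed to be a simple conversion rate, and a pooled two-proportion z-test is assumed as the significance test.

```python
import math
import random


def split_users(user_ids, seed=0):
    """Randomly divide users into a control group and a treatment group."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def two_proportion_z_test(conv_c, n_c, conv_t, n_t):
    """Compare conversion rates of control (c) and treatment (t).

    Returns (z, p_value) for a two-sided pooled z-test.
    """
    p_c = conv_c / n_c
    p_t = conv_t / n_t
    pooled = (conv_c + conv_t) / (n_c + n_t)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (p_t - p_c) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


control, treatment = split_users(range(1000))
# Hypothetical counts: 100 of 1000 control users convert, 150 of 1000 treatment users.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With a conventional threshold of 0.05 (also an assumption here), a p-value below the threshold would be reported as a statistically significant difference between the two groups.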

A/B tests play a crucial role in today’s evolving product development process. Traditional software development is divided into separate steps: design, development, testing and release. However, more and more attention is being paid to a customer-driven development cycle: build the system, run experiments, measure user behavior and learn from the feedback. This raises a question: why are A/B tests necessary? Mr. Dmitriev argued that it is because people are poor at assessing the value of their own ideas. What we expect to work often turns out not to.

Mr. Dmitriev shared with us some interesting experiments run on Microsoft products such as Office, Windows 10, Bing, Skype, OneNote, MSN, Exchange, Visual Studio and internet. Based on the experiments at Microsoft, 1/3 of ideas were positive and statistically significant, 1/3 were flat and the remaining 1/3 were negative and statistically significant.
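The three outcome categories mentioned above can be expressed as a small rule. This is a sketch under stated assumptions: `classify_outcome` is a hypothetical helper, a larger metric value is assumed to be better, and 0.05 is an assumed significance threshold rather than one quoted in the talk.

```python
def classify_outcome(delta: float, p_value: float, alpha: float = 0.05) -> str:
    """Label an experiment as 'positive', 'flat' or 'negative'.

    delta   -- treatment metric minus control metric (assumed: higher is better)
    p_value -- p-value of the test comparing the two groups
    alpha   -- significance threshold (an assumption, not from the talk)
    """
    if p_value >= alpha:
        return "flat"  # no statistically significant difference detected
    return "positive" if delta > 0 else "negative"


print(classify_outcome(0.02, 0.01))   # significant improvement
print(classify_outcome(0.02, 0.40))   # difference not significant
print(classify_outcome(-0.02, 0.01))  # significant regression
```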

Finally, Mr. Dmitriev talked about five challenging problems in A/B testing: trustworthiness, protecting users, the Overall Evaluation Criterion (OEC), violations of classical assumptions, and how to analyze results. More interesting talks and publications in this domain can be found in Experimental Platform (EXP):

About the talk:
Speaker: Pavel Dmitriev, Principal Data Scientist, Microsoft
Date: Dec 8, 2016

