
Codecademy A/B Test

  • Writer: Austin De Witt
  • Jul 29, 2018
  • 3 min read

All files and associated code can be found here: My GitHub

Summary

This article covers an attempted salvage of an improperly run A/B test comparing two "Ready" premium feature ads.


While the new ad did have a higher CTR than the control ad, the two user groups were improperly randomized. Because of this, two attempts were made to salvage the test. First, a step-down multiple logistic regression was run to determine the statistical significance of the non-randomized variables. Second, the data was filtered so that both user groups were similar.


Situation

Codecademy would like to increase the click-through rate (CTR) of their "Ready" premium feature advertisement.


This project uses two data sets: the first contains information on users who saw the original ad, and the second contains information on users who saw the new version of the banner ad.


Below are the two ads. The original, on the left, is located on the lessons page; the new ad, on the right, is located on the signup page. The ads differ in page location, content, and call-to-action: the new ad contains additional information on length and price, and the call-to-action has been changed from "LEARN FASTER" to "TAKE YOUR FIRST STEP." Otherwise the two ads are very similar.

The Python code is a full exploratory and statistical analysis of the data. The data sets are both relatively clean; however, it is always advisable to check the extents and values of all the columns for yourself.
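As a minimal sketch, that initial sanity check might look like the following. The file names and column names here are assumptions for illustration, not necessarily those in the repo:

import pandas as pd

# Assumed file names for the two data sets
control = pd.read_csv("control.csv")
new_ad = pd.read_csv("new_ad.csv")

# Check columns, dtypes, missing values, and value ranges for both files
for name, df in [("control", control), ("new ad", new_ad)]:
    print(f"--- {name} ---")
    df.info()
    print(df.describe(include="all"))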


Data Exploration

A major issue that stands out when exploring the data is that the two populations were not properly randomized in the study design. An A/B test should randomize which users view each ad to eliminate confounding variables, so that the only difference between the groups is which ad the user saw.
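One way this shows up is in simple summary statistics: under proper randomization, the distribution of exercises completed should be roughly the same in both groups. A hedged sketch, again assuming the file names above and a 'num_exercises' column:

import pandas as pd

control = pd.read_csv("control.csv")   # assumed file name
new_ad = pd.read_csv("new_ad.csv")     # assumed file name

# Under proper randomization these summaries should look nearly identical;
# a large gap in means or ranges signals a confounded design
print(control["num_exercises"].describe())
print(new_ad["num_exercises"].describe())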

After making this finding, there are two options going forward:

1. Determine whether the number of exercises completed plays a significant role in CTR

2. Attempt to use a subset of the control group that matches the new-ad group


Logistic Regression: Impact of Exercises Completed

Ultimately, a step-down multiple logistic regression showed that the number of exercises completed impacts CTR.


The initial regression showed that the coefficient for 'num_exercises' was not statistically significant. Running a second regression without 'Group' made 'num_exercises' highly significant, suggesting collinearity between the two variables. This is a problem: it means we cannot separate the influence of user experience from the influence of the ad itself on the click-through rate.
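A minimal sketch of the step-down procedure with statsmodels, assuming the combined data has 'clicked', 'group', and 'num_exercises' columns (these names are placeholders):

import pandas as pd
import statsmodels.formula.api as smf

# Assumed file names; tag each data set with its group before combining
control = pd.read_csv("control.csv").assign(group=0)
new_ad = pd.read_csv("new_ad.csv").assign(group=1)
df = pd.concat([control, new_ad], ignore_index=True)

# Full model: both predictors in the logistic regression
full = smf.logit("clicked ~ group + num_exercises", data=df).fit()
print(full.summary())

# Step down: drop 'group' and refit. If 'num_exercises' jumps from
# insignificant to highly significant, the two predictors are collinear.
reduced = smf.logit("clicked ~ num_exercises", data=df).fit()
print(reduced.summary())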

Salvage Attempt Two: Filtering Control Users

A second option for salvaging the botched A/B test design is to filter the control group so that it contains only users similar to those in the new-advertisement group. However, this drops the number of observations from over 10k to under 400. Because the outcome is binary, either a 1 or a 0, a chi-squared test is used to compare the populations. With the drastically reduced number of observations, the p-value stays above the 0.05 threshold and we are unable to reject the null hypothesis.
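A sketch of this second attempt, where the matching rule (restricting the control group to the exercise range seen in the new-ad group) is an assumption about how the filtering was done:

import pandas as pd
from scipy.stats import chi2_contingency

control = pd.read_csv("control.csv")   # assumed file names
new_ad = pd.read_csv("new_ad.csv")

# Keep only control users whose exercise counts fall within the range
# observed in the new-ad group (assumed matching criterion)
lo, hi = new_ad["num_exercises"].min(), new_ad["num_exercises"].max()
matched = control[control["num_exercises"].between(lo, hi)]

# 2x2 contingency table: clicks vs. non-clicks for each group
table = [
    [matched["clicked"].sum(), (matched["clicked"] == 0).sum()],
    [new_ad["clicked"].sum(), (new_ad["clicked"] == 0).sum()],
]
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-squared p-value: {p:.4f}")  # above 0.05 here: cannot reject the null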


Conclusion

This project was an A/B test of click-through rates for two different banner ads on Codecademy's website. Because of the improper study design, attempts were made to salvage the project. Unfortunately, these attempts were unsuccessful.


Going forward, I would suggest one of two options:

1. Keep the current ad and, if funds and time allow, re-run the test with properly randomized user groups before repeating the statistical analysis.

2. Adopt the new ad with caution and continue to monitor results. There is SOME evidence it may have a higher CTR and no evidence that it underperforms.


The key takeaway is the importance of properly designing an A/B test, so that time and money are not wasted and conclusive statistical results can be drawn.
