
A/B testing: In statistics we trust

Although the mere mention of the word ‘statistics’ might make your skin crawl, statistics are an essential tool of A/B testing. We created a simple cheat sheet to help you navigate the basic statistical terminology behind A/B testing.

1 – Populations and samples

In statistics, we want to find results that apply to an entire population of people or things. However, in most cases, it is impossible to collect data from the entire population. This is why we collect data from a small subset of the population, called a sample.
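To make this concrete, here is a minimal Python sketch. The population of 100,000 customers and the sample of 500 are made-up numbers purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population: number of lamps bought by 100,000 customers.
population = rng.poisson(lam=2.0, size=100_000)

# In practice we can usually only observe a small sample of that population.
sample = rng.choice(population, size=500, replace=False)

print(f"Population mean: {population.mean():.3f}")
print(f"Sample mean:     {sample.mean():.3f}")
```

The sample mean will be close to, but rarely exactly equal to, the population mean, which is exactly why the concepts below matter.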

2 – Mean

The mean is the simple mathematical average of a set of numbers. If you, for example, sell lamps and your customers’ purchases are as follows:

  • Customer A: purchased 2 lamps
  • Customer B: 2 lamps
  • Customer C: 3 lamps
  • Customer D: 2 lamps

You would calculate the mean by adding the values (lamps purchased) and dividing that sum by the number of values measured (number of customers): (2+2+3+2)/4 = 2.25.
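As a quick sanity check, the same calculation in Python, using the four purchase values from the example:

```python
# Lamps purchased by customers A, B, C and D.
lamps_purchased = [2, 2, 3, 2]

# Mean = sum of the values divided by the number of values.
mean = sum(lamps_purchased) / len(lamps_purchased)
print(mean)  # 2.25
```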

3 – Standard deviation vs. standard error

These two metrics are very often mixed up. In short, standard deviation is about your data and standard error is about your sample.

The standard deviation is a measure of how well your mean represents your data. It indicates how much members differ from the mean value of the group. Conversely, the standard error tells you how well your sample represents the total population.
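Here is a minimal sketch of both metrics, reusing the lamp purchases from above (ddof=1 gives the sample standard deviation, the usual choice when working with a sample):

```python
import numpy as np

lamps_purchased = np.array([2, 2, 3, 2])

# Standard deviation: how much individual values differ from the mean.
std_dev = lamps_purchased.std(ddof=1)

# Standard error: how well the sample mean estimates the population mean.
std_err = std_dev / np.sqrt(len(lamps_purchased))

print(f"Standard deviation: {std_dev:.3f}")
print(f"Standard error:     {std_err:.3f}")
```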


4 – Confidence intervals

Confidence intervals are boundaries within which we believe the true value of the mean will fall. Their purpose is to establish a range that contains the population mean with a certain probability. Usually, we look at 95% confidence intervals and sometimes 99%. A 95% confidence level means that if we repeated the sampling many times, roughly 95% of the resulting intervals would contain the true population mean.
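If you want to compute such an interval yourself, here is a small sketch using SciPy and the lamp purchases from above; because the sample is tiny, it relies on the t-distribution:

```python
import numpy as np
from scipy import stats

lamps_purchased = np.array([2, 2, 3, 2])

mean = lamps_purchased.mean()
sem = stats.sem(lamps_purchased)  # standard error of the mean

# 95% confidence interval for the mean, based on the t-distribution.
ci_low, ci_high = stats.t.interval(0.95, df=len(lamps_purchased) - 1, loc=mean, scale=sem)
print(f"95% CI for the mean: [{ci_low:.2f}, {ci_high:.2f}]")
```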


5 – Null Hypothesis vs. experimental hypothesis

An experimental hypothesis is the prediction that your experimental manipulation will have some effect or that certain variables will relate to each other. In contrast, the null hypothesis is the prediction that you are wrong and that the predicted effect doesn’t exist. In other words, the experimental hypothesis is something a researcher tries to prove whereas the null hypothesis is something the researcher tries to disprove.


6 – Type I and Type II errors

A Type I error, also called a false positive, occurs when we believe that there is an effect in our population when in fact there isn't. The opposite is a Type II error, a false negative, which occurs when we believe that there is no effect in the population when in reality there is.

7 – p-value

The p-value is the probability of observing an outcome at least as extreme as the one observed in the test, assuming that the null hypothesis is true. The smaller the p-value, the more confident we are that we should reject the null hypothesis.
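As an illustration, here is a minimal sketch of a two-sided two-proportion z-test; the conversion counts and visitor numbers are invented for the example:

```python
import numpy as np
from scipy import stats

# Hypothetical A/B test results: conversions and visitors per variant.
conversions = np.array([200, 240])
visitors = np.array([5_000, 5_000])

# Pooled two-proportion z-test (null hypothesis: both variants convert equally well).
rates = conversions / visitors
pooled = conversions.sum() / visitors.sum()
se = np.sqrt(pooled * (1 - pooled) * (1 / visitors[0] + 1 / visitors[1]))
z = (rates[1] - rates[0]) / se

# Two-sided p-value: probability of a result at least this extreme under the null.
p_value = 2 * stats.norm.sf(abs(z))
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```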

8 – Statistical Significance

The result is statistically significant when the p-value is smaller than the significance level. The significance level (𝛂) is the probability of rejecting the null hypothesis when it is actually true. In other words, it is the probability of wrongly rejecting the null hypothesis. For example, a significance level of 0.05 indicates a 5% risk of drawing the wrong conclusion: rejecting the null hypothesis when, in reality, it is true.
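One way to see the link between the significance level and Type I errors (section 6) is to simulate A/A tests, where both "variants" come from the same population, so every significant result is a false positive. A rough sketch, with made-up traffic numbers:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_tests, n_users, true_rate = 1_000, 5_000, 0.04

false_positives = 0
for _ in range(n_tests):
    # A/A test: both groups are drawn from the same population,
    # so any significant result is a Type I error (false positive).
    a = rng.binomial(1, true_rate, n_users)
    b = rng.binomial(1, true_rate, n_users)
    _, p_value = stats.ttest_ind(a, b)
    if p_value < alpha:
        false_positives += 1

# Roughly 5% of the tests come out "significant" even though there is no effect.
print(f"False positive rate: {false_positives / n_tests:.3f}")
```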

9 – Statistical Power

Statistical power is the probability of observing a statistically significant effect if there is indeed one. In other words, it is the probability that your test will detect a difference between variations, provided that difference actually exists.
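Power goes hand in hand with sample size. Below is a minimal sketch of the standard sample-size formula for comparing two proportions; the baseline conversion rate, the smallest uplift we care about, and the targets for alpha and power are all assumptions chosen just for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical inputs for an A/B test on conversion rate.
p1, p2 = 0.040, 0.045        # baseline vs. smallest uplift worth detecting
alpha, power = 0.05, 0.80    # significance level and desired power

# Standard sample-size formula for comparing two proportions.
z_alpha = stats.norm.ppf(1 - alpha / 2)
z_power = stats.norm.ppf(power)
variance = p1 * (1 - p1) + p2 * (1 - p2)
n_per_group = (z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2

print(f"Visitors needed per variant: {int(np.ceil(n_per_group)):,}")
```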

We hope this A/B testing blog will help you along! Part 1 of our A/B testing blog can be found here.

Feel like reading more data stories? Then take a look at our blog page. Always want to stay up to date? Be sure to follow us on LinkedIn!



Sophie Caro

“Data is one of the most valuable assets you can have. I love contributing to companies’ growth and development by helping them turn their data into business insights.”
