Test metrics and calculations
When running any A/B/n test, it's important to ensure the test runs long enough to gather sufficient data and support accurate conclusions. To do this, the test must reach statistical significance and meet the minimum required sample size. Together, these factors provide confidence and enough evidence that the observed differences between the control and the variants are not due to chance.
Statistical significance
Statistical significance is the likelihood that the result of an A/B/n test is not due to chance; it determines whether the differences between variants reflect a real effect or just random fluctuations. Sitecore uses it to ensure that any improvements shown in an A/B/n test are attributable to the actual changes made in the variant. If a winning variant has not been declared, it might be because the test has not yet reached statistical significance.
The common standard of statistical significance is 95%. This means that you can be 95% confident that the results are accurate and did not occur by mere luck or chance. If you repeated the test 100 times under the same conditions, you would expect the same result 95 times.
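To make this concrete, significance for conversion rates is commonly assessed with a two-proportion z-test. The following is an illustrative sketch only; Sitecore does not document the exact test it runs internally, and the function name and example numbers are ours:

```python
from math import sqrt
from statistics import NormalDist

def is_significant(control_conv, control_visits,
                   variant_conv, variant_visits,
                   confidence=0.95):
    """Two-sided two-proportion z-test: is the variant's conversion
    rate significantly different from the control's?"""
    p1 = control_conv / control_visits
    p2 = variant_conv / variant_visits
    # Pooled conversion rate under the null hypothesis (no real difference)
    pooled = (control_conv + variant_conv) / (control_visits + variant_visits)
    se = sqrt(pooled * (1 - pooled) * (1 / control_visits + 1 / variant_visits))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value, p_value < (1 - confidence)

# Example: 2.0% vs 2.6% conversion over 21,000 visits each
p, significant = is_significant(420, 21000, 546, 21000)
```

With these example numbers the p-value falls well below 0.05, so the difference would count as significant at the 95% level. With far fewer visits, the same rates would not clear the bar, which is why sample size matters.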
Minimum sample size
A sufficient sample size enables reliable statistical conclusions from an A/B/n test. The test must reach this minimum sample size to achieve statistical significance.
Sitecore automatically calculates the minimum sample size required based on the test's goal. If this visit threshold is not met, you cannot be certain that the results are not due to random factors. You can adjust the minimum sample size by editing the following parameters:
- Base rate - the current conversion rate of the control variant. This rate is not known exactly before running the test; it's best practice to use baseline conversion rates from historical marketing campaigns or previous A/B/n tests as a guide. By default, this rate is set to 2%. Increasing the base rate decreases the required minimum sample size.
- Minimum detectable difference - the smallest amount of change, or lift, from the base rate that you want the test to detect. The default value is 20%, but you can adjust this to change the test's sensitivity. Increasing this value decreases the required minimum sample size, but it also makes the test less sensitive, so it will only detect larger differences.
- Confidence level - the amount of certainty required to confirm that differences between variants are statistically significant. A confidence level of 95% is the accepted standard for reaching statistical significance. Increasing the confidence level increases the required minimum sample size; only do this if you need more reliable test results and are prepared to wait longer for the test to become conclusive.
In the following image, Sitecore calculates a sample size of 21,110 visits per variant using the default parameter values.

Do not stop the test before reaching the required minimum sample size, because this will invalidate the test results.
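The three parameters above are the inputs to a standard sample-size formula for comparing two proportions. The sketch below uses that standard formula with an assumed two-sided test at 80% power; Sitecore does not publish its exact calculation, so treat this as an approximation, not the product's implementation:

```python
from math import ceil, sqrt
from statistics import NormalDist

def min_sample_size(base_rate=0.02, min_detectable_diff=0.20,
                    confidence=0.95, power=0.80):
    """Estimate visits needed per variant to detect a relative lift of
    `min_detectable_diff` over `base_rate`.

    Standard two-proportion formula; the two-sided test and 80% power
    are assumptions, since Sitecore does not document its formula.
    """
    p1 = base_rate
    p2 = base_rate * (1 + min_detectable_diff)  # e.g. 2% -> 2.4%
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

n = min_sample_size()  # defaults: 2% base rate, 20% lift, 95% confidence
```

With the default values this returns roughly 21,100 visits per variant, in the same ballpark as the 21,110 figure Sitecore reports; any small gap comes from rounding or a slightly different power assumption. Note how the divisor `(p2 - p1) ** 2` explains the parameter behavior described above: a larger base rate or detectable difference widens the gap between `p1` and `p2`, shrinking the required sample.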
Declaring a winner
Sitecore uses metrics to determine the winner based on the goal set for the A/B/n test. By defining this goal, Sitecore can automate the analysis and identify the best-performing variant.
For Sitecore to declare a winner, the test must meet the criteria for minimum sample size, detectable difference, and confidence level. If the test reaches the minimum sample size but fails to meet the other criteria, Sitecore will consider the test inconclusive, indicating no significant difference between the variants.
A winner is declared only when all three criteria are satisfied. When multiple variants are tested, the variant with the greatest uplift over the control is declared the winner.
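The selection step described above can be sketched as follows. This is an illustration of the logic only, not Sitecore's implementation; it assumes the sample-size and significance criteria have already been met, and the function name and rates are hypothetical:

```python
def pick_winner(control_rate, variant_rates, min_detectable_diff=0.20):
    """Pick the variant with the highest relative uplift over the
    control, provided its lift clears the minimum detectable
    difference. Returns (None, 0.0) if no variant qualifies,
    i.e. the test is inconclusive."""
    best_name, best_uplift = None, 0.0
    for name, rate in variant_rates.items():
        uplift = (rate - control_rate) / control_rate  # relative lift
        if uplift >= min_detectable_diff and uplift > best_uplift:
            best_name, best_uplift = name, uplift
    return best_name, best_uplift

# Control converts at 2.0%; two variants at 2.3% and 2.6%
winner, uplift = pick_winner(0.020, {"Variant B": 0.023, "Variant C": 0.026})
```

Here "Variant B" has a 15% uplift, below the 20% minimum detectable difference, so only "Variant C" (30% uplift) qualifies and is declared the winner.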