Recommended split test process
In Sitecore Discover, a split test refers to an experiment where two or more variations of a widget are tested against each other. A structured process for split tests gives clarity, confidence, and insight.
To run split tests in Discover, we strongly recommend the following process:
- Starting with a test hypothesis
- Identifying variables
- Predicting an outcome
- Supporting the predicted outcome with a rationale
- Running the split test
The following sections detail each step of our recommended split test process.
Starting with a test hypothesis
Ideally, defining a test hypothesis is the first step of running any experiment. A strong hypothesis is as important as interpreting the statistical significance of your results.
To create a hypothesis, you can begin by answering the following questions:
- What do you expect to be different between variations when you test them? What is changing or variable?
- Based on your research, both quantitative and qualitative, what do you predict the outcome of your test will be?
- How do you expect to explain the outcome if it matches or does not match your prediction?
You can start with analytics to identify low-performing pages or conversion funnels; these can point you to the elements to build your hypothesis on. Be aware that the results you get from running the experiment might not prove your hypothesis.
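One lightweight way to keep the process consistent is to record each hypothesis, its variable, its predicted outcome, and its rationale in a single structure before the test starts. The following Python sketch is illustrative only; the field names are assumptions, not part of any Discover API.

```python
from dataclasses import dataclass

@dataclass
class SplitTestHypothesis:
    """Illustrative record of a split test hypothesis (not a Discover API)."""
    variable: str           # what changes between variations
    predicted_outcome: str  # the result you expect, tied to a KPI
    rationale: str          # research that supports the prediction

# Example: a hypothesis for a category page ranking test.
hypothesis = SplitTestHypothesis(
    variable="Sort newest catalog products to the top of the category page",
    predicted_outcome="Category page conversion rate increases",
    rationale="Qualitative research shows repeat visitors scan category "
              "pages for the latest products",
)
print(hypothesis)
```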
Identifying variables
For a widget variation, the variable in your hypothesis can be a context rule, a setting, or a merchandising rule. Modifying, adding, or removing the variable produces a different predicted outcome.
To help you analyze the results, try to isolate a single variable for a split test, or select a handful of variables for a multivariate test; the sketch after the following list shows one way to check that isolation. Remember that the more variables you include in a test, the more difficult it is to understand or explain why your hypothesis was or was not proven.
The following are some typical variables:
- Recommendation widget recipe.
- Pinned, boosted, buried, or blacklisted products in a merchandising rule.
- Product attribute weights that determine whether a product is in or out of the search results set for a given keyword.
- Search rank in product results for a given keyword.
- Content of banners, calls to action, visual media, messaging, and forms.
Predicting an outcome
The predicted outcome is the result you expect if your hypothesis holds. This can be an increase in home page recommendation conversions, clicks on a given keyword's search results, click-through rates, or another KPI or metric.
Base the predicted outcome on available, current performance data. Establish a baseline for the metric or metrics against which you will compare results, and note whether you expect the variable change to produce an incremental or a large-scale effect.
For example, suppose you test for higher conversions together with a lower bounce rate. Before the test, establish what amount of reduction or increase in bounce rate is acceptable given the predicted increase in conversion rate.
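Once you have a baseline and a predicted lift, you can also estimate how much traffic the test needs to detect the effect. The following sketch applies the standard two-proportion sample-size formula at the conventional 5% significance level and 80% power; the baseline and lift figures are made-up examples, not Discover data.

```python
import math

def sample_size_per_variation(p_baseline: float, p_expected: float,
                              z_alpha: float = 1.96,  # two-sided 5% significance
                              z_beta: float = 0.84) -> int:  # 80% power
    """Approximate visitors needed per variation to detect the lift."""
    variance = p_baseline * (1 - p_baseline) + p_expected * (1 - p_expected)
    n = (z_alpha + z_beta) ** 2 * variance / (p_expected - p_baseline) ** 2
    return math.ceil(n)

# Example: baseline conversion rate of 3%, predicted lift to 3.5%.
print(sample_size_per_variation(0.03, 0.035))  # -> 19718 visitors per variation
```

The smaller the lift you expect, the more traffic each variation needs, which is one reason to note up front whether the effect should be incremental or large-scale.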
Supporting the predicted outcome with a rationale
The rationale demonstrates that you can back your hypothesis with research. The validity of your hypothesis reflects what you know about your visitors from both qualitative and quantitative research.
Use numerical or intuition-driven insights to formulate the why behind a test. Solicit input from customers through interviews or other qualitative tools such as surveys, heat maps, and user testing. Spend time understanding how visitors interact with your website or application.
When you define the rationale of your hypothesis, note down what you can learn from running the test. What are the likely takeaways?
For example, say you want to test a new ranking setup for a given category page. Your rationale might be that moving the newest products in the catalog to the top of the list increases conversions because repeat visitors return to category pages looking for the latest products.
Running the split test
After defining a test hypothesis and before creating the split test, you must add widget variations that reflect the test scenarios, namely the baseline and changed cases, to the widget being tested.
A split test divides traffic between the test scenarios and collects performance data for each.
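Discover manages the traffic split for you, but it helps to understand the usual mechanics: a deterministic hash keeps each visitor in the same variation for the duration of the test. The following is a conceptual sketch, not a description of Discover's implementation.

```python
import hashlib

def assign_variation(visitor_id: str, variations: list[str],
                     test_id: str) -> str:
    """Deterministically bucket a visitor into one variation.

    Hashing the visitor and test IDs together keeps each visitor in
    the same bucket for the duration of the test.
    """
    digest = hashlib.sha256(f"{test_id}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variations)
    return variations[bucket]

print(assign_variation("visitor-42", ["baseline", "variation-b"], "recipe-test"))
```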
In Discover, when a test ends, the widget's default variation moves back to the Active state unless another variation is scheduled.
Discover does not analyze the results or declare a winner. You need to review the KPIs of each variation, and if you prefer a particular variation, set it as the default variation of the widget.
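Because Discover leaves the analysis to you, a common way to compare the conversion KPIs of two variations is a two-proportion z-test. The following sketch uses only the Python standard library; the traffic and conversion counts are made-up examples.

```python
import math

def two_proportion_z_test(conv_a: int, n_a: int,
                          conv_b: int, n_b: int) -> float:
    """Return the two-sided p-value for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Example: baseline converts 300/10,000 visitors; variation converts 360/10,000.
p_value = two_proportion_z_test(300, 10_000, 360, 10_000)
print(f"p-value: {p_value:.4f}")  # ~0.0176; below 0.05 suggests a real difference
```

A p-value below your chosen significance level (commonly 0.05) suggests the difference between variations is unlikely to be due to chance alone.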
For specific steps to create a split test, see Split test recommendation widget recipes.