
How to do SEO Testing
SEO split testing is very different from other forms of testing, such as engagement or conversion A/B tests. The difficult part of SEO testing is proving that a change has a positional benefit and that the result isn’t driven by demand. Not many SEOs talk about split testing, but this guide openly shares insights and gives you two ways to start SEO testing, so you can show the benefit to your manager and senior stakeholders.
SEO Test Measurement
Organic traffic needs to be the primary measurement metric, and position the secondary one. In this guide we will go through what to test, how to validate test results, and two ways to run tests: a bootstrap causal impact style inference method and a simpler before and after method that most SEOs use.
SEO testing involves experimenting with on-page and potentially off-page elements to understand the impact on rankings and traffic. This approach helps SEOs scale changes and get buy-in from senior stakeholders.
What do SEOs test?
Popular tests include:
- H1 & Meta Title changes (Format, freshness, calls to action or keyword semantics)
- Adding emojis into metadata
- New Features/Content structure changes (Tables, Lists, FAQs)
- Schema (FAQ, Trip, Product, Review etc.)
- Additional Content
- Changing HTML elements, usually other content headings (div -> h2/h3)
- Default Templates vs Custom Templates
- Internal Linking
SEO Testing Ideas
Testing Meta Titles
SEO Freshness
Adding freshness to meta titles has often tested very positively, both in previous experience and in results shared by other industry tools. An example would be adding 2025, New or Updated to your meta titles.
For example:
- Shop Midi Dresses | 2025 Trends | Brand Name
- Women’s Midi Dresses | NEW Midi Dress Styles | Brand Name
Keyword Semantics
Another common test that often sees really positive results is adding the top 3 queries by page from Google Search Console into the meta title. It’s important that these queries are quite different from each other. At times a page might rank for the same term in several variations; ignore the duplicates and look at higher-volume terms to fill any gaps.
For example:
You have a list of the top 6 terms for a page about London New York Flights. As you can see below, the first two are already covered by the main term/context of the page, so pick 3 of the others.
Let’s say your top 6 queries are:
- London New York Flights
- London New York
- London New York flight duration
- London New York flight price
- London New York flight times
- London New York cheap flights
Your title could be:
- London New York Flights | Duration, Prices & Times | Brand Name
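The selection step above can be sketched in a few lines of Python. This is only an illustration: the function name `build_title`, the crude plural handling and the title format are assumptions, not how any particular tool works, and the output still needs a human edit (for example "Price" to "Prices").

```python
def normalise(word):
    # Crude comparison key (assumption): lowercase and drop trailing "s"
    return word.lower().rstrip("s")


def build_title(main_term, queries, brand, max_modifiers=3):
    """Build a meta title from the main term plus modifiers from top queries.

    `queries` is assumed to be sorted by clicks, highest first.
    """
    covered = {normalise(w) for w in main_term.split()}
    modifiers = []
    for query in queries:
        # Keep only the words the main term does not already cover
        extra = [w for w in query.split() if normalise(w) not in covered]
        if not extra:
            continue  # query adds nothing new, e.g. "London New York"
        modifier = " ".join(extra).title()
        if modifier not in modifiers:
            modifiers.append(modifier)
        if len(modifiers) == max_modifiers:
            break
    if len(modifiers) > 1:
        middle = ", ".join(modifiers[:-1]) + " & " + modifiers[-1]
    else:
        middle = "".join(modifiers)
    parts = [main_term, middle, brand] if middle else [main_term, brand]
    return " | ".join(parts)


top_queries = [
    "London New York Flights",
    "London New York",
    "London New York flight duration",
    "London New York flight price",
    "London New York flight times",
    "London New York cheap flights",
]
print(build_title("London New York Flights", top_queries, "Brand Name"))
# London New York Flights | Duration, Price & Times | Brand Name
```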
Testing FAQ Content
Adding additional FAQ questions and answers is another great experiment. With the rise of AI search taking informational search traffic away, it’s a good way to encourage LLM citations and AI Overview visibility.
Simply analyse People Also Ask questions in Google Search, complete keyword research, check competitors and ask LLMs for related terms to gather new questions. Then create answers for them. Below is our People Also Ask and Rich Snippet Optimiser, which could help with this.

This can be done easily using our AI Content Generator, or our Template Content Builder if you have data for additional questions and would like to create default templates. Our toolset covers these tools; check out our full SEO tool and AI SEO-Content tool range.
HTML Element Changes
Changing HTML elements can have a huge impact on traffic if the structure for the changes already exists on your website, for example if you have lots of headings that are not all marked up as headings in the HTML. Headings give Google more context on page content, so adding additional headings or changing heading levels can be an interesting experiment.
For example, let’s look at Skyscanner’s FAQs on flight pages. They are currently h3 headings. Changing these to h2 headings would be a great test!

New Features/Structure
New features are a great on-page test as they can add to the user experience and enhance rich snippet optimisation. Features or structure changes could be anything from snippet answers and lists to tables.
- Summary Tables
- Pricing Tables
- Popular X Lists
- Character-restricted sentences/snippet-friendly answers
There are 2 ways to do SEO testing
- SEO Statistical Split Testing
- Before and After (a simpler version of the above without a significance/confidence factor). This is also used to validate statistical tests.
SEO Split Testing
You might have heard of industry SEO split testing tools like SearchPilot or SplitSignal, but these tools come with huge costs, anywhere between 15-100k a year. Our Free SEO Testing tool uses the same method and it’s FREE! All you need is access to your page and day level Google Search Console data. So how does it work?
SEO Test and Control with stratified sampling
Pick a group of similar pages/URLs for your test. Ensure the full group has at least 80-100k clicks over the last 100 days. If you have a smaller site or less traffic, just use what you have. Once you are happy with your URL group, split it into test and control groups using stratified sampling, so that the two groups are comparable.
Once you have your test and control groups, you can launch the changes on your test group. Tests should run for a minimum of 21 days, although around 30 days is a good amount of time.
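As a rough illustration of the split, here is a minimal sketch of stratified sampling in Python. It assumes the strata are traffic bands (pages ranked by clicks and cut into equal-sized groups), which is one common choice rather than a confirmed detail of any tool; the function name, `n_strata` and `seed` are hypothetical.

```python
import random


def stratified_split(page_clicks, n_strata=4, seed=42):
    """Split pages into test and control groups balanced by traffic level.

    page_clicks: dict mapping URL -> total clicks over the lookback window.
    """
    rng = random.Random(seed)
    # Rank pages by clicks so each stratum holds pages of similar size
    ranked = sorted(page_clicks, key=page_clicks.get, reverse=True)
    stratum_size = max(1, -(-len(ranked) // n_strata))  # ceiling division
    test, control = [], []
    for i, start in enumerate(range(0, len(ranked), stratum_size)):
        stratum = ranked[start:start + stratum_size]
        rng.shuffle(stratum)  # random assignment within the stratum
        half = len(stratum) // 2
        # Alternate which group receives the extra page from odd-sized strata
        if len(stratum) % 2 and i % 2 == 0:
            half += 1
        test.extend(stratum[:half])
        control.extend(stratum[half:])
    return test, control


# Hypothetical URL group with made-up click totals
pages = {f"/flights/route-{i}": 1000 * (i + 1) for i in range(20)}
test_group, control_group = stratified_split(pages)
```

Because pages are shuffled inside each traffic band before being divided, both groups end up with a similar mix of high and low traffic URLs, and fixing the seed makes the split reproducible.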
Analysis of Traffic: Bootstrap Causal Impact Inference and Z-Test
Once the test has finished, use a bootstrap causal impact inference analysis and a z-test to estimate the impact of the test changes.
The bootstrap causal impact model combines bootstrap resampling with causal inference to estimate the effect of a treatment (or intervention) on a specific outcome. In this case, it forecasts the expected run rate (what would have happened without the change) based on the control group/historical data and then compares this forecast to the actual traffic observed in the test group during the experiment. It then provides a confidence level along with absolute and relative uplifts. The z-test then formally checks whether the uplift in the test group was a statistically significant traffic change.
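To make the idea concrete, here is a simplified sketch in Python. It is not the model used by any specific tool. It assumes the forecast is simply the control group's daily clicks scaled by the pre-test ratio of test to control clicks, and it treats daily differences as independent, which ignores weekly seasonality and autocorrelation. The function name and inputs are hypothetical.

```python
import math
import random
import statistics


def estimate_uplift(test_pre, control_pre, test_post, control_post,
                    n_boot=2000, seed=0):
    """Estimate uplift from daily click lists for test and control groups."""
    # Forecast: scale control traffic by the pre-period test/control ratio
    ratio = sum(test_pre) / sum(control_pre)
    expected = [c * ratio for c in control_post]
    diffs = [actual - exp for actual, exp in zip(test_post, expected)]

    # Bootstrap: resample the daily differences with replacement
    rng = random.Random(seed)
    n = len(diffs)
    boot_means = sorted(
        statistics.fmean(rng.choices(diffs, k=n)) for _ in range(n_boot)
    )
    ci_low = boot_means[int(0.025 * n_boot)]
    ci_high = boot_means[int(0.975 * n_boot) - 1]

    # Z-test on the mean daily difference (null hypothesis: mean is zero)
    mean_diff = statistics.fmean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    z = mean_diff / std_err
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))

    return {
        "absolute_uplift": sum(diffs),
        "relative_uplift": sum(diffs) / sum(expected),
        "ci_mean_daily_uplift": (ci_low, ci_high),
        "z": z,
        "p_value": p_value,
    }


# Made-up daily clicks: 30 pre-test days and 28 test days
control_pre = [100] * 30
test_pre = [100] * 30
control_post = [100 + (i % 5) for i in range(28)]
test_post = [c + 10 + (i % 3) for i, c in enumerate(control_post)]
result = estimate_uplift(test_pre, control_pre, test_post, control_post)
```

If the bootstrap interval for the mean daily uplift sits entirely above zero and the p-value is below your chosen threshold (0.05 is a common convention), the uplift is unlikely to be noise under these assumptions.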
Below is an example from our Free SEO Testing tool.

The Before and After SEO Test Process
Validating Positive Tests
If you have a significantly positive test, you now need to validate it by finding where position improved and generated traffic. The point of validation is to confirm that it wasn’t demand driving the result.
To validate a test, pull query and page level Search Console data for the test period and the previous period. Also pull page level data for the same two periods.
Summarise the total traffic, impressions and position difference at page level. You are unlikely to see the positional uplift here, as page level data has a lot of noise from longer tail queries. However, Search Console extracts differ, and the page level extract will show you around 80% of total traffic in a sampled view, so it’s the best view of clicks and impressions.
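The page level summary can be sketched as below. The row format (dicts with `clicks`, `impressions` and `position`) is an assumption about how you have exported the data, and position is averaged weighted by impressions so that low-impression pages do not skew it.

```python
def summarise(rows):
    """Total clicks and impressions plus impression-weighted average position."""
    clicks = sum(r["clicks"] for r in rows)
    impressions = sum(r["impressions"] for r in rows)
    weighted = sum(r["position"] * r["impressions"] for r in rows)
    return {
        "clicks": clicks,
        "impressions": impressions,
        "position": weighted / impressions if impressions else None,
    }


def compare_periods(before_rows, after_rows):
    """Percentage change per metric between the previous and test periods."""
    before, after = summarise(before_rows), summarise(after_rows)
    report = {}
    for metric in before:
        b, a = before[metric], after[metric]
        change = (a - b) / b * 100 if b and a is not None else None
        report[metric] = {"before": b, "after": a, "pct_change": change}
    return report


# Made-up page level rows for the test group
previous_period = [
    {"page": "/a", "clicks": 100, "impressions": 1000, "position": 5.0},
    {"page": "/b", "clicks": 50, "impressions": 1000, "position": 7.0},
]
test_period = [
    {"page": "/a", "clicks": 120, "impressions": 1000, "position": 4.0},
    {"page": "/b", "clicks": 60, "impressions": 1000, "position": 6.0},
]
report = compare_periods(previous_period, test_period)
```

Note that a negative percentage change in position is an improvement, since a lower position number means a higher ranking.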
There is no straightforward process for validating position. It requires slicing the query and page level data in different ways to find the positional uplift, and the right cut depends on what the test is.
For example, for the top 3 query test you might pull the top 6 queries for each test group page, then aggregate/pivot them by query rank to look at positional uplift in each page’s 1st, 2nd and 3rd queries and so on.
For a long tail test like FAQ content, you might look at uplifts for types of queries such as time, price and duration, and even at the top 2 queries in case the positional uplift was driven by relevancy rather than long tail/query count.
More often than not the uplift is seen in the top 2-3 terms, unless it’s a very specific long tail test. You may also want to look at the change in the number of queries ranking in the top 6 positions per page. The important thing is that you find where most of the benefit came from. For example, you could say: we saw a 16% positional uplift in the top 3 queries, and these made up 75% of traffic for the test group.
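One way to build the top query view is sketched below. The field names are hypothetical, queries are ranked per page by clicks in the previous period, and positions are averaged without weighting, which is a simplification you may want to change.

```python
from collections import defaultdict


def uplift_by_query_rank(rows, top_n=6):
    """Aggregate positional uplift and click share by each page's query rank."""
    by_page = defaultdict(list)
    for row in rows:
        by_page[row["page"]].append(row)

    buckets = defaultdict(lambda: {"before": [], "after": [], "clicks": 0})
    for page_rows in by_page.values():
        # Rank each page's queries by clicks in the previous period
        ranked = sorted(page_rows, key=lambda r: r["clicks_before"], reverse=True)
        for rank, row in enumerate(ranked[:top_n], start=1):
            buckets[rank]["before"].append(row["position_before"])
            buckets[rank]["after"].append(row["position_after"])
            buckets[rank]["clicks"] += row["clicks_after"]

    total_clicks = sum(r["clicks_after"] for r in rows)
    summary = {}
    for rank in sorted(buckets):
        b = sum(buckets[rank]["before"]) / len(buckets[rank]["before"])
        a = sum(buckets[rank]["after"]) / len(buckets[rank]["after"])
        share = buckets[rank]["clicks"] / total_clicks * 100 if total_clicks else 0.0
        summary[rank] = {
            "position_before": b,
            "position_after": a,
            # Positive uplift means the average position number went down
            "uplift_pct": (b - a) / b * 100,
            "click_share_pct": share,
        }
    return summary


# Made-up query and page level rows for two test pages
rows = [
    {"page": "/a", "query": "q1", "clicks_before": 100, "clicks_after": 120,
     "position_before": 4.0, "position_after": 3.0},
    {"page": "/a", "query": "q2", "clicks_before": 50, "clicks_after": 55,
     "position_before": 6.0, "position_after": 6.0},
    {"page": "/b", "query": "q3", "clicks_before": 80, "clicks_after": 100,
     "position_before": 5.0, "position_after": 4.0},
    {"page": "/b", "query": "q4", "clicks_before": 40, "clicks_after": 25,
     "position_before": 8.0, "position_after": 8.0},
]
summary = uplift_by_query_rank(rows)
```

Reading the result by rank shows where the uplift came from and how much of the test group's traffic those queries represent, which is the kind of statement the example above makes.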
Once you have the method and process sorted, it’s all about curiosity! Test what you know and test what you don’t know. Not all best practices are critical. *cough* Title tag character count *cough*.