
Most promotion tests fail before they start

By Dan Bond · March 23, 2026 · 5 mins

Most promotion tests fail because key issues are not addressed before the test begins.

Not during it.

The core issue is the quick leap from noticing a metric drop to making recommendations.

Someone spots a conversion rate drop. Or lower engagement on a product page. Or more cart abandonment than usual.

And the recommendations start flowing:

  • Test 10% off vs 15% off
  • Try free shipping over £50
  • Add a countdown timer
  • Test a bundle offer
  • Show the discount earlier

Maybe one of those will solve the issue.

But in most cases, they won’t.

Because seeing a problem is not the same as understanding it.

50% to 80% of test results are inconclusive, depending on the vertical and stage of the testing program. That's not bad luck. That's skipping steps.

The missing piece is insight: understanding the root cause.

The insight comes first

Before you recommend any promotional test, you need to define:

  • What exactly is happening?
  • Where is it happening?
  • Who is it affecting?
  • Is it actually a valid problem?
  • Can a promotion realistically solve it?
  • Will fixing it create real commercial value?

If you skip this, you get weak hypotheses, wasted tests, and minimal results.

There are dozens of ways to tackle the same surface-level problem. But if you do not properly document the insight first, the test is usually just a shot in the dark.

An analysis of more than 28,000 A/B tests by PrepVector, an experimentation newsletter, found that only about 14–20% of experiments meet the 95% confidence threshold. Most tests do not deliver a clean uplift. They fall into neutral, contradictory, or inconclusive territory.

And when that happens repeatedly, stakeholders lose faith in promotional testing altogether.

This is not a new problem

30% to 40% of retail promotions are either inefficient or unprofitable, according to BCG. The same logic applies to promotion tests. Most of them shouldn't be running in the first place.

The reason?

Most start with a discount idea rather than the behaviour that needs to change.

“BCG found that personalized marketing can lead to a 5-8x return on investment for CPG companies when compared to blanket marketing campaigns.”

The Future of Commerce

If you start from the right insight, your test ROI increases.

Better insights drive better tests and better outcomes.

What does a properly defined insight look like?

Let's say you notice cart abandonment is higher than usual.

A weak insight would be: "Cart abandonment is up. Let's test a discount."

A strong insight would be:

"Mobile cart abandonment in the footwear category has increased 18% month-on-month, affecting 42% of mobile traffic. Exit-intent data shows that 63% of abandoners leave at the shipping cost reveal. Support tickets for 'unexpected delivery charges' are up 29% in the same period. If we address this, we're looking at an estimated £55k monthly revenue opportunity."

The strong version gives you:

  • What is happening (mobile cart abandonment up 18%)
  • Where (footwear category, at shipping reveal)
  • Who (mobile users, 42% of traffic)
  • Supporting data (exit intent patterns, support ticket trends)
  • Commercial impact (£55k/month)

Now you can build a proper test.

And more importantly, you can test the right thing.

Instead of "10% off vs 15% off", you might test "free shipping threshold vs expedited delivery discount" or "upfront shipping cost display vs post-basket reveal".

Those are fundamentally different tests. And they only become obvious once you define the insight properly.
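One way to make this discipline stick is to capture each insight as a structured record before any test idea is attached to it. Here is a minimal sketch in Python; the class and field names are illustrative, not part of any standard tool, and simply mirror the what/where/who/data/impact breakdown above:

```python
from dataclasses import dataclass

@dataclass
class Insight:
    """A documented insight that must exist before a test is proposed."""
    what: str                    # the observed change, with magnitude
    where: str                   # category, page, or funnel step
    who: str                     # affected segment and share of traffic
    supporting_data: list[str]   # evidence beyond the headline metric
    monthly_value_gbp: float     # estimated commercial impact

# The strong insight from the example above, written down as a record
shipping_reveal = Insight(
    what="Mobile cart abandonment up 18% month-on-month",
    where="Footwear category, at the shipping cost reveal",
    who="Mobile users (42% of traffic)",
    supporting_data=[
        "63% of abandoners exit at the shipping cost reveal",
        "Support tickets for unexpected delivery charges up 29%",
    ],
    monthly_value_gbp=55_000,
)
```

If a proposed test cannot fill in every field, that is usually a sign the insight is not yet defined well enough to test.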

Prioritisation only works if you define the insight properly

This is where prioritisation frameworks help.

We've written before about ICE scoring (Impact, Confidence, Ease) and PIE scoring (Potential, Importance, Ease). Each uses different criteria to rank experiments and decide which deserve attention first.

PIE looks at:

  • Potential – how much improvement can be made
  • Importance – how valuable is the traffic
  • Ease – how easy will the test be to implement

ICE is similar, but swaps Importance for Confidence:

  • Impact – expected effect on the metric
  • Confidence – how strong is the evidence
  • Ease – how easy is it to build

Both frameworks help you rank tests by expected value and feasibility.

But they only work if you define the insight properly beforehand.

Otherwise, you're just scoring guesses.

If your insight is "cart abandonment is up, let's test a discount", you might score it:

  • Potential: 7 (discounts usually work)
  • Importance: 8 (cart is high-value traffic)
  • Ease: 9 (easy to implement)
  • Total: 24

But if your insight is "mobile users abandon at shipping reveal because costs are higher than expected", you might test something completely different and score it:

  • Potential: 9 (directly addresses the exit trigger)
  • Importance: 8 (same high-value traffic)
  • Ease: 6 (requires checkout flow changes)
  • Total: 23

Similar scores. But the second test targets the actual problem and is more likely to succeed.

The first test might temporarily reduce abandonment. The second test might eliminate the root cause.
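If you want to keep this comparison honest across a whole backlog, the arithmetic can be done in a few lines. Here is a minimal sketch of additive PIE scoring, using the two candidates above; the dictionary keys and 1–10 scale are illustrative assumptions, not a standard API:

```python
# Each candidate is scored 1-10 on Potential, Importance, and Ease,
# then the backlog is ranked by the total.
candidates = {
    "10% off vs 15% off": {"potential": 7, "importance": 8, "ease": 9},
    "Upfront shipping cost vs post-basket reveal": {"potential": 9, "importance": 8, "ease": 6},
}

def pie_score(scores: dict) -> int:
    """Additive PIE total: Potential + Importance + Ease."""
    return scores["potential"] + scores["importance"] + scores["ease"]

# Rank the backlog by total score, highest first
for name, scores in sorted(candidates.items(), key=lambda kv: pie_score(kv[1]), reverse=True):
    print(f"{pie_score(scores):>2}  {name}")
```

The point is not the arithmetic. It is that the scores only mean something if the insight behind each candidate has been defined first.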

The cascade effect of bad insights

When you skip the insight step, everything downstream breaks.

You score tests based on gut feel. You build variations that miss the point. You run promotional tests that shouldn't be running in the first place.

And when those tests come back inconclusive, you move to the next shiny offer idea.

This is why we focus on proper test design. Because even a well-executed test is useless if it's focused on the wrong area.

If you're testing "10% off vs 15% off" when the real problem is timing (people aren't ready to buy yet), or eligibility (the offer excludes their basket), or trust (they don't believe the urgency messaging), then it doesn't matter how clean your test methodology is.

Testing the wrong thing answers the wrong question.

How to build this into your promotional testing process

Here's a simple validation filter you can apply before any promotion test gets added to the roadmap:

The five-question insight check:

  1. What's the observed problem?
  2. What's the supporting data?
  3. Who does it affect?
  4. What's the commercial impact if we fix it?
  5. Can a promotion realistically solve this?

If any answer is unclear or weak, the test doesn't make the backlog.

This forces rigour before the creative work starts. It stops low-value tests from clogging the pipeline. And it protects test velocity by running only what matters.
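If your roadmap lives in a spreadsheet or a simple script, the filter can even be enforced mechanically. Here is a rough sketch, assuming each proposal carries an answer to the five questions; the field names and structure are hypothetical:

```python
# The five-question insight check, as required fields on a test proposal
REQUIRED_ANSWERS = [
    "observed_problem",
    "supporting_data",
    "affected_segment",
    "commercial_impact",
    "promotion_can_solve_it",
]

def passes_insight_check(proposal: dict) -> bool:
    """A proposal only makes the backlog if all five answers are present and non-empty."""
    return all(proposal.get(key) for key in REQUIRED_ANSWERS)

proposal = {
    "observed_problem": "Mobile cart abandonment up 18% in footwear",
    "supporting_data": "63% exit at shipping reveal; delivery-charge tickets up 29%",
    "affected_segment": "Mobile users, 42% of traffic",
    "commercial_impact": "~£55k/month",
    "promotion_can_solve_it": "",  # unanswered, so it does not make the backlog
}

print(passes_insight_check(proposal))  # False
```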

Question five is the killer.

"Can a promotion realistically solve this?"

Sometimes the answer is no.

If mobile users are abandoning because your product images are too small, a discount won't fix that. If people are leaving because shipping times are too long, 10% off won't change their minds. If the checkout flow is broken, free delivery won't save the sale.

Promotions help, but they’re not universal fixes.

As we've written before, testing new customer offers without understanding why customers aren't converting in the first place just burns budget.

Start with the insight, then test the mechanic

Stop leaping from observation to recommendation.

Define the insight first.

Then score it.

Then test it.

And when the test runs, you'll know exactly what you're trying to learn, because you defined it before the test started.

This is how you turn tests into real business results.
