The pros and cons of A/B testing

Linear CEO and co-founder Karri Saarinen made waves when he mentioned in an interview that his product teams “don’t do A/B tests.”

He didn’t quite mean that data is absent from their decision-making process — Linear runs beta tests, gathers feedback, and iterates accordingly, just like any product org worth its salt. But when it comes to long-term, strategic planning, OKRs are out and gut feel is king.

Saarinen explains:

We validate ideas and assumptions that are driven by taste and opinions, rather than the other way around where tests drive decisions. There is no specific engagement or other number we look at. We beta test, see the feedback, iterate, and eventually we have the conviction that the change or feature is as good as it can be at this point and we should release it to see how it works at scale.

The direct benefits of A/B testing are abundantly clear. For one thing, it doesn’t cost a lot of time or money to find out whether users prefer A or B. However, efficient and repeatable processes like A/B testing can make the product process look great on paper while improving the product in almost every way except actually addressing customer needs.

For PMs, it’s fun to dive neck-deep into conversion rates and find creative ways to save money. But we should also remember that these are business objectives customers don’t care about.

Benefits of A/B testing

To be fair, the practice of A/B testing is still as widespread as ever. By and large, product teams rely on A/B tests to help them decide whether to commit to or abandon an idea before they sink resources into it.

“Not A/B testing is like walking somewhere with your eyes closed,” says Shanka S. Dey, director of product management at Salesforce. “You’ll not get too far and you’ll probably be off-track.”

The most obvious benefits of A/B testing are:

A/B testing is definitive
A/B testing is quick
A/B tests can help you persuade
A/B test results usually come with other insights
A/B testing is flexible
A/B testing lets you ‘just try something’

A/B testing is definitive

There’s little debate about one thing: a well-run A/B test will objectively reveal which version performs better on the metric you measure. In a fair experiment where only one variable is changed and all other conditions are controlled, you’ll be able to understand what made one option superior to the other.

A possible downside is that you’re left to assume why the result came out as it did (although sometimes it’s obvious, and if you’re doing prototype testing, you can explicitly ask why).
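
To make “definitive” a little more concrete, here is a minimal sketch of one common way to check whether B really beat A: a two-proportion z-test on conversion counts. The function name and the sample numbers are illustrative assumptions, not figures from any test mentioned in this article, and a real experiment would likely lean on a dedicated statistics library instead.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b are conversion counts; n_a / n_b are visitors per variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: A converts 200 of 5,000 visitors, B converts 250 of 5,000.
z, p_value = two_proportion_z_test(200, 5000, 250, 5000)
print(f"z = {z:.2f}, p = {p_value:.3f}")  # p comes out below 0.05 here
```

A small p-value says the difference is unlikely to be noise. It still says nothing about why B won, which is the downside noted above.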

A/B testing is quick

Get the answer; move forward. A/B testing can be lightning-fast, and those insights can help you validate your assumptions and act on them before competitors beat you to the punch.

A/B tests can help you persuade

Product managers usually have a wide range of stakeholders to consider and, often, the wider that range, the more difficult it is to institute change.

Chris Bajgier, currently vice president of digital product management at Dollar Bank, recalled an example from his experience at a previous employer in which A/B testing not only revealed new insights about customers’ preferences, but also influenced the business model in a big-picture sense.

When he proposed adding an alternate call-to-action button to open a bank account via a different channel, the idea was met with internal resistance. Several stakeholders were concerned that the availability of a new channel — an appointment scheduling app Chris’ team had recently stood up — would drive users away from the main app.

Industry research about bank product shopping patterns wasn’t enough; these stakeholders needed indisputable, data-backed evidence to agree to move forward. So, Chris convinced them to run an A/B test, which showed that the additional CTA had no detrimental impact on account opening sales via the app.

“The test gave us a better way to support our future customers and their channel preference, dispassionately settled a turf battle, kept the team and their stakeholders in harmony, and increased the relevance of the new appointment scheduling service in our business model,” Chris recalls. “Without A/B testing, getting to a decision would have invoked politics, fear, and dithering on the approach.”

A/B test results usually come with other insights

These insights include heatmaps, analytics, and so on.

Going back to Chris’ example, the A/B test he ran on the bank’s appointment scheduling app helped strengthen attribution of a “digitally influenced” sale to content in the shopping journey, which unlocked a whole new world of optimization opportunities.

Are additional insights always a good thing? It certainly doesn’t hurt to have them, but speaking to customers directly will always provide more clarity.

A/B testing is flexible

A/B testing fits into almost any workflow. You can conduct A/B tests alongside prototype testing, which lets you skip implementation when the test doesn’t validate the improvement. Or vice versa: live apps and websites can also be A/B tested, enabling you to skip prototyping when it isn’t necessary.

Combining prototype testing with A/B testing enables us to ask additional qualitative questions, whereas A/B testing live apps and websites happens behind the scenes (i.e., users are unaware).

This isn’t necessarily an advantage that A/B testing has over other methods of research, but it is a nice benefit.

A/B testing lets you ‘just try something’

You don’t need customer insights to get started: A/B testing lets you quickly validate ideas based on educated guesses and gut instincts, whereas other methods of research might be slower or less cost-effective.

A/B testing can either save or cost you money

Failing quickly is cheaper than jumping straight to implementation and failing. Most product teams take this approach because it balances cost with risk.

However, jumping straight to implementation and succeeding is cheaper than A/B testing. If you can make the right bets on the right decisions, you might find that you don’t need to “waste” resources on A/B testing features you know you’ll end up implementing anyway.

So, is jumping straight to implementation a good idea or not? Well, that depends on your recent success rate with taking risks, at least with regard to A/B testing. The problem, though, is that determining your success rate is a risk in itself: how many potentially expensive mistakes are you willing to make in the name of comparing predicted performance with objective performance?
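
One rough way to frame that trade-off is as a break-even calculation. The sketch below assumes, unrealistically, that a test perfectly predicts whether a feature will succeed and that the build cost of a failed feature is entirely wasted. The function names and dollar figures are hypothetical.

```python
def expected_waste_without_test(success_rate, build_cost):
    """Average money lost per feature when you build without testing first."""
    return (1 - success_rate) * build_cost


def break_even_success_rate(test_cost, build_cost):
    """Success rate above which skipping the test is cheaper on average.

    Testing first always costs test_cost; skipping it costs
    (1 - success_rate) * build_cost on average. Setting the two equal
    and solving for success_rate gives the threshold.
    """
    return 1 - test_cost / build_cost


# Hypothetical figures: a $5,000 test guarding a $50,000 build.
test_cost, build_cost = 5_000, 50_000
threshold = break_even_success_rate(test_cost, build_cost)
print(f"Skip the test only if you are right more than {threshold:.0%} of the time")
```

Under these assumptions, the cheaper the test is relative to the build, the more often your gut has to be right before skipping it pays off.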

The Dunning-Kruger effect suggests that less competent people tend to overestimate their abilities, whereas highly competent people tend to underestimate them. That’s also something to consider: those who skip the research stage are often those who need it the most.

Furthermore, mature organizations are often better equipped to perform efficient A/B tests. Newer, less built-out teams may lack the foundation, in terms of both technology and experience, to derive much value from A/B testing.

“A/B tests [can be] slow and expensive to run, but that’s a problem of infrastructure and tooling and practice,” says Aampe CEO and co-founder Paul Meinshausen. “In other words, it’s a problem that can be solved. It’s not a problem inherent to the method; it’s an artifact of available technology.”

A/B testing often yields false positives

If you want to test A and B (or, if multivariate testing, perhaps C and D as well), the participant doesn’t get to suggest E (unless you’re conducting prototype tests with qualitative follow-up questions).

For example, if we were to present participants with a blue button (A) and a dark blue button (B) to see which one results in more conversions (Z) and the most effective variant turned out to be B, how would we know if variant E (the option that we hadn’t even considered) would’ve been better? We wouldn’t, and this is why A/B testing becomes less effective when there are more variants to consider. With A/B testing, we’re limited to only the ideas we can think of.

What if participants don’t want A, B, C, D, or E? What if we’re trying to increase the conversions of Z but, in reality, customers want Y?

I use the term false positives because an A/B test always crowns a winner, which looks like improvement, but that winner doesn’t always lead to meaningful change. Given that, it’s easy to understand why some product teams such as Linear avoid A/B testing.

Is A/B testing more trouble than it’s worth?

So, should you avoid A/B testing altogether? No, but there is a right and wrong time to use it.

“Business and product decisions are made within and across multiple levels of abstraction,” says Meinshausen. “The highest levels of abstraction tend to be thought of as strategic. ‘Should I develop this feature?’ is a more strategic question; it’s more abstract. ‘What should the header on the homepage for this feature be?’ is a more tactical question; it’s less abstract.

“Efforts like A/B tests that are lower down the abstraction ladder can be a way of more rapidly and persuasively building support at the strategic (upper) level.”

Furthermore, A/B testing isn’t suitable for solving problems that customers are consciously aware of. For those problems, you should use qualitative research methods that involve talking to customers.

A/B testing is more suitable for unconscious problems such as the color of a button — things that customers wouldn’t necessarily think to point out — but only after the big problems have already been figured out. Again, while A/B testing doesn’t necessarily reveal the best solution, it reveals the best solution of those presented.

The main thing to keep in mind is that “this or that?” is sometimes the wrong question to ask.

Key takeaways

There are two key takeaways here:

First, A/B testing is useful and shouldn’t be avoided as a general rule. However, it should only be used as part of a balanced research strategy.

“A/B testing is not a panacea and has limitations,” says Dey. “Experiments with low coverage require a lot of traffic or time to reach statistical significance. Some experiments may have long-term effects that are hard to capture through one test, or involve ethical or legal considerations before launching. Results might face skepticism from stakeholders who do not trust it.

“These can be mitigated by following the best practices of trustworthy A/B experiments: setting a clear OEC, using proper statistical analysis, documenting and sharing learnings, and sustained change management to build an experimentation culture.”
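
To illustrate Dey’s point about traffic, here is a minimal sketch that estimates how many visitors each variant needs before a given lift becomes detectable. It uses the standard normal-approximation formula for comparing two proportions. The function name, the default 5 percent significance level and 80 percent power, and the example conversion rates are assumptions chosen for illustration, not recommendations from anyone quoted here.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, expected, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant for a two-sided test.

    baseline: current conversion rate (e.g. 0.04)
    expected: conversion rate you hope the new variant achieves (e.g. 0.05)
    """
    effect = expected - baseline
    if effect == 0:
        raise ValueError("baseline and expected rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    variance = baseline * (1 - baseline) + expected * (1 - expected)
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical scenario: detecting a lift from 4% to 5% versus 4% to 4.2%.
print(sample_size_per_variant(0.04, 0.05))
print(sample_size_per_variant(0.04, 0.042))
```

The second call needs far more visitors than the first, which is why small expected lifts on low-traffic pages can take a long time to reach significance.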

Second, you shouldn’t dismiss product design frameworks either. That said, you should be allowed and willing to go off-script when you feel the urge (while being mindful of any biases, of course). To clarify, this includes A/B testing: designers, engineers, and anyone, really, should be able to test ideas that they feel strongly about. Product teams might not be able to truly understand the needs of customers without talking to them, but they’re still made up of smart and experienced people.

All in all, I believe this is a more balanced outlook on A/B testing that leverages its benefits but addresses the concerns Saarinen outlined in his interview, particularly with regard to product teams optimizing for efficiency rather than creativity, and designers and engineers delegating critical thinking to product managers.

Product design has definitely become too methodical. Designers and engineers (more so designers) need some degree of autonomy because, ultimately, they’re the boots on the ground; they’re the ones closest to the problems that need to be solved. If A/B testing is used in the right circumstances and it works, then why not? But not everything can be about cutting costs and business value all the time.

Featured image source: IconScout
