CRO Analytics: Diagnose the Problem Before You A/B Test Anything
The Backwards Approach That's Draining Your Budget
Your conversion rate is bad. You know this because someone in your company said it should be higher. So what's your next move?
Most teams I talk to skip straight to solutions. They run an A/B test on the button color. Change the form from five fields to three. Add a countdown timer. Increase urgency. Reduce friction. These are all valid tactics, individually. But when you're testing them without knowing why your conversion rate is low, you're gambling with your time and budget.
Here's what actually happens: A team optimizes the checkout page when 75% of users never reach it. They polish the signup form when the real problem is that 80% of visitors are arriving with zero intent. They run 47 A/B tests on a landing page when the page itself is fundamentally misaligned with the traffic source.
This is CRO done backwards.
The ROI math should scare you into a different approach. Optimizing your conversion rate delivers 60x better ROI than driving more traffic to an unoptimized funnel. You can't acquire your way out of a broken funnel. Every dollar spent on traffic acquisition to a broken conversion path is money wasted. But the opposite is also true: fix the funnel first, then scale traffic.
The right way to think about CRO starts with diagnosis, not optimization. Before you test anything, you need to know:
- Where are people actually dropping off?
- Why are they dropping off at that specific point?
- What is the highest-leverage intervention you could make?
Only after you've answered these questions does it make sense to form a hypothesis and run a test. This guide walks you through that process using GA4, practical analytics tools, and a structured diagnostic framework.
Step 1: Where Are You Losing People?
You can't fix what you don't measure. The first step is mapping your conversion funnel and identifying the biggest drop-off points.
Start with GA4's Funnel Exploration tool. If you're not familiar with it yet: Funnel Exploration in GA4 lets you define a sequence of events (landing on a page, clicking a CTA, signing up, completing a purchase) and shows you exactly where users fall out of that sequence. You get both the absolute number of users at each step and the percentage who progress to the next step.
Here's what that looks like in practice:
- Define your conversion funnel. For a SaaS product, this might be: Visit pricing page → Click "Start free trial" → Complete signup form → Verify email → Activate account.
- Run the Funnel Exploration.
- Look for the step with the biggest drop-off percentage.
The biggest drop-off is your signal. This is where volume is leaking out of your funnel. If 1,000 users hit your pricing page and only 100 click the CTA button, you have a 90% drop-off. That's your diagnosis point.
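If you export those step counts, the drop-off math is trivial to script. Here's a minimal sketch with made-up numbers (the step names and counts are illustrative, not from any real property):

```typescript
// Compute step-to-step drop-off from exported funnel counts and flag the biggest leak.
// Step names and numbers are illustrative, not real data.
interface FunnelStep { name: string; users: number; }

const funnel: FunnelStep[] = [
  { name: 'Pricing page', users: 1000 },
  { name: 'Clicked CTA', users: 100 },
  { name: 'Completed signup', users: 60 },
];

let worst = { transition: '', dropOff: 0 };
funnel.slice(1).forEach((step, i) => {
  const prev = funnel[i]; // slice(1) shifts the index, so funnel[i] is the previous step
  const dropOff = 1 - step.users / prev.users;
  console.log(`${prev.name} -> ${step.name}: ${(dropOff * 100).toFixed(1)}% drop-off`);
  if (dropOff > worst.dropOff) {
    worst = { transition: `${prev.name} -> ${step.name}`, dropOff };
  }
});
console.log(`Biggest leak: ${worst.transition} (${(worst.dropOff * 100).toFixed(1)}%)`);
```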
But here's the critical distinction: finding the biggest drop-off isn't the same as understanding why it's happening. A 90% drop-off on a button click could mean:
- The button isn't visible (technical issue)
- Users don't understand what happens when they click (clarity issue)
- Users landed on the wrong page (traffic/targeting issue)
- Users lost interest before reaching the button (engagement issue)
- The offer itself isn't compelling (product-market fit issue)
All of these are "different problems wearing the same mask" — a high drop-off rate. Before you test a new button color or rewrite your copy, you need to understand which of these is actually happening.
The Funnel Exploration gives you the "what." Steps 2 and 3 are about understanding the "why."
Step 2: Understanding Why People Drop Off
Different tools reveal different kinds of problems. Think of this step as pattern matching. You're looking for evidence of the actual reason people aren't converting.
GA4 for quantitative patterns. Your first instinct should be to segment the funnel by relevant dimensions. Pull up your Funnel Exploration again and add breakdowns:
- By device type (mobile vs. desktop vs. tablet). Mobile conversion rates are often 40-50% lower than desktop. If your drop-off is concentrated on mobile, the problem might be a broken responsive design, slow load times, or a form that's painful to fill on a phone.
- By traffic source (organic, paid, social, direct). If users coming from paid ads convert at 2% and users coming from organic search convert at 8%, that's a signal that your paid traffic is misaligned with your offer. You're showing ads to the wrong people.
- By new vs. returning users. A high drop-off among new users might indicate unclear messaging or poor user education. High drop-off among returning users might mean they're coming back for a different reason than what you're optimizing for.
- By geography or language. International users might be dropping off because of currency confusion, payment method unavailability, or localization gaps.
This segmentation often reveals that your "conversion rate problem" is actually a "wrong traffic problem" or "mobile experience problem" — which changes everything about what you should test.
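If you'd rather pull these segment breakdowns programmatically than rebuild Explorations by hand, the GA4 Data API can return sessions and conversions by device and channel. A hedged sketch, assuming the official @google-analytics/data Node client, Application Default Credentials, and a placeholder property ID:

```typescript
import { BetaAnalyticsDataClient } from '@google-analytics/data';

async function conversionRateBySegment(): Promise<void> {
  const client = new BetaAnalyticsDataClient(); // uses Application Default Credentials
  const [response] = await client.runReport({
    property: 'properties/123456789', // placeholder: your GA4 property ID
    dateRanges: [{ startDate: '30daysAgo', endDate: 'today' }],
    dimensions: [{ name: 'deviceCategory' }, { name: 'sessionDefaultChannelGroup' }],
    // depending on your property, the conversions metric may be exposed as keyEvents
    metrics: [{ name: 'sessions' }, { name: 'conversions' }],
  });

  for (const row of response.rows ?? []) {
    const [device, channel] = row.dimensionValues!.map(d => d.value);
    const [sessions, conversions] = row.metricValues!.map(m => Number(m.value ?? 0));
    const rate = sessions ? (conversions / sessions) * 100 : 0;
    console.log(`${device} / ${channel}: ${rate.toFixed(2)}% (${conversions}/${sessions})`);
  }
}

conversionRateBySegment().catch(console.error);
```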
Session recordings to see what users are actually doing. GA4 tells you that users aren't clicking the button. Session recordings (Hotjar, Microsoft Clarity) show you why. Are they hovering over the button and hesitating? Are they looking for information elsewhere on the page? Are they scrolling back up? Are they on the page for five seconds before leaving?
Session recordings are your anthropologist tool. You're looking for behavioral patterns: Do users who convert scroll to a certain point before clicking? Do users who abandon the page look at pricing information? Do they get stuck on a particular form field?
A critical piece of session recording analysis: watch the users who don't convert. This is counterintuitive — we often want to celebrate the winners, not study the failures. But the drop-outs are the ones telling you what's broken.
Scroll depth tracking. If your drop-off happens after users land on a page, check whether they're scrolling to see your key content. It's possible your CTA is "below the fold" and the audience simply isn't scrolling to it. This is less common on modern web than it used to be, but it still happens. Use GA4 events to track scroll depth or use a tool like Clarity to see heatmaps of where users are engaging.
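GA4's enhanced measurement only fires its built-in scroll event at the 90% mark, so finer thresholds have to be sent yourself. A minimal sketch, assuming gtag.js is already on the page; the event and parameter names are placeholders:

```typescript
// Fires a GA4 event the first time the user passes each scroll threshold.
// "scroll_depth" and "percent_scrolled" are placeholder names; register
// percent_scrolled as a custom dimension in GA4 to report on it.
declare function gtag(...args: unknown[]): void;

const thresholds = [25, 50, 75, 100];
const fired = new Set<number>();

window.addEventListener('scroll', () => {
  const scrollable = document.documentElement.scrollHeight - window.innerHeight;
  if (scrollable <= 0) return;
  const percent = Math.round((window.scrollY / scrollable) * 100);
  for (const t of thresholds) {
    if (percent >= t && !fired.has(t)) {
      fired.add(t);
      gtag('event', 'scroll_depth', { percent_scrolled: t });
    }
  }
}, { passive: true });
```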
User surveys. After you've collected quantitative signals (funnel data, traffic source breakdown, session recordings), ask users directly. A simple survey can cut through speculation: "You visited our pricing page but didn't sign up. Why?" The answer might surprise you. Maybe the pricing is genuinely too high. Maybe they were comparing you to a competitor. Maybe they weren't the decision-maker. Maybe they didn't realize it was a free trial. Each reason requires a different intervention.
Surveys work best when targeted — ask users who hit a specific drop-off point, not everyone. "People who viewed pricing but didn't start a trial" is a more useful segment to survey than "people who left the site."
Taken together, these tools paint a picture. You stop guessing and start diagnosing.
Step 3: Forming a Testable Hypothesis
After diagnosis comes hypothesis formation. A good hypothesis has a specific structure:
"Because [problem identified in step 2], [specific user segment] [don't complete action]. If we [change], we expect [outcome] because [reason]."
Here are some real examples:
- "Because mobile users can't easily fill the form on their phone, new users on mobile devices aren't completing signup. If we simplify the form from five fields to two required fields, we expect mobile signup completion rate to increase by 15% because form friction is the primary barrier."
- "Because users landing from paid social don't understand what the product does, paid social traffic converts at 2% vs. 6% from organic search. If we redesign the landing page to lead with use cases relevant to paid social audiences, we expect paid social conversion to increase to 4% because better alignment reduces confusion."
- "Because the checkout flow requires creating an account before purchase, users with low cart value are abandoning checkout. If we add a guest checkout option, we expect checkout abandonment rate to decrease by 8% because we're removing friction for price-sensitive users."
Notice what these hypotheses have in common: they're specific, they identify a root cause, and they predict a measurable outcome. They're not "test the button color" or "make the CTA more prominent."
Vague hypotheses produce vague results. If you test "make the page more persuasive" without specifying what change you're making, you won't learn anything even if the test succeeds. The whole point of CRO testing is to identify what actually moves the needle.
A well-formed hypothesis also forces you to think about directionality. You're not testing 47 variations at once. You're testing a single, specific change to validate a single theory. This is what makes your results conclusive instead of noise.
Step 4: A/B Testing Fundamentals
Now you're ready to test. But before you launch, understand the mechanics of a valid A/B test.
Statistical significance is non-negotiable. An A/B test only tells you something is true if you've collected enough data. If you run a test for three days and variant B converts 2% better than variant A, you don't know if variant B is actually better or if you just got lucky. This is where sample size and test duration come in.
The basic formula: you need enough sessions and enough conversion events for your result to be statistically significant at the 95% confidence level. Most tests require between 1,000 and 10,000 sessions per variant depending on your baseline conversion rate and the effect size you're trying to detect. For ecommerce with a 2% conversion rate, you might need 5,000 sessions per variant to detect a meaningful change. For a landing page with a 20% conversion rate, fewer sessions suffice.
Use a sample size calculator before you launch. Tools like Optimizely's or VWO's calculators take your baseline conversion rate and the improvement you want to be able to detect, and tell you how many sessions you need. Most tests need to run 2-4 weeks, not days.
Minimum detectable effect (MDE). In that calculator, you're also specifying what improvement is worth testing for. If your baseline conversion rate is 2%, is a 2.1% conversion rate (5% relative improvement) worth validating? Maybe not. The business impact is tiny. But is a 2.4% conversion rate (20% relative improvement) worth it? Absolutely. You're setting the MDE based on what would actually matter to your business.
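To make the arithmetic concrete, here's a back-of-the-envelope version of what those calculators compute, using the standard two-proportion approximation at 95% confidence and 80% power (the power assumption is mine; adjust to taste):

```typescript
// Rough sample size per variant for a two-proportion test.
// baseline: current conversion rate (e.g. 0.02 for 2%)
// relativeMde: smallest relative lift worth detecting (e.g. 0.4 for +40%)
function sampleSizePerVariant(baseline: number, relativeMde: number): number {
  const zAlpha = 1.96; // two-sided 95% confidence
  const zBeta = 0.84;  // 80% power
  const p1 = baseline;
  const p2 = baseline * (1 + relativeMde);
  const variance = p1 * (1 - p1) + p2 * (1 - p2);
  return Math.ceil(((zAlpha + zBeta) ** 2 * variance) / (p2 - p1) ** 2);
}

// 2% baseline, detecting a 40% relative lift (2.0% -> 2.8%):
console.log(sampleSizePerVariant(0.02, 0.4)); // roughly 5,700 sessions per variant

// The same 2% baseline with a 20% relative lift (2.0% -> 2.4%) needs ~21,000 per variant,
// which is why small expected effects take weeks of traffic to validate.
```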
Common mistakes that invalidate tests:
- Stopping a test early because one variant is winning. Your early data is noise. Let it run to statistical significance.
- Running multiple tests on the same traffic simultaneously. If you run 10 tests at once, your per-test sample size is split across all variants, and none of them reach significance. Also, you increase your false positive rate — at least one "test" will appear to win by pure chance.
- Ignoring seasonality. Run tests for full weeks so every day of the week is represented. Don't run a Monday-to-Friday test; users behave differently on weekends. Better yet, run tests for 2-4 weeks to average out day-of-week patterns.
- Changing the test while it's running. You launched with the headline "Start your free trial," but halfway through you change it to "Get started free" because you think it's better. Now the variant's data is a blend of two different headlines rather than a clean test of the original change, and your results are compromised.
- Testing directionally opposite changes simultaneously without a control group. If you test "urgent language" vs. "calm, analytical language" against each other, you might not know which one is actually better than your original. Always include the original (control) in the test design.
Step 5: Measuring Test Results Correctly in GA4
Your test is over. You've run it long enough, you have statistical significance, and the results are clear: variant B won. Now what?
GA4 makes measuring test variants more complex than it should be. Google sunset Google Optimize in September 2023 (the retirement was announced in the Google Analytics help center), removing the native A/B testing integration and forcing teams to find alternatives.
Today, most teams set up A/B test measurement in GA4 by:
- Creating a custom event or user property that tracks which variant each user is assigned to. If you're running a test on your website, your testing tool (Optimizely, VWO, Unbounce) will track the variant assignment. You pass this information to GA4 as a custom dimension or event parameter.
- Setting up a conversion event that captures the outcome you care about. If your hypothesis is about form completion, create a specific "form_completed" event. Don't just rely on page views.
- Running an Exploration in GA4 that segments your conversion event by the variant dimension. This shows you conversion rate for variant A, conversion rate for variant B, and the difference between them.
This setup is more work than Google Optimize was, but it also forces you to be more intentional about what you're measuring.
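Here's roughly what the first step can look like with gtag.js. The experiment and parameter names are placeholders, the variant assignment itself still comes from your testing tool, and the parameter or property must be registered as a custom dimension in GA4 Admin before you can report on it:

```typescript
declare function gtag(...args: unknown[]): void;

function trackVariant(experimentId: string, variant: 'A' | 'B'): void {
  // Option 1: a user-scoped property, so every later event carries the variant.
  gtag('set', 'user_properties', { experiment_variant: `${experimentId}:${variant}` });

  // Option 2: an explicit exposure event with the variant as an event parameter.
  gtag('event', 'ab_test_exposure', {
    experiment_id: experimentId,
    variant_id: variant,
  });
}

// Call once, when the testing tool assigns the user to a variant.
trackVariant('pricing_page_headline', 'B');
```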
Beyond conversion rate, look at:
- Revenue per visitor. Your variant B might have a 2% conversion rate vs. variant A's 1.8%. That looks like a win. But if variant B customers buy lower-priced products on average, the revenue per visitor might actually be lower. Don't optimize for conversion rate in isolation.
- Return on ad spend (ROAS) if you're running paid traffic. A test that improves conversion rate but attracts lower-quality customers might actually reduce ROAS.
- Engagement metrics downstream. If your test improves signup rate but signup quality is worse (lower retention, lower LTV), the test might not be a business win even if the immediate conversion metric improved.
The point: measure the outcome you actually care about, not just the metric that's most visible.
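To make the revenue-per-visitor trap concrete, here's a tiny illustration with hypothetical numbers:

```typescript
// Illustrative arithmetic only: hypothetical figures showing how a higher
// conversion rate can still lose on revenue per visitor.
const variants = [
  { name: 'A', visitors: 10000, orders: 180, revenue: 21600 }, // $120 average order
  { name: 'B', visitors: 10000, orders: 200, revenue: 19000 }, // $95 average order
];

for (const v of variants) {
  const cr = (v.orders / v.visitors) * 100;
  const rpv = v.revenue / v.visitors;
  console.log(`Variant ${v.name}: ${cr.toFixed(1)}% CR, $${rpv.toFixed(2)} revenue per visitor`);
}
// Variant B wins on conversion rate (2.0% vs 1.8%) but loses on RPV ($1.90 vs $2.16).
```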
The Conversion Rate Benchmarks (And Why They're Mostly Useless)
You've probably seen this: the "average SaaS landing page converts at 6%" or "ecommerce average is 2.5%." These benchmarks exist. They're real. And they're almost completely useless for your business.
Here's why: Baymard Institute research shows that shopping cart abandonment rates across ecommerce average around 70%. But this includes everyone — high-end luxury brands, fast-moving consumer goods, subscription services, marketplaces. If you sell $10,000 software contracts to enterprises, your "conversion rate" might be 0.5% and you're still massively profitable. If you sell $9 digital downloads, you need 15%+ to make the numbers work. The benchmark tells you nothing about whether your specific conversion rate is good.
The same applies to SaaS landing pages. Unbounce's benchmarks show landing page conversion rates varying from 2% to 5% depending on industry. But that range is so wide it's almost useless. Your benchmark is your own baseline over time.
What matters: are your conversion rates improving month-over-month? Are they improving for specific traffic sources? Are they improving for specific cohorts?
Track your own baseline, run your diagnostic-first process, and measure improvement against yourself, not against an arbitrary industry average.
High-Leverage CRO Opportunities Most Teams Miss
Most teams are optimizing the wrong things. Here are the drop-offs that matter most, based on actual data and actual impact:
Mobile conversion rate gap. This is the number one missed opportunity. Mobile users often convert at 40-60% of the rate of desktop users. This isn't because mobile users are uninterested — it's usually because mobile experiences are broken. Forms are painful. Checkout flows are slow. Text is too small. Images take forever to load. If your mobile conversion rate is half your desktop rate, this is your highest-leverage diagnostic point. Segment your GA4 funnel by device type first, and if mobile is the bottleneck, fix mobile before you test anything else.
Form abandonment and field friction. Baymard Institute research consistently ranks a long or complicated checkout process among the top reasons users abandon. Every additional required field adds friction and tends to increase abandonment. If your form asks for a phone number, password, and company size on signup, test removing the less critical fields. This is almost always a win.
Checkout flow complexity. Related to forms: multi-step checkout flows often perform worse than single-page checkouts, even though single-page looks "cluttered." The reason: every page load is a friction point and a chance for users to abandon. If your checkout is five separate pages, test consolidating to two or three.
Pricing page clarity. For SaaS, the pricing page is where consideration becomes commitment. If users leave your pricing page without signing up, do they understand what each tier includes? Can they compare plans easily? Do they see any mention of implementation, onboarding, or support? These details drive conversion. Session recordings of pricing page visitors often reveal that users are trying to figure out what they're actually paying for.
Value prop clarity for paid traffic. If organic users convert at 8% and paid users at 2%, the problem is usually alignment. Paid users didn't know what your product was before clicking the ad. They're arriving confused. This is a traffic and messaging problem, not a landing page problem. Fix the ad copy or targeting before you redesign the landing page.
Step 6: How to Use GA4 for Continuous CRO
Good CRO isn't a one-time project. It's a process. Build a monthly cadence:
Monthly funnel review. Run your Funnel Exploration over the last 90 days, broken out by month. Identify any new drop-off points that have emerged. Sometimes a new drop-off signals a technical issue (a form went down, a payment gateway is having issues). Sometimes it signals a shift in traffic mix (you acquired new, lower-intent users from a new channel).
Segment analysis by every relevant dimension. You're not looking for one number — the "conversion rate." You're looking for variation: which segments convert well? Which segments convert poorly? That poor-converting segment is your next diagnosis target.
- New users vs. returning users (returning should be higher)
- Device type (mobile, desktop, tablet)
- Traffic source (organic, paid, social, direct)
- Geography
- User acquisition cohort (users who signed up in January vs. February might have different intent or product expectations)
Use GA4's Exploration tab to set up a repeatable report that you run every month. That report becomes your diagnostic dashboard.
Track custom events properly. GA4 requires you to be intentional about conversion events. Don't rely on page views. Create events for:
- Form completion (if you have multiple forms, track each separately)
- Button clicks (CTA click, download click, booking click)
- Purchase completion
- Account activation
Each of these events is a conversion point in your funnel. Tracking them separately lets you identify where users drop off with precision.
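Here's a hedged sketch of what those events might look like in gtag.js. The "purchase" event and its parameters follow GA4's recommended ecommerce schema; the other event names and parameters are placeholders you'd adapt and then mark as conversions in GA4:

```typescript
declare function gtag(...args: unknown[]): void;

// Fired when a specific form is successfully submitted.
gtag('event', 'form_completed', { form_id: 'trial_signup' });

// Fired on CTA clicks, with a label distinguishing which CTA.
gtag('event', 'cta_click', { cta_label: 'start_free_trial', page_location: location.href });

// Fired when a purchase completes (GA4's recommended ecommerce event).
gtag('event', 'purchase', {
  transaction_id: 'T-12345',
  value: 49.0,
  currency: 'USD',
});

// Fired when the account is activated after email verification.
gtag('event', 'account_activated', { plan: 'free_trial' });
```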
Set up user properties that reflect business intent. If you know a user is an enterprise prospect, an SMB, a free trial user, or a long-term customer, create user properties that capture this. Then segment your conversion funnels by these properties. Enterprise prospects might need a phone call to convert, not a self-serve funnel. SMBs need fast, frictionless signup. Optimizing for both with the same funnel is a waste.
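A minimal example of intent-reflecting user properties with gtag.js; the property names and values are placeholders, and each must be registered as a user-scoped custom dimension in GA4 before it appears in reports:

```typescript
declare function gtag(...args: unknown[]): void;

// Set once when you learn the user's segment (e.g. after signup or enrichment).
gtag('set', 'user_properties', {
  customer_segment: 'smb',       // e.g. 'enterprise', 'smb'
  lifecycle_stage: 'free_trial', // e.g. 'free_trial', 'paying', 'churn_risk'
});
```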
How Emilytics Accelerates CRO Diagnostics
Running the diagnostic process above — mapping funnels, segmenting by dimensions, identifying drop-off patterns — typically requires you to build 5-10 different GA4 Explorations, each asking a slightly different question. You might build an Exploration to check mobile funnel performance, another to compare new vs. returning users, another to segment by traffic source.
This is tedious, and it's the reason many teams skip the diagnosis step and jump straight to testing.
Emilytics automates this part. Instead of manually building Explorations, you can ask natural language questions: "Where are we losing mobile users?" or "Which traffic source has the lowest conversion rate?" Emilytics translates these questions into GA4 queries, runs them, and surfaces the insights.
This doesn't replace your judgment — you still need to form hypotheses and run tests. But it removes the friction that prevents most teams from doing the diagnostic work in the first place. The data is there. The right tool makes it accessible without SQL knowledge or GA4 Exploration expertise.
Frequently Asked Questions
How long should I let a test run?
As long as it takes to reach statistical significance, which is typically 2-4 weeks depending on your traffic volume and baseline conversion rate. Use a sample size calculator before launching. If you have low traffic (under 1,000 visitors per week), expect longer test durations. Don't end tests early even if one variant is winning — that's likely noise.
Can I run multiple A/B tests at once?
Technically yes, but your results will be weaker. Each test dilutes the sample size available to every variant. If you run 5 tests simultaneously on exclusive traffic splits and only have 10,000 visitors per week, each test gets roughly 2,000 visitors, or about 1,000 per variant, which might not reach significance. If you must run multiple tests, run them on different audience segments so they don't interfere.
What's a "good" conversion rate for my industry?
There's no universal answer, which is why benchmarks are misleading. Instead, compare yourself to your own baseline. Are you improving month-over-month? Focus on directional improvement, not absolute numbers. If your conversion rate was 1.5% in January and 1.8% in April, that's a 20% improvement — exactly the kind of metric that matters to your business.
Should I test one change at a time or multiple changes together?
One change at a time. Multivariate testing (testing multiple changes simultaneously) requires exponentially more traffic to reach significance and makes it harder to understand which change actually drove the result. Start with single-variable tests. Once you've validated a change, you can combine it with other validated changes in a new test.
How do I know if a test result is actually due to my change or just random variation?
That's what statistical significance is for. If your test is significant at the 95% confidence level, there's only a 5% chance you'd see a difference that large if the variants actually performed the same. Use your testing tool's significance readout or a standalone calculator, or compute a two-proportion z-test yourself (sketched below). If you see a 1% conversion rate difference without reaching significance, treat it as too noisy to trust.
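For reference, here's what that z-test looks like; the session and conversion counts are illustrative:

```typescript
// Two-proportion z-test: is the difference between variants larger than
// chance would explain at the 95% level?
function twoProportionZ(convA: number, nA: number, convB: number, nB: number): number {
  const pA = convA / nA;
  const pB = convB / nB;
  const pooled = (convA + convB) / (nA + nB);
  const se = Math.sqrt(pooled * (1 - pooled) * (1 / nA + 1 / nB));
  return (pB - pA) / se;
}

// Example: 5,000 sessions per variant, 100 vs. 130 conversions.
const z = twoProportionZ(100, 5000, 130, 5000);
console.log(z.toFixed(2), Math.abs(z) > 1.96 ? 'significant at 95%' : 'not significant');
```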
About the Author
Emily Redmond is a Data Analyst at Emilytics, where she helps companies translate GA4 data into actionable insights. She spends most of her time diagnosing broken conversion funnels and showing teams why they should optimize before they acquire. Previously, she led analytics for a SaaS company and ran over 200 A/B tests — most of which were wrong until she learned to diagnose first. When she's not analyzing funnels, she's thinking about why most CRO is backwards and how data can fix it. You can find her on LinkedIn sharing strong opinions about conversion rate methodology.