Data discrepancies explained: causes, how to identify and fix

Uliana Lesiv

Author
Published
Apr 29, 2026

Key takeaways:

  • Data discrepancies are normal. You need to focus on understanding the cause, not achieving perfect alignment between platforms.
  • Attribution models, sessions, and conversion definitions differ across GA4, CRM, and ad platforms by design.
  • Consent settings (especially for EU traffic) can significantly reduce reported conversions or cause them to be modeled rather than observed.
  • Ad blockers and browser privacy restrictions can remove a portion of tracking data.
  • Offline and backend transactions often create “hidden” data that frontend analytics cannot capture.
  • Technical issues like GTM misconfigurations or missing event IDs can lead to duplicates or data loss.
  • Around 10% discrepancy is usually normal if the reason is clearly identified and explained.
  • Use log-based debugging (like Stape Logs) to move from “we see a gap” to “we know exactly why it happens and whether we need to fix it.”

How to spot a data discrepancy?

The first signal of a data discrepancy (e.g., a Google Analytics discrepancy) is a mismatch between your source of truth (e.g., Shopify, WordPress, backend) and your analytics tool (GA4).

In one client's Shopify + GA4 setup, GA4 captured about 95% of the orders recorded in Shopify. In another client's WordPress setup, the discrepancy reached ~30% of purchase events missing in GA4.

In both cases, a basic comparison of the conversions recorded on each platform was enough to spot the data discrepancy.
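The comparison described above can be sketched in a few lines. This is a minimal illustration, assuming you can export order IDs from both systems; the function name `missing_share` and the sample IDs are hypothetical, not part of any platform's API.

```python
def missing_share(source_order_ids, analytics_order_ids):
    """Return the share of source-of-truth orders that are absent from analytics."""
    source = set(source_order_ids)
    if not source:
        return 0.0
    missing = source - set(analytics_order_ids)
    return len(missing) / len(source)


# Hypothetical exports: 20 backend orders, 19 of which also appear in GA4.
backend_orders = [f"ORD-{n}" for n in range(1, 21)]
ga4_orders = [f"ORD-{n}" for n in range(1, 20)]

share = missing_share(backend_orders, ga4_orders)
print(f"Missing in GA4: {share:.0%}")
```

Comparing by order ID rather than by totals also tells you which specific orders are missing, which is a useful starting point for debugging.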

Data discrepancy causes

Different measurement and attribution models

If you are a data tracking specialist, you have probably faced this question from product and marketing teams:

Why is the number of conversions different in CRM, GA4, and Google Ads?

If the difference isn't huge, you've probably answered: "Because platforms don’t measure performance the same way, and have different attribution models", and you could be right.

Each platform has its own logic for when a conversion is counted and how a session or user is counted. The platforms also have different attribution models (the logic a platform uses to decide who gets credit for a conversion).

Attribution models

Even if you set the same attribution model across platforms, you can still see a difference in conversions. This happens because platforms use similar names for their models ("last click" or "data-driven"), but the way they calculate and attribute transactions differs under the hood:

  • Each platform sees a different part of the user journey (e.g., GA4 sees only website interactions, CRM counts final transactions only, ad platforms see only their own ad ecosystem).
  • Different calculations/definitions of "click" and "session" (e.g., GA4 uses the last non-direct click by default; Google Ads uses the last Google Ads click; Meta uses the last eligible ad interaction within its window).
  • Attribution windows can’t be matched completely (some platforms allow flexible windows, while others have fixed or limited options).
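To make the points above concrete, here is a simplified sketch of how three "last click" rules can credit the same purchase to three different sources. The journey, the source names, and the 7-day window are assumptions for illustration; the rules are deliberately reduced and do not reproduce any platform's actual attribution logic.

```python
from datetime import datetime, timedelta

# Hypothetical journey that ends in a purchase on 10 March.
CONVERSION_TIME = datetime(2026, 3, 10, 12, 0)
journey = [
    {"time": datetime(2026, 3, 4, 9, 0), "source": "meta_ads"},
    {"time": datetime(2026, 3, 6, 18, 0), "source": "google_ads"},
    {"time": datetime(2026, 3, 9, 8, 0), "source": "email"},
    {"time": datetime(2026, 3, 10, 11, 50), "source": "direct"},
]


def last_touch(touches, accepts, window_days=None):
    """Return the latest source that passes `accepts` and falls inside the window."""
    for touch in sorted(touches, key=lambda t: t["time"], reverse=True):
        age = CONVERSION_TIME - touch["time"]
        in_window = window_days is None or age <= timedelta(days=window_days)
        if accepts(touch["source"]) and in_window:
            return touch["source"]
    return None


# Analytics-style rule: last touch that is not "direct".
analytics_view = last_touch(journey, lambda s: s != "direct")
# Ad-platform-style rules: each platform only sees its own clicks.
google_ads_view = last_touch(journey, lambda s: s == "google_ads")
meta_view = last_touch(journey, lambda s: s == "meta_ads", window_days=7)

print(analytics_view, google_ads_view, meta_view)
```

In this sketch one purchase is claimed by email in the analytics view and by both ad platforms in their own reports, so summing platform numbers overstates the total. Shortening the assumed window (for example `window_days=1`) makes the Meta-style view return no credit at all.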

Data processing delays

Different platforms process and update data at different speeds, which can create temporary mismatches in your reports. In GA4, you should also always check whether your data is sampled, and analyze only unsampled data.

Some tools offer real-time or near-real-time reporting, but most web analytics platforms rely on processing pipelines that take time. This means that if you compare data too early, discrepancies are almost guaranteed.

So when is it safe to check the data? If you're analyzing data from yesterday, you're often analyzing incomplete data. Use and analyze data that's at least 2-3 days old.
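As a small sketch of this rule, the snippet below drops days that are too recent before any comparison is made. The function name `settled_days`, the sample counts, and the default of 3 days (the upper end of the range mentioned above) are assumptions for illustration.

```python
from datetime import date, timedelta


def settled_days(daily_counts, today, min_age_days=3):
    """Keep only days that are at least `min_age_days` old relative to `today`."""
    cutoff = today - timedelta(days=min_age_days)
    return {day: count for day, count in daily_counts.items() if day <= cutoff}


# Hypothetical daily purchase counts exported from GA4.
today = date(2026, 4, 29)
ga4_purchases = {
    date(2026, 4, 25): 120,
    date(2026, 4, 26): 115,
    date(2026, 4, 27): 98,
    date(2026, 4, 28): 61,
}

comparable = settled_days(ga4_purchases, today)
print(sorted(comparable))
```

Applying the same filter to the backend export keeps both sides of the comparison on the same date range.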

Hidden data sources

Hidden data sources are usually transactions from physical stores. They can create data discrepancies that are easy to misinterpret if you’re only looking at frontend tracking.

The transactions that come from offline touchpoints are recorded in your backend system (CRM, CMS), but never pass through your website tracking if you don't have offline conversion tracking configured.

When some data comes from frontend tracking (browser) and some from backend/webhooks, you’re effectively combining different sources of data.

If you or your client has an offline store in addition to an online one, this may be the case. To start resolving the discrepancy, you need to configure webhooks that send data from your backend to GTM. There are numerous ways to do it; the easiest one is using Stape Conversion Tracking apps for CRM/CMS platforms.

Among the most popular platforms we support are Shopify, WordPress, HubSpot, and HighLevel. We constantly add new apps; for the up-to-date list, please refer to the page with CRM/CMS apps.

Ad blockers and browser restrictions impact

Even with a technically correct setup, some share of your data can be lost due to ad blockers and browser restrictions. That was the case for multiple Stape clients: tracking worked perfectly in debug mode but still showed gaps in real reports.

In the case with WordPress and GA4, mentioned above, a similar pattern appeared: setup validated in debugging tools, but a portion of real user data never reached GA4.

Successful debugging does not guarantee complete data collection.

Many website visitors use ad blockers or browsers with built-in tracking limitations (such as Safari and Firefox). Some use AI browsers, which in most cases block ads and tracking scripts by default. In such cases, the GA4 script is blocked from loading or its requests are never sent, so the data does not reach GA4.

For our client, the following approaches helped to reduce the data discrepancy:

  • Using the Custom Loader power-up. It helps tracking scripts load in a way that is difficult for ad blockers to detect.
  • Configuring a custom domain. In this case, the tracking requests are served from your main domain instead of third-party ones (like Tag Manager or GA4), which are easy for ad blockers to detect.

Consent-related causes

Consent is especially relevant if you target EU countries and/or some US states, since they have adopted strict data privacy laws.

If users do not grant consent for tracking, analytics tools are either limited or completely blocked from collecting data. This impacts the number of users, sessions, and conversions you see in your reports.

Consent is tricky in general; here are some possible ways consent can impact your data collection:

  • You see data differences across platforms because users didn't consent to data collection. In this case, it is totally okay and means you are following data regulations.
  • You get partial consent from users, and instead of recording conversions, GA4 receives "pings". This is relevant if you use Google Consent Mode (v2). When Consent Mode is enabled and a user does not grant full permissions, GA4 does not stop collecting data entirely. Instead, it sends "ping" requests that don't use cookies and don't contain identifiable user data. Later, GA4 uses these signals as part of its modeling system, combining them with aggregated behavioral patterns.
  • You have a Shopify website, and it doesn't transfer the consent status to the checkout page. Shopify's checkout runs in a sandboxed environment, which means it does not automatically inherit consent decisions made on the storefront. Without additional configuration, the default state in checkout is often set to "denied".
  • You've configured the setup incorrectly. If tags are not properly aligned with consent settings, the initial page view may fail to fire when a user first lands on the site and accepts consent. Instead, tracking only starts on subsequent page views. As a result, the original landing page is not recorded, the initial traffic source is lost, and attribution in GA4 becomes inaccurate (often defaulting to "Unassigned" in GA4 reports).

We show in detail how to troubleshoot the problems with Stape Logs in the following paragraphs.

Events duplication

Event duplication is a common problem, especially for platforms like Meta. It typically occurs when you configure both Facebook Pixel and Facebook CAPI (as Meta suggests for efficient tracking), so the same action is reported once from the browser and once from the server.

Without setting up a shared event identifier (such as a transaction ID or event ID), platforms cannot recognize that these events represent the same action. 
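The principle can be illustrated with a short sketch. It assumes each event carries an `event_name` and an optional `event_id`; the `channel` field and the sample values are hypothetical. The real matching happens on the platform side, so treat this as an illustration of why the shared ID matters rather than as the platform's implementation.

```python
def deduplicate(events):
    """Keep the first event for each (event_name, event_id) pair."""
    seen = set()
    unique = []
    for event in events:
        event_id = event.get("event_id")
        if event_id is None:
            # Without an ID there is nothing to match on, so the event is kept.
            unique.append(event)
            continue
        key = (event["event_name"], event_id)
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


events = [
    {"event_name": "Purchase", "event_id": "order-1001", "channel": "browser"},
    {"event_name": "Purchase", "event_id": "order-1001", "channel": "server"},
    {"event_name": "Purchase", "event_id": None, "channel": "browser"},
    {"event_name": "Purchase", "event_id": None, "channel": "server"},
]

unique_events = deduplicate(events)
print(len(unique_events))
```

The first pair shares an ID and collapses into one purchase. The second pair has no ID, so both copies survive and the purchase is counted twice.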

In the article below, we show how to troubleshoot and fix this problem.

When is a data discrepancy okay?

Not every mismatch between analytics platforms is a problem. In fact, some level of data discrepancy is expected in any tracking setup.

It's nearly impossible to achieve 100% alignment between CRM/CMS systems and GA4 or ad platforms.

In most setups, a gap of around 10% is generally considered normal. But it will still depend on data quality (client-side or server-side tracking), tracking architecture, and whether you target the countries with data regulations, as we explained earlier.

A data discrepancy is usually totally fine if you can clearly trace its cause, for example:

  • GA4 shows fewer conversions because consent is denied for part of the traffic
  • Shopify shows more orders because it includes offline or backend-driven purchases
  • Ad platforms show different numbers due to attribution windows

In such cases, it is not a tracking error but a measurement limitation.
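One way to apply this reasoning is to subtract the causes you can quantify from the total gap and look at what remains. The sketch below assumes you already have estimates for each known cause; the function name `unexplained_gap` and all numbers are hypothetical.

```python
def unexplained_gap(source_total, analytics_total, explained):
    """Return the share of source conversions not covered by analytics or known causes."""
    if source_total == 0:
        return 0.0
    gap = source_total - analytics_total
    residual = gap - sum(explained.values())
    return max(residual, 0) / source_total


# Hypothetical monthly totals and estimated causes.
shopify_orders = 1000
ga4_purchases = 880
known_causes = {"consent_denied": 70, "offline_orders": 30}

residual_share = unexplained_gap(shopify_orders, ga4_purchases, known_causes)
print(f"Unexplained gap: {residual_share:.1%}")
```

Here the total gap is 12%, but only 2% remains after accounting for consent and offline orders. That remainder is the part worth investigating as a possible tracking error.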

What is the Stape Logs feature, and how does it solve the limitations of other logs?

Analyzing logs is an effective strategy to figure out the cause of the problem. Analytics dashboards only show the result, not the process of data collection. When you see a data discrepancy (e.g., GA4 vs Shopify, or GA4 vs Meta), you can confirm that something is wrong, but you cannot tell where or why it broke just by looking at reports.

In this case, Stape Logs will be helpful. It is a feature in your Stape account that lets you monitor all incoming and outgoing requests in your server-side container.

❗Note: the feature is available for the Pro or higher plans only.

The main limitation of many server-side setups is the lack of visibility into outgoing requests. While incoming requests are usually available, outgoing data often requires additional access or support, making it hard to diagnose issues like data discrepancies.

Stape solves this by providing both incoming and outgoing logs in one place (with consent required for outgoing logs due to potential PII). This gives a full view of your data flow and makes it much easier to identify where tracking issues occur.

Identify the cause of data discrepancies using Stape Logs and fix the problem step-by-step

In the collapse elements below, we show in detail how to troubleshoot the issue for the most popular causes of data discrepancies:

Final words

Data discrepancies are a normal part of working with analytics, not an exception. Even when your tracking is correctly implemented, differences between GA4, CRM, and other platforms will still appear due to different attribution models, data processing delays, consent restrictions, and incomplete visibility of the user journey. The goal is not to eliminate every mismatch, but to understand what is causing it.

In most cases, a small gap between systems is expected and acceptable as long as it can be explained. Data discrepancies won’t disappear, but the confusion about why data differs can. With Stape Logs, you can see exactly what’s happening behind every conversion. Start by checking your own data flow and finding where the gap really comes from!

Uliana Lesiv

Author

Uliana is a Content Manager at Stape, specializing in analytics and integration setups. She breaks down complex tracking concepts into clear insights, helping businesses optimize data collection.
