Future-Proof Web Analytics with Your Data Warehouse
The data warehouse has long been used by many companies for building core business intelligence reporting, but function-specific analytics have historically been accomplished through 3rd-party SaaS tools.
When it comes to use cases like web and user behavioral analytics, Google Analytics is the overwhelming market leader and provides out-of-the-box reports that are completely plug-and-play: install the script, get reports.
For nearly two decades, teams were willing to trade control, security and accuracy for convenience, in large part because Google Analytics was free. As long as the data was good enough, it didn’t make sense to spend precious data resources on building something you could already get out of the box.
Times are changing, though.
Google’s recent announcement of a forced upgrade to GA4 (Google Analytics 4) has been a catalyst for many teams to rethink the limitations of Google Analytics’ plug-and-play approach, which remains in GA4.
At a time when data can create a competitive advantage, “good enough” doesn’t cut it, and data teams are collaborating with marketing teams to regain full control and visibility.
The most common solution is building on the data warehouse as the single source of truth for core analytics, where raw first-party web behavior data can be combined with other data across the entire business, enabling more accurate and more insightful reporting.
Why companies are leaving Google Analytics
It’s hard to believe, but Google will soon celebrate the 20th anniversary of Google Analytics. Over the past two decades, Google has steered the service to nearly ubiquitous usage, sitting on a whopping 87% market share(1). As we said above, Google Analytics being free has been a big driver of adoption, but Google also helped define standards for traffic categories and attribution.
But users have had a love/hate relationship with the tool for a long time.
With the messy introduction of Google Analytics 4 (GA4) and a looming deadline to migrate off of Universal Analytics (UA), those long-held frustrations are coming to the surface.
It’s worth stepping back to ask why data and marketing teams are ditching a tool with such broad adoption, especially when GA4 seems to be a step forward. While GA4 is a better tool than UA in many ways, it’s still plagued by the same old problems and, what’s worse, it introduces several new ones.
GA4 is designed to create vendor lock-in
GA4 promises more flexibility, most notably the ability to export data from GA directly into BigQuery, Google’s cloud data warehouse. That sounds great until you realize that GA4’s exciting new integrations are only for Google products, and those products have limited free tiers and are designed to make you pay as you scale.
GA4 doesn’t capture all of your data
Because Google Analytics is the most popular analytics tool in the world, it’s the first target for both ad blockers and privacy-conscious browsers. Studies have shown that 40% of internet users leverage some sort of ad blocker (2) and multiple independent tests have shown that Google Analytics can lose 15-30% of traffic (3)!
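To make that loss rate concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the reported number is simply the actual number minus the lost share, and the 10,000 figure is a hypothetical dashboard value, not data from any study cited here.

```python
def estimate_actual_visits(reported: int, loss_rate: float) -> float:
    """Estimate true visits from reported visits and the fraction of traffic lost."""
    if not 0 <= loss_rate < 1:
        raise ValueError("loss_rate must be in the range [0, 1)")
    # reported = actual * (1 - loss_rate), so solve for actual.
    return reported / (1 - loss_rate)


reported_visits = 10_000  # hypothetical figure read from a dashboard
low = estimate_actual_visits(reported_visits, 0.15)
high = estimate_actual_visits(reported_visits, 0.30)
print(f"Actual visits are likely between {low:,.0f} and {high:,.0f}")
```

Under these assumptions, a dashboard showing 10,000 visits could be hiding somewhere between roughly 1,800 and 4,300 additional visits.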
GA4 still suffers from lack of data fidelity
Along with capturing less data overall, the individual payloads sent by Google Analytics (including GA4) are limited in their detail. The included data is even encoded using proprietary keys, making individual data points more difficult to work with.
GA4 operates in a black box
If capturing less data and detail weren’t bad enough, Google Analytics still processes key data functions behind the scenes in a black box—and GA4 doesn’t solve this problem. UA was notorious for sampling data, and it’s still unclear how much sampling GA4 data will be subject to.
https://twitter.com/zambros_it/status/1600141663627665408
GA4 faces increasing security and regulatory scrutiny
Google Analytics is increasingly coming under legal scrutiny, especially in the European Union. In fact, in many countries that have passed GDPR legislation, Google Analytics is illegal to use out of the box(4).
GA4 still kills site performance
Degraded site performance is one of the most hated aspects of Google Analytics, especially for technical teams focused on site speed and SEO performance. In our own studies, we’ve seen key site performance metrics improve by up to 40% simply by removing the Google Analytics and Google Tag Manager scripts.
https://twitter.com/maxpchadwick/status/1602507624011939840
The benefits of owning your analytics infrastructure
Thankfully, you don’t have to subject your data team, or any other team, to the limitations of Google Analytics. Using a tool like RudderStack for first-party data collection and your warehouse for storing, modeling and serving data, you can build a future-proof analytics stack that will deliver data you can trust and easily scale with your business.
Data has changed, and for those still depending on Google Analytics, this is the perfect moment to modernize your data stack. Your customer data is the most valuable asset your business has, and you need to own all of the data and stop being dependent on tools like GA to host or report it. With a solid CDP you can now track all your customer data and pipe it to a warehouse, and own your data. The businesses that own their data in the future will be the ones who win.
—Dan McGaw, Founder & CEO of McGaw.io, tech stack and analytics experts
Analytics with infinite optionality
Abstracting your data capture infrastructure away from your data storage and transformation layers, as well as the analytics tools themselves, gives you the ability to modify any individual component of your stack as the analytics needs of your business change.
- Do your A/B testing or product teams want to try a new analytics tool? No problem: just point the event stream at a new destination without opening any dev tickets (more on this later).
- Do your analysts need to modify metrics to reflect a new pricing plan? You can move fast by updating the queries running on your warehouse without having to completely overhaul event instrumentation and your entire visualization layer.
- What if your data science team is building a new recommendations model on a fresh data lake partition? You can avoid painful and slow batch jobs by simply adding a new data store as a destination for your existing event stream.
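The pattern behind these scenarios is a fan-out: events are captured once and delivered to whichever destinations are configured at the time. The sketch below illustrates the idea in Python. The `EventRouter` class, its method names and the in-memory "destinations" are all hypothetical and are not RudderStack's API; they only show how decoupling capture from delivery lets you add a destination without touching instrumentation.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Destination = Callable[[Event], None]


class EventRouter:
    """Hypothetical fan-out router: one event stream, many destinations."""

    def __init__(self) -> None:
        self._destinations: Dict[str, Destination] = {}

    def add_destination(self, name: str, send: Destination) -> None:
        # Adding a destination does not touch the code that emits events.
        self._destinations[name] = send

    def remove_destination(self, name: str) -> None:
        self._destinations.pop(name, None)

    def track(self, event: Event) -> None:
        # Hand each destination its own shallow copy of the event.
        for send in self._destinations.values():
            send(dict(event))


# In-memory lists standing in for a warehouse and a data lake (illustrative only).
warehouse_rows: List[Event] = []
data_lake_rows: List[Event] = []

router = EventRouter()
router.add_destination("warehouse", warehouse_rows.append)
router.track({"event": "Page Viewed", "path": "/pricing"})

# Later, the data science team needs the same stream in a new store.
router.add_destination("data_lake", data_lake_rows.append)
router.track({"event": "Signup Completed", "plan": "pro"})
```

The call sites that emit events (`router.track(...)`) never change; only the routing configuration does.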
Capture every site visit
Capturing all of your data requires using an analytics-agnostic tool specifically designed to collect raw, first-party data. RudderStack’s SDKs are purpose-built for capturing customer events and can be proxied behind your site or app URL, giving you full data capture.
See the whole picture with rich, configurable, transformable payloads
RudderStack’s out-of-the-box payloads provide far richer data than GA4. They also feature a standardized, open-source JSON schema with objects you can customize with both event properties and user traits that meet the specific needs of your business. If analytics needs change down the line, which they always do, you can use RudderStack’s transformations to rename keys or reshape payloads without having to touch the code in your app or website.
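As a rough illustration of what renaming keys in flight can look like, here is a minimal Python sketch. The function name, the `KEY_RENAMES` mapping and the example payload are assumptions made for illustration; check RudderStack's transformation documentation for the actual function signature and payload schema.

```python
from typing import Any, Dict

# Hypothetical mapping from old property names to new ones.
KEY_RENAMES = {"plan_name": "plan", "price_usd": "price"}


def rename_properties(event: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the event with selected property keys renamed."""
    properties = event.get("properties", {})
    # Keys without an entry in KEY_RENAMES pass through unchanged.
    renamed = {KEY_RENAMES.get(key, key): value for key, value in properties.items()}
    return {**event, "properties": renamed}


# Illustrative payload; the field names are assumptions for this example.
example_event = {
    "type": "track",
    "event": "Plan Selected",
    "userId": "user-123",
    "properties": {"plan_name": "pro", "price_usd": 49, "currency": "USD"},
}

transformed = rename_properties(example_event)
```

Because the rename happens between collection and delivery, the tracking code on the site keeps sending the old keys while every downstream destination sees the new ones.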
Build confidence with full transparency in your warehouse
When you store and model all of your data in your warehouse, you never have to wonder what GA4 is doing inside of the black box and whether their algorithms are making the right choice for your business.
We were using the free version of Google Analytics. Everything was anonymized, and we couldn’t see what our users were doing on our website or how they were using our product. RudderStack has increased visibility into user behavior and user journeys, given us deeper insight into our funnel and we can run A/B tests that let us customize the user experience.
—Mona Sami, Director of Data Analytics, InfluxData
Deploy and modify flexible data models that match actual behavior
Using tools like dbt, you can build your own models for key use cases like identity resolution and sessionization, then modify them as you learn more about how your users interact with your website and apps.
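In practice a dbt model would express this logic in SQL against your warehouse. To show the idea itself, here is a minimal sessionization sketch in Python. The 30-minute inactivity timeout, the `sessionize` function and the sample events are assumptions chosen for illustration; the point of owning the model is that you can change the rule when your users behave differently.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Assumption: a session ends after 30 minutes of inactivity.
SESSION_TIMEOUT = timedelta(minutes=30)


def sessionize(events: List[Tuple[str, datetime]]) -> Dict[str, List[List[datetime]]]:
    """Group (user_id, timestamp) events into per-user sessions."""
    sessions: Dict[str, List[List[datetime]]] = {}
    # Sorting tuples orders events by user, then by time within each user.
    for user_id, ts in sorted(events):
        user_sessions = sessions.setdefault(user_id, [])
        if user_sessions and ts - user_sessions[-1][-1] <= SESSION_TIMEOUT:
            user_sessions[-1].append(ts)  # within the timeout: same session
        else:
            user_sessions.append([ts])  # first event or long gap: new session
    return sessions


events = [
    ("u1", datetime(2023, 1, 1, 9, 0)),
    ("u1", datetime(2023, 1, 1, 9, 10)),
    ("u1", datetime(2023, 1, 1, 10, 30)),
    ("u2", datetime(2023, 1, 1, 9, 5)),
]
sessions = sessionize(events)
```

Here `u1` ends up with two sessions (the 80-minute gap before 10:30 starts a new one) and `u2` with one. Changing `SESSION_TIMEOUT` is a one-line edit rather than a vendor setting you cannot see.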
Break free from Google’s compliance chaos
Perhaps most importantly, owning your analytics infrastructure means you can say goodbye to Google’s compliance chaos once and for all, meaning your infosec, legal and marketing teams can rest easy.
Take charge of your analytics infrastructure
The reality is that data is growing at an ever-increasing pace, and data tools need to constantly evolve to keep up. Now more than ever, it is important to own your analytics infrastructure and give data teams full control and flexibility to quickly adapt to the needs of the business without facing security concerns, data quality issues and painful vendor limitations.
Footnotes
(1) https://www.slintel.com/tech/analytics/google-analytics-market-share
(2) https://www.statista.com/statistics/352030/adblockign-usage-usa-age/
(4) https://noyb.eu/sites/default/files/2022-01/E-DSB%20-%20Google%20Analytics_EN_bk.pdf
Eric Dodds
Senior Director of Product Strategy
Sara Mashfej
Developer Relations at RudderStack