Replacing GA4 with RudderStack, dbt, Snowflake, and Hex

Many businesses use web analytics data to understand how their websites are used and how to make them more effective.

Google Analytics is the most popular web analytics tool in the world, but many teams are now looking for alternatives due to its limitations. The latest version, Google Analytics 4 (GA4), is a more powerful tool than its predecessor, Universal Analytics (UA), but teams are increasingly wary of trusting Google with their data. Moreover, modern use cases and analytics requirements are exposing the fundamental limitations of black-box analytics.

The haphazard rollout of the forced migration to GA4 has many data teams opting for a more flexible approach that gives them control of their data: building web analytics with first-party data on their own data warehouses. In this post, we’ll show you how to use RudderStack, dbt, Snowflake, and Hex to replace GA4 and drive more sophisticated analytics across your business.

Why are businesses ditching Google Analytics?

It's worth considering why data and marketing teams are moving away from a widely adopted tool, despite GA4 seeming like a step forward. Though GA4 offers improvements over UA in various aspects, it still faces familiar issues and also introduces new challenges.

Migration challenges and a steep learning curve

Google is deprecating UA and imposing a forced migration deadline for GA4. The rollout has been unclear and migration is not straightforward: GA4 evolved from UA, but it has a fundamentally different data model and an overhauled interface. This means businesses must invest time and effort to understand and adapt to the new system. Migrating from UA to GA4 requires teams to solve data migration problems, overcome implementation challenges, and reconfigure reports and dashboards.

Vendor lock-in

GA4 offers increased flexibility, primarily through the capability to export data directly from GA into Google's cloud data warehouse, BigQuery.

However, it's important to note that GA4's new integrations are exclusively for Google products. These products are designed to generate revenue as you scale and have limited free tiers. For instance, Looker Studio (previously known as Data Studio) has a query limit, and to access more advanced reporting features, payment is required.

Doesn’t capture all of your data and lacks data fidelity

Because Google Analytics is the most popular analytics tool in the world, it’s the first target for ad blockers and privacy-conscious browsers. As a result, GA4 does not capture all of the visitor data from your website or app. Moreover, the individual payloads sent by Google Analytics (including GA4) are limited in their detail.

Operates in a black box

Google Analytics processes key data functions behind the scenes in a black box — and GA4 doesn’t solve this problem. UA was notorious for sampling data, and it’s still unclear how much sampling GA4 data will be subject to.

GA4 provides one-size-fits-all features around identity resolution and predictions that aren’t transparent to end users. This means some of your most important decisions are based on significant unknowns that might not accurately reflect your business model or customer journey.

Missing user journey

Marketing and data teams need to capture the full user journey to get valuable insights into how users interact with their websites and apps. Use cases like identifying stages of the conversion funnel and segmenting users based on behaviors or preferences require data from various sources.

GA4 does not capture all user interactions or events, especially if they occur outside of the website or app being tracked. For example, interactions that happen on third-party platforms or devices that are not integrated with GA4 may not be captured, resulting in gaps in the user journey.

Analytics with first-party data on the data warehouse

A warehouse-native approach to analytics lets you centralize your company's data in one location to create powerful analytics on vast, diverse, high-quality datasets. This enables you to address complex queries that may go beyond the capabilities of analytics tools like GA4.

Components of a warehouse-native data analytics stack to replace GA4

A warehouse-native data analytics stack is made up of a few key components that facilitate the collection, storage, and movement of your data to make it easier to analyze:

  • Behavioral data ingestion
  • Cloud data warehouse
  • Data modeling
  • Visualization

[Figure: Warehouse-native architecture diagram]

Behavioral data ingestion

The first step in the warehouse-native analytics data flow is ingestion. With RudderStack, you can streamline behavioral data collection by instrumenting your website with a single SDK. The SDK captures event and user identification data once, and RudderStack can then send that data to both your data warehouse and other SaaS analytics tools (even GA4). This gives you a single, unified source of truth for your data across multiple platforms.

RudderStack centralizes data from diverse sources, including first-party, second-party, and third-party data, in your data warehouse. This establishes a unified and reliable source of truth for advanced use cases such as identity resolution and personalized experiences across multiple devices and touchpoints.
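To make the ingestion step more concrete, here is a minimal Python sketch of the kind of payloads a tracking SDK might emit, plus a naive identity-stitching pass over them. The field names (type, anonymousId, userId, properties, traits) and the helper functions are illustrative assumptions, not RudderStack's SDK or its exact event schema; check the RudderStack event specification for the real fields.

```python
from datetime import datetime, timezone


def page_event(anonymous_id, url, referrer, user_id=None):
    """Build a page-view payload. Field names are illustrative assumptions."""
    return {
        "type": "page",
        "anonymousId": anonymous_id,
        "userId": user_id,
        "properties": {"url": url, "referrer": referrer},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def identify_event(anonymous_id, user_id, traits):
    """Build an identify payload that links an anonymous visitor to a known user."""
    return {
        "type": "identify",
        "anonymousId": anonymous_id,
        "userId": user_id,
        "traits": traits,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def stitch_identities(events):
    """Attach a userId to every event whose anonymousId was later identified.

    This is a deliberately naive form of identity resolution: one
    anonymousId maps to at most one userId.
    """
    known = {
        e["anonymousId"]: e["userId"]
        for e in events
        if e["type"] == "identify"
    }
    return [
        {**e, "userId": e["userId"] or known.get(e["anonymousId"])}
        for e in events
    ]


events = [
    page_event("anon-1", "https://example.com/blog/ga4", "https://google.com"),
    identify_event("anon-1", "user-42", {"plan": "free"}),
    page_event("anon-2", "https://example.com/pricing", "https://facebook.com"),
]
stitched = stitch_identities(events)
```

After stitching, the first page view is attributed to user-42 even though it happened before the visitor was identified, while the anon-2 page view stays anonymous. Real identity resolution across devices is more involved, but it builds on the same idea of joining identifiers captured at ingestion time.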

Cloud data warehouse

A cloud data warehouse, like Snowflake, is where you’ll centralize all of your customer data. By consolidating raw traffic, behavioral data, and other customer data from various sources, such as CRM tools or ad platforms, into a unified repository, you gain complete ownership and control over your data. This empowers you to perform comprehensive analyses and generate valuable insights. This approach also facilitates seamless transitions and onboarding of different analytics tools for various teams, because each tool can access the entirety of your historical data.

Data modeling

Data modeling is all about organizing, transforming, and grouping your events. The goal is to answer the questions teams formerly used GA4 to answer, and to unlock answers to more complex questions that tools like GA4 can’t handle. For example, say your marketing team wants to understand which channels drive the most content views (e.g., organic search vs. paid social). You’ll need to count distinct pageviews for each relevant URL, then group them by referring domain. RudderStack’s event schemas are specifically designed to make this kind of modeling much easier.
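The marketing example above can be sketched as a single SQL query. The snippet below runs it from Python against an in-memory SQLite database, which stands in for Snowflake so the example is self-contained. The table name (pages), its columns (id, anonymous_id, url, referring_domain), and the /blog/ URL filter are assumptions made for illustration and may not match the schema in your warehouse.

```python
import sqlite3

# Count distinct pageviews per content URL, grouped by referring domain.
# COUNT(DISTINCT id) guards against the same event being loaded twice.
PAGEVIEWS_BY_CHANNEL = """
SELECT referring_domain, url, COUNT(DISTINCT id) AS pageviews
FROM pages
WHERE url LIKE '%/blog/%'
GROUP BY referring_domain, url
ORDER BY pageviews DESC, referring_domain, url
"""


def pageviews_by_channel(rows):
    """Load sample page events into SQLite and run the aggregation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages "
        "(id TEXT, anonymous_id TEXT, url TEXT, referring_domain TEXT)"
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(PAGEVIEWS_BY_CHANNEL).fetchall()
    conn.close()
    return result


sample_rows = [
    ("m1", "a1", "https://example.com/blog/ga4", "google.com"),
    ("m2", "a2", "https://example.com/blog/ga4", "google.com"),
    ("m2", "a2", "https://example.com/blog/ga4", "google.com"),  # duplicate load
    ("m3", "a3", "https://example.com/blog/ga4", "facebook.com"),
    ("m4", "a1", "https://example.com/pricing", "google.com"),  # not content
]

for row in pageviews_by_channel(sample_rows):
    print(row)
```

Because the query is plain SQL over raw events, you can change the definition of a "content view" or a "channel" by editing the WHERE and GROUP BY clauses, something a black-box tool does not let you inspect or adjust.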

One of the most popular data modeling tools is dbt (data build tool). dbt enables analytics engineers to transform data in their warehouses by simply writing SQL select statements. dbt takes those SQL statements and runs them against your data warehouse to create tables and views.
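As a rough illustration of that idea, the Python sketch below wraps a select statement in a CREATE TABLE AS or CREATE VIEW AS statement and runs it against an in-memory SQLite database. This is not how dbt is implemented (dbt also handles templating, dependencies between models, and warehouse-specific SQL); the materialize helper, the pages table, and the model name are hypothetical names used only for this example.

```python
import sqlite3


def materialize(conn, name, select_sql, kind="view"):
    """Persist the result of a SELECT as a table or a view.

    `name` is interpolated into the SQL, so it must be a trusted identifier.
    """
    if kind not in ("view", "table"):
        raise ValueError("kind must be 'view' or 'table'")
    keyword = kind.upper()
    conn.execute(f"DROP {keyword} IF EXISTS {name}")
    conn.execute(f"CREATE {keyword} {name} AS {select_sql}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id TEXT, url TEXT, referring_domain TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("m1", "https://example.com/blog/ga4", "google.com"),
        ("m2", "https://example.com/blog/ga4", "google.com"),
        ("m3", "https://example.com/blog/ga4", "facebook.com"),
    ],
)

# The "model" is just a SELECT; the helper decides how it is persisted.
model_sql = """
SELECT referring_domain, COUNT(DISTINCT id) AS pageviews
FROM pages
GROUP BY referring_domain
"""
materialize(conn, "pageviews_by_domain", model_sql, kind="table")

rows = conn.execute(
    "SELECT referring_domain, pageviews "
    "FROM pageviews_by_domain ORDER BY referring_domain"
).fetchall()
```

Downstream tools then query pageviews_by_domain like any other table, which is what lets a visualization layer or a Reverse ETL pipeline consume modeled data without knowing how it was built.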

After building out models and creating tables that represent a specific use case analysis, RudderStack's Reverse ETL pipeline can be leveraged to send these models to different downstream tools like Braze or Customer.io for activation.

Visualization

Visualization is an integral part of the data modeling process. Once you have transformed the data in your warehouse into meaningful datasets, the next step is to leverage a visualization tool. This tool enables you to visually represent the datasets in the form of charts, maps, graphs, or images, facilitating the extraction of valuable insights from the data.

Before selecting a visualization tool for your data warehouse, consider factors like ease of use, learning curve, data warehouse compatibility, flexibility, customization options, and suitability for your specific use case. Hex is an excellent choice as it streamlines the analytics workflow, empowering your team to generate insights, make informed decisions, and drive progress efficiently.

With Hex, you can replicate an entire GA4 dashboard using your warehouse data.

[Figure: Example of a Hex dashboard built on top of Snowflake with data collected by RudderStack]

You can also view the full 1:1 GA4 analytics dashboard on Hex here.

Owning the analytics infrastructure

Owning your analytics infrastructure gives you full control and flexibility to adapt to the needs of your business without facing the limitations of vendor lock-in. While plug-and-play tools like GA4 sound convenient, deeper analysis of your data requires centralizing data from different sources within a data warehouse.

By employing a data warehouse, a first-party data ingestion tool like RudderStack, and a powerful visualization tool, you can safeguard your analytics for the future, ensure a comprehensive understanding of the entire user journey, and drive more sophisticated analytics across your business.

Check out our knowledge base article for a step-by-step guide on how to set up warehouse-native analytics with RudderStack, Snowflake, dbt, and Hex.

May 16, 2022
Sara Mashfej

Developer Relations at RudderStack