The Evolution of the Customer Data Platform

There are many approaches to delivering the promise of a customer 360. It’s a hard problem to solve, as evidenced by the history of the Customer Data Platform, and until now every Customer Data Platform solution has come with significant drawbacks.

The constraints of legacy SaaS CDPs led many companies to try to build their own capabilities in-house, but most were slowed down by the engineering investment required to build and maintain these systems. More recently, the CDP faced a great unbundling, and the Composable CDP emerged as a sort of happy medium between these options. While the composable approach is a step in the right direction, it doesn’t address the full picture.

The CDP is still rapidly evolving, but one thing is clear: The data warehouse (data cloud, data lake, data lakehouse) will play a central role in its future. What’s less clear is what the surrounding system will look like.

Here, we’ll unpack the prevailing customer data platform approaches and introduce a new approach that we believe best delivers the end goal – easy activation of complete customer profiles.

The Legacy SaaS CDP

The legacy CDP was born in response to the SaaS boom as a way to aggregate data from data silos into one place. But these systems were black boxes that ultimately created another data silo. Early products did succeed in building a more comprehensive customer view, but they still provided incomplete data and were largely useful only to marketing teams for specific use cases.

The Customer Data Platform Institute now defines a CDP as “packaged software that creates a persistent, unified customer database that is accessible to other systems.” Modern CDPs are more flexible than their predecessors. However, packaged SaaS CDPs are fundamentally limited by their architecture, and they are still made primarily for marketing users.

Because legacy CDPs are geared towards marketing, they don’t expose customer data in a manner conducive to building applications for more sophisticated use cases like user journeys, attribution, ML models for churn prediction, and product recommendations.

It’s worth noting that even legacy CDP vendors are recognizing the paradigm shift towards the data warehouse and working, after the fact, to add data warehouse support to their existing systems.

The in-house build

With inflexible and limiting off-the-shelf solutions, a tempting option for companies with engineering resources is to build CDP capabilities in-house. This option may seem like a good one, but in reality, most companies get overwhelmed by the magnitude of the project and its ongoing maintenance. An MVP solution may be easy enough to hammer out, but these projects seldom scale, and if they do, they become a significant resource drain. As growth accelerates, data volume grows, integration requirements expand, privacy regulations become harder to meet, and error handling gets increasingly complex.

Building an internal system at scale can take years, and the maintenance overhead is enormous. Because of this, building in-house is not a viable option for most companies.

The composable CDP

The Composable CDP emerged in 2022 as an alternative to inflexible, packaged systems and cumbersome in-house builds. While not a foolproof solution, it does get one foundational element right – it places the data warehouse at the center of the customer data stack. But the premise of the composable CDP is essentially a wholesale unbundling, and it goes too far. The composable CDP separates and isolates each major component of a CDP:

  • Streaming (and real-time transformations)
  • ETL
  • Warehouse transformations
  • UI/Segmentation
  • Activation/rETL
  • Storage

This delivers on flexibility, but comes with a few critical drawbacks. The fragmented nature of the system means you still have issues with incomplete and incompatible data. Plus, managing data quality across a significant number of separate vendors can become problematic. More importantly, the composable system requires the data warehouse as an intermediary, so it cannot support real-time use cases.

The Warehouse Native CDP

The Warehouse Native CDP is a packaged platform that runs directly on the data warehouse and helps data teams deliver value at every stage of the data activation lifecycle: collection, unification, and activation.

Like the composable approach, the Warehouse Native CDP solves the data silo problem by building around the warehouse, but it deploys the integration, real-time transformation, unification, and activation layers as a connected, governable, and observable end-to-end system.

Leveraging the data warehouse to create a customer 360 eliminates data silos and allows marketing (and every other team) to use their tools of choice. More importantly, downstream teams can use these data activation tools to their full potential because they have access to complete, enriched customer profiles. What about real-time use cases? The Warehouse Native CDP includes event stream pipelines that can send data in real time to the warehouse and directly to other destinations.
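
To make the real-time path concrete, here is a minimal sketch of the fan-out idea: each incoming event is delivered both to a warehouse loader and directly to a downstream destination, so real-time tools don't wait on the warehouse. The `EventRouter` class and the in-memory lists standing in for the warehouse and the destination are hypothetical names for illustration only, not the product's actual API.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
Sink = Callable[[Event], None]


class EventRouter:
    """Hypothetical router: delivers each event to every registered sink."""

    def __init__(self) -> None:
        self._sinks: List[Sink] = []

    def add_sink(self, sink: Sink) -> None:
        self._sinks.append(sink)

    def route(self, event: Event) -> int:
        """Send the event to all sinks and return the number of deliveries."""
        for sink in self._sinks:
            sink(dict(event))  # shallow copy so sinks do not share one object
        return len(self._sinks)


# In-memory stand-ins (assumptions) for a warehouse loader and a
# real-time destination such as a marketing or messaging tool.
warehouse_rows: List[Event] = []
destination_calls: List[Event] = []

router = EventRouter()
router.add_sink(warehouse_rows.append)
router.add_sink(destination_calls.append)

delivered = router.route({"user_id": "u1", "event": "Order Completed"})
```

In a real pipeline the sinks would be network clients with batching and retries; the point of the sketch is only that the warehouse and the real-time destinations are fed from the same stream.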

Because the Warehouse Native CDP is an end-to-end system, you don’t have to invest time and money building infrastructure or bridging the gaps created by siloed legacy CDPs. Moreover, you still have full control over both pipelines and the modeling of customer profiles in your own warehouse.

The Warehouse Native CDP provides flexibility without compromise and delivers:

  • Seamless integration with every tool in the stack
  • Support for real-time and batch
  • Automated identity stitching and customer 360
  • Single observability plane

The best customer data platform architecture: The Warehouse Native CDP

The Warehouse Native CDP provides you with end-to-end tooling that helps the data team drive value at every point in the data activation lifecycle:

  • Collection
  • Unification
  • Activation

Below we’ll break down the architectural components of the Warehouse Native CDP by stage.

Data collection

  • Real-time event streaming pipelines
  • Batch ETL pipelines
  • Data governance

The Warehouse Native CDP delivers value beyond the simple utility of data pipelines. Transformation and data governance features allow you to ensure data quality at the source. It also ensures that data collection follows standardized schemas designed to populate the identity graph.
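
As a rough illustration of what "data quality at the source" can mean, the sketch below checks an incoming event against a small, predefined schema before it is allowed into the pipeline. The `TRACKING_PLAN` structure, the `validate_event` function, and the event field names are all assumptions made for this example; they are not a specific product's format.

```python
from typing import Dict, List

# Hypothetical tracking plan: event name -> required properties and types.
TRACKING_PLAN: Dict[str, Dict[str, type]] = {
    "Order Completed": {"order_id": str, "total": float},
}


def validate_event(event: dict) -> List[str]:
    """Return a list of violations; an empty list means the event conforms."""
    errors: List[str] = []

    # Identity stitching downstream needs at least one identifier.
    if not event.get("user_id") and not event.get("anonymous_id"):
        errors.append("missing identifier: user_id or anonymous_id")

    schema = TRACKING_PLAN.get(event.get("event"))
    if schema is None:
        errors.append(f"unplanned event: {event.get('event')!r}")
        return errors

    props = event.get("properties", {})
    for name, expected in schema.items():
        if name not in props:
            errors.append(f"missing property: {name}")
        elif not isinstance(props[name], expected):
            errors.append(f"wrong type for {name}: expected {expected.__name__}")
    return errors


good = {
    "user_id": "u1",
    "event": "Order Completed",
    "properties": {"order_id": "o-17", "total": 42.5},
}
bad = {"event": "Order Completed", "properties": {"order_id": "o-18", "total": "42.5"}}

good_errors = validate_event(good)
bad_errors = validate_event(bad)
```

Rejecting or flagging events at this point keeps malformed records out of the warehouse, which is what makes later modeling on known schemas feasible.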

Data unification

  • Identity stitching
  • User features
  • Customer 360

As an end-to-end solution, the Warehouse Native CDP can leverage the power of known schemas to automate complex modeling for identity resolution and user features. Our customer 360 solution, Profiles, leverages all of the data in your warehouse to automatically build an activation-ready customer 360 table, and it enables you to easily create and update user features for use downstream.
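
For readers new to identity stitching, the sketch below shows one common way to think about it: treat every pair of identifiers seen together (for example, an anonymous ID and an email on the same event) as an edge, then merge connected identifiers into one cluster per person using a union-find structure. This is a generic illustration under our own assumptions, not a description of how Profiles is implemented; the `stitch_identities` function and the identifier formats are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def stitch_identities(edges: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group identifiers into clusters, one per resolved identity."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    # Each edge says "these two identifiers belong to the same person".
    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters: Dict[str, Set[str]] = defaultdict(set)
    for identifier in list(parent):
        clusters[find(identifier)].add(identifier)
    return list(clusters.values())


# Hypothetical identifier pairs observed together on events.
observed = [
    ("anon:a1", "email:x@example.com"),
    ("email:x@example.com", "user:42"),
    ("anon:b7", "user:99"),
]
identities = stitch_identities(observed)
```

Each resulting cluster is a candidate row key for a customer 360 table. In a warehouse this kind of logic is usually expressed as iterative SQL over an edge table rather than in application code.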

Data activation

  • Real-time integrations
  • Reverse-ETL
  • ML outputs

Not only does a warehouse native approach enable you to create value from data faster in your data store, but activation pipelines also make it easy to push that value to every team and tool across your organization to drive bottom-line impact.
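
One way to picture the Reverse-ETL step is as a diff between the customer 360 table at the last sync and the table now, so that only changed profiles are pushed to downstream tools. The sketch below assumes profiles are held as dictionaries keyed by a profile ID; the `diff_for_sync` function and the `ltv` feature are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def diff_for_sync(
    previous: Dict[str, Row], current: Dict[str, Row]
) -> Tuple[List[Row], List[str]]:
    """Return (rows to upsert, ids to delete), keyed by profile id."""
    upserts = [
        {"id": pid, **row}
        for pid, row in current.items()
        if previous.get(pid) != row  # new profile or changed features
    ]
    deletes = [pid for pid in previous if pid not in current]
    return upserts, deletes


# Hypothetical snapshots of a customer 360 table with one user feature.
last_sync = {"u1": {"ltv": 100}, "u2": {"ltv": 5}}
now = {"u1": {"ltv": 120}, "u3": {"ltv": 1}}

to_upsert, to_delete = diff_for_sync(last_sync, now)
```

Sending only the difference keeps destination API usage low, which matters when the same profiles are activated in many tools.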

Built to help you deliver on your data strategy

Modern data leaders are rapidly adopting warehouse native architecture because it leverages the best ideas from both legacy CDPs and in-house builds for identity resolution to deliver a combination of benefits that no other approach can.

  • Complete, trustworthy data – with automated pipelines and the warehouse as the central, transparent data store, you can eliminate silos and the low-value engineering work of building custom infrastructure.
  • Flexibility and control – solving identity resolution with dedicated tooling on the data warehouse makes it easy for you to update your identity graph and user features to keep pace with the changing needs of your business while maintaining full control and visibility.
  • Privacy and security – It has never been more critical to protect sensitive customer data, and your identity graph will be full of it. Building on the modern data cloud allows you to leverage your own data store and all of its world-class security features.
  • Machine learning ready – A ready-made identity graph and rich set of user features in your data warehouse is a force multiplier for AI and ML teams, especially when they can immediately operationalize those in ML tools that are directly integrated in the same data cloud environment.

Try the Warehouse Native CDP today

Sign up for a free RudderStack account today to explore the Warehouse Native CDP.

May 31, 2023

Brooks Patterson

Product Marketing Manager

Eric Dodds

Senior Director of Product Strategy