What Is a Customer Data Platform (CDP)?

Learn everything there is to know about CDPs, including data ingestion, identity resolution, audience management, and data sharing.

Andrew Jesien

Luke Kline

August 19, 2022

13 minutes

Today, every martech vendor is trying to sell you a solution that promises a single unified view of your customer. The problem is that building and operationalizing a single unified customer profile is complex, and the ever-increasing number of first-party and third-party data sources only makes it harder. If you've ever had to deal with customer data at a company (regardless of its size or number of departments), you understand this pain all too well.

For marketers, an all-in-one platform that can solve countless data challenges sounds almost too good to be true. On the other hand, data teams have been trying to use the data warehouse for this exact purpose for years. Unfortunately, most people never make the connection between the two.

What is a CDP?

Chances are you're collecting behavioral data through a website, app, or server. The goal of a Customer Data Platform (CDP) is to collect and consolidate all of your individual customer data across all of your different data sources into a centralized and persistent database, so you can send it to your downstream business tools and build personalized experiences. Although CDPs are data tools, they're really personalization engines that specialize in helping marketers create multiple different customer touchpoints. At their core, CDPs ingest data, build segments and cohorts on top for specific audiences, and then distribute them.

A CDP is an all-in-one marketing and data platform that lets you store and consolidate all of your data into a centralized customer database so you can synthesize it into a single unified profile and make that data available to your other marketing systems and business applications.

[Figure: CDP Architecture]

Why Were CDPs Created?

This may be a bit controversial, but most CDPs were created by accident. Nearly every major CDP vendor stumbled into the category after starting out as a tangential component of a CDP solution (e.g., customer relationship management platforms, or "CRMs," infrastructure tools, databases, tag managers, or marketing tools). As these SaaS products expanded, they inevitably ended up building a persistent record of the customer. Once they reached a certain maturity level, they repositioned themselves as CDPs and expanded their platforms to integrate with other data sources and tools.

All CDPs are built around flexible APIs that let you ingest data into the platform. These APIs are also optimized to integrate with various SaaS applications and third-party tools, which means it's straightforward to send customer data to your various business tools.

Although CDPs are data tools, they're tailored toward marketers as a way to bridge the gap between data distributed across systems. CDPs were designed to help you build and segment custom audiences for personalized campaigns, but they're also used to automate business processes so you can create consistent customer experiences.

Until CDPs came around, developers had to build an API integration whenever they wanted to send data from one system to another. Managing and maintaining this at scale was nearly impossible. When CDPs came along, they introduced a single standardized API that made it easy to ingest data from any source and push it back out again.

How Does a CDP Work?

Although many CDP offerings are available on the market, all CDPs share several core components. Every CDP supports both sources and destinations. Sources represent the data flowing into the CDP, and destinations represent the end locations where that data is sent for activation.

Data Ingestion

Since CDPs are unified customer databases, they also need a way to ingest data. This is achieved through an exposed API that allows you to import events and attributes of your customers. Although most CDPs have some out-of-the-box capabilities to "listen" or pull data from other SaaS tools, the primary method of ingestion often involves tracking SDKs that need to be "instrumented" into your website or mobile apps directly by product engineers.
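
To make the ingestion step more concrete, here is a minimal Python sketch of the kind of event a tracking SDK might assemble before sending it to a CDP's ingestion API. The field names (`anonymous_id`, `user_id`, `event`, `properties`) and the `build_track_event` helper are illustrative assumptions, not any particular vendor's schema.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Optional


def build_track_event(
    event: str,
    anonymous_id: str,
    user_id: Optional[str] = None,
    properties: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble one behavioral event (hypothetical schema)."""
    if not event:
        raise ValueError("event name is required")
    if not anonymous_id and not user_id:
        raise ValueError("an event needs at least one identifier")
    return {
        "message_id": str(uuid.uuid4()),  # lets the receiver de-duplicate retries
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,
        "anonymous_id": anonymous_id,  # e.g. a cookie or device ID
        "user_id": user_id,  # stays None until the visitor signs up or logs in
        "properties": properties or {},
    }


payload = build_track_event(
    "Page Viewed",
    anonymous_id="anon-123",
    properties={"path": "/pricing"},
)

# A real SDK would batch events and POST them to the vendor's endpoint;
# this sketch only serializes the payload.
body = json.dumps(payload)
```

Carrying both an anonymous identifier and an optional known identifier on every event is what later allows pre-signup behavior to be tied to a customer profile.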

[Figure: CDP Data Ingestion]

Identity Resolution

Identity resolution is essential to any CDP because it allows you to unify your different customer data sets across ingestion channels. Most CDPs maintain identity graphs of user profiles to map unique identifiers like cookies, mobile advertising IDs, and PII (personally identifiable information) back to a single customer profile.

[Figure: CDP Identity Resolution]

For example, suppose a user visits multiple pages on your website and eventually signs up to place an order. In that case, you can link that anonymous customer behavior to a specific user profile – thus giving you a complete 360-degree view of every action in the customer journey. Beyond this web session, you will likely have data on this customer living in other places (e.g., your application database, support tools, etc.), so ideally, you're able to import that information into the CDP as well.
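
One simple way to picture the linking described above is a union-find structure in which identifiers observed together are merged into the same profile. The sketch below is a toy illustration under that assumption. The `IdentityGraph` class and the prefixed identifier strings are hypothetical, and production identity graphs typically add matching rules, confidence scores, and merge limits that are omitted here.

```python
from collections import defaultdict


class IdentityGraph:
    """Toy identity graph: identifiers seen together share one profile."""

    def __init__(self) -> None:
        self._parent: dict[str, str] = {}

    def _find(self, identifier: str) -> str:
        # Register unseen identifiers as their own single-member profile.
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path directly at the root.
        while self._parent[identifier] != root:
            next_id = self._parent[identifier]
            self._parent[identifier] = root
            identifier = next_id
        return root

    def link(self, a: str, b: str) -> None:
        """Record that two identifiers belong to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_profile(self, a: str, b: str) -> bool:
        return self._find(a) == self._find(b)

    def profiles(self) -> list[set[str]]:
        groups: dict[str, set[str]] = defaultdict(set)
        for identifier in list(self._parent):
            groups[self._find(identifier)].add(identifier)
        return list(groups.values())


graph = IdentityGraph()
graph.link("cookie:abc", "email:ada@example.com")  # signup ties the cookie to an email
graph.link("email:ada@example.com", "user:42")     # account created
graph.link("device:ios-9f", "user:42")             # later login from a phone
graph.link("cookie:zzz", "email:bob@example.com")  # a different customer

linked = graph.same_profile("cookie:abc", "device:ios-9f")
separate = not graph.same_profile("cookie:abc", "cookie:zzz")
profile_count = len(graph.profiles())
```

A pure merge structure like this one has no cheap way to split a profile once two identifiers have been linked, so a bad link is costly to undo.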

Audience Management

Without audience management, a CDP is just "Customer Data Infrastructure." CDPs are advantageous because they can consolidate and combine your customer data. Still, marketers need a way to orchestrate and manage that data, so all CDPs come equipped with an audience builder. An audience builder is a visual user interface that lets you define customer segments and personas without SQL.
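
Behind the visual interface, a segment usually boils down to a set of declarative conditions evaluated against each profile. Here is a rough Python sketch of that idea. The `Condition` type, the operator names, and the sample profile fields are all assumptions made for illustration rather than any vendor's actual rule format.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Comparison operators a point-and-click builder might expose (illustrative subset).
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "eq": lambda actual, expected: actual == expected,
    "gte": lambda actual, expected: actual is not None and actual >= expected,
    "lte": lambda actual, expected: actual is not None and actual <= expected,
}


@dataclass(frozen=True)
class Condition:
    field: str
    op: str
    value: Any


def in_audience(profile: dict[str, Any], conditions: list[Condition]) -> bool:
    """A profile qualifies only if every condition holds (logical AND)."""
    return all(
        OPERATORS[c.op](profile.get(c.field), c.value) for c in conditions
    )


profiles = [
    {"email": "ada@example.com", "plan": "pro", "orders_90d": 4},
    {"email": "bob@example.com", "plan": "free", "orders_90d": 7},
    {"email": "cy@example.com", "plan": "pro", "orders_90d": 1},
]

# "Pro customers with at least 3 orders in the last 90 days"
high_value_pro = [
    Condition("plan", "eq", "pro"),
    Condition("orders_90d", "gte", 3),
]

audience = [p["email"] for p in profiles if in_audience(p, high_value_pro)]
```

Because the rules are plain data, a UI can build and store them without the marketer ever writing a query.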

[Figure: CDP Audience Management]

Data Sharing

CDPs wouldn't be useful unless they helped you act on this data. In the same way that these platforms make it easy to ingest and collect data, they also make it easy to share data with your downstream go-to-market tools (e.g., Salesforce, HubSpot, Marketo, Iterable, Braze, etc.).

[Figure: CDP Data Sharing]

Types of CDPs

All CDPs fall into a few broad categories. The CDP Institute defines these categories as:

  • Data CDPs gather data from various source systems, link it to specific customer identities, and make it available to external business applications via audience segments.
  • Analytics CDPs offer general data assembly and collection, but their capabilities extend to machine learning, journey mapping, predictive modeling, and revenue attribution.
  • Campaign CDPs are focused solely on segmentation. Their core capabilities center around analytics and customer treatments. They're used to orchestrate customer interactions across marketing channels (e.g., personalized messages, outbound marketing campaigns, real-time interactions, recommendations, etc.).
  • Delivery CDPs provide all of the capabilities of a conventional CDP, but they specialize in message delivery (e.g., email, website, mobile apps, advertising platforms, CRMs, etc.).

CDP Use Cases

The end goal for all CDPs is Data Activation. That means synthesizing and organizing your data and making it available to the tools that run your business processes. CDPs offer a ton of really cool features, but to fully realize their value, it's essential to look at some use cases they can help solve. Here are a few questions for you to think about:

  • What happens if your sales team wants access to product usage data (e.g., last login date, workspaces created, messages sent, pages viewed) in HubSpot or Salesforce?
  • How do you know your sales reps are prioritizing the right leads? What actions do potential customers take?
  • What do you do if your marketing team wants to upload a list of high-value customers to Google Ads for retargeting or lookalike audiences?
  • What happens if they want to build a lifecycle marketing campaign for specific customer actions, or even personalize in-app experiences with product recommendations?
  • What can you do to reduce customer churn and increase customer retention? How can you ensure your customer success team spots red flags before they become problems and prioritizes the right tickets?
  • How can you reduce the time your data team spends building and maintaining API integrations between your various tools so they can focus on driving business value?

These are just a few reasons why you might need a CDP, but if you're still reading this post, you're probably searching for an easy solution to keep data in sync across all your business applications. Either way, there are a few factors you'll want to consider before buying a CDP.

The Downside of CDPs

Most people don't know this, but most CDPs are built on top of data platforms like Google BigQuery and Snowflake. In other words, they simply provide a UI on top of another cloud provider's infrastructure. In addition, CDPs only support event collection and ingestion, which means they're not a reliable source of truth. CDPs have dominated the customer data market until now, but they're becoming less viable because they have several fundamental flaws.

CDPs Are Not a Single Source of Truth

Whether you're a D2C brand, B2B SaaS company, an e-commerce marketplace, or even a massive bank like Capital One, there's a high probability that all of your data already lives in your data warehouse.

You're probably already using the customer data in your warehouse to power analytics use cases for your key stakeholders. After all, this is the location where your data team lives daily, modeling and transforming your data to create actionable insights.

CDPs claim to be a single source of truth, but CDPs don't replace data warehouses. There's nothing about having separate databases of customer information for different departments that spells "single source of truth."

[Figure: Warehouse vs. CDP: Two Sources of Truth]

To build a complete 360-degree customer profile, you'll inevitably need to leverage the data in your warehouse. While some CDPs support importing data from your data warehouse, doing so adds latency and creates data freshness problems.

CDPs Are Expensive

Pricing for CDPs is usually based on the number of customer records, which means you pay based on volume. You're also forced to pay for an additional storage layer even though all of your data already lives in your warehouse. And if you want to take advantage of any of the other cool features that CDPs offer, specifically around identity resolution and audience management, you'll be forced to factor this into your overall costs as well.

CDPs Are Not Flexible

Most CDPs are built around rigid data models and usually only offer two core objects: users and accounts. Chances are, your data models are not so cookie-cutter. In many scenarios, users can belong to multiple accounts, and accounts can have subaccounts. You’ll probably need to associate your users with other entities like subscriptions, playlists, workspaces, orders, products, IoT devices, etc. Doing this in a CDP that enforces its own proprietary data model is nearly impossible.

[Figure: Rigid CDP Data Models]

CDPs Own Your Data

CDPs store data outside of your own cloud infrastructure and offer restricted access to your customer data. They only expose very specific actions that are purpose-built for marketing workflows. Since all of the components of a CDP are tightly bundled together, you're locked in and subject to the whims of your CDP vendor in terms of how you can use your data. And that's before considering data privacy and data residency requirements under regulations like GDPR, CCPA, or HIPAA.

CDPs Are Siloed

Since CDPs are tightly bundled solutions, they don't always integrate nicely with the rest of your technology stack. For example, if you send a batch of bad customer events to a CDP, you're limited to the features the platform offers to clean your data set. In many cases, the transformations you need don't exist, so you're forced to file a support ticket. And once you've spun up a CDP, there is no undo or unmerge button for your data, which means you have to either delete your whole instance or reconfigure your settings and reload all of your historical events.

CDPs Don’t Produce Immediate Value

Implementing a CDP can take over six months, and you have to train your data team and marketing team to use a new tool. Since CDPs don't replace data warehouses, your data teams are required to manage and maintain a second source of truth, and your marketing team has to switch between a CDP and a Marketing Cloud to orchestrate and segment audiences. It doesn't make sense to force your data and marketing teams to perform the same tasks twice in two separate applications or learn the proprietary nuances of a CDP.

When Does a CDP Make Sense?

If you have zero cloud infrastructure and minimal engineering resources, a CDP is probably the highest value product you can implement to have an immediate impact. However, since CDPs are not built for analytics use cases, you'll inevitably need a data warehouse, which means you'll eventually have to decide where you want your single source of truth to be. Is it a data warehouse, or is it a CDP? Adopting a CDP is only a quick fix, and eventually, you'll need a more flexible solution that scales as your business grows.

Why the Warehouse Should Be Your CDP

The number one reason your data warehouse should be your CDP is that it already is one. The most accurate customer profile already lives in your warehouse because it houses all of your data, not just events. It might not have all of the out-of-the-box functionality of a conventional CDP offering, but thanks to innovations in the modern data stack, you really just need a way to sync the data in your warehouse to your various SaaS applications and digital channels.

[Figure: A Modern Data Stack & Reverse ETL]

This is precisely where Hightouch, a Data Activation platform powered by Reverse ETL, comes into play. Hightouch queries your warehouse directly and syncs data to more than 100 destinations. It supports any custom data model: you can select specific tables, write SQL, or leverage your existing data models through dbt.

[Figure: Hightouch Modeling Methods]

Once you've defined your modeling method, all you have to do is map the appropriate objects to the proper columns/fields in your end destination.
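
Conceptually, the mapping step is a rename from warehouse columns to destination fields. The Python sketch below illustrates that idea only; it is not Hightouch's API. The `map_row` helper, the column names, and the Salesforce-style custom field names (`Last_Login__c`, `Workspaces__c`) are all hypothetical.

```python
from typing import Any


def map_row(row: dict[str, Any], mapping: dict[str, str]) -> dict[str, Any]:
    """Rename warehouse columns to destination fields, skipping null values."""
    missing = [col for col in mapping if col not in row]
    if missing:
        raise KeyError(f"columns not in model output: {missing}")
    return {
        dest: row[col] for col, dest in mapping.items() if row[col] is not None
    }


# Warehouse column -> destination field (hypothetical names).
mapping = {
    "email": "Email",
    "last_login_at": "Last_Login__c",
    "workspace_count": "Workspaces__c",
}

# Rows as they might come back from a model query; unmapped columns are ignored.
rows = [
    {"email": "ada@example.com", "last_login_at": "2022-08-01",
     "workspace_count": 3, "internal_score": 0.9},
    {"email": "bob@example.com", "last_login_at": None,
     "workspace_count": 1, "internal_score": 0.2},
]

records = [map_row(row, mapping) for row in rows]
```

Only mapped columns reach the destination, so internal fields such as `internal_score` stay in the warehouse. Skipping nulls is a design choice in this sketch; a real sync might instead clear the destination field.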

[Figure: Mapping Data in Hightouch]

Hightouch even offers a visual audience builder so your non-technical users can easily build segments using the data in your warehouse. You can run your syncs manually or schedule them with a cron expression, on a set interval, or after your dbt jobs have finished running in your warehouse.

[Figure: Hightouch Audiences]

With Hightouch, you can take advantage of your current tech stack and replicate or segment the data in your warehouse to your downstream business tools. You can create a free Hightouch workspace and start activating your data today!

