Face it: Your data isn't perfect. And that's OK.

Perfectionism stops you from getting value from your data today. You might end up waiting a lifetime for a “single source of truth.”

Kashish Gupta

May 31, 2023

10 minutes

Messy data is the norm, not the exception

I talk to hundreds of data and marketing professionals every year, and the most frequent complaint about customer data I hear from teams all around the world is: “Our data isn’t ready.” It’s never organized enough or documented enough. It’s spread across too many tables. It’s spread across too few tables.

I used to just shrug my shoulders and say, “Cool, I’ll check back in 6 months.” Inevitably, I’d come back 6 months later and hear, “Our data still isn’t ready.” A few months after that, I’d hear it again: “Our data still isn’t ready.”

Here’s what I’ve learned: all of these people are doing their best and want to help their companies, but their data will never be ready.

There is no perfect clean end state of data maturity that businesses can attain. Nirvana isn’t the perfect customer 360 or golden record. Nirvana is actually getting something done that helps your business.

In this post I’m going to explore why teams feel like they can’t ever get their data ready enough to activate, why the standard approach of starting from scratch fails, and what to do about it instead.

Why can’t we ever perfect our data?

Data optimization is an eternal race, driven ever forward by two trends: growing data demands and evolving technology.

The amount of data in the world is doubling every two years. Any project to clean and standardize data within a company almost immediately becomes out-of-date as the company's offerings and consumer interactions continue to evolve. That doesn’t mean these projects aren’t worthwhile; it just means it’s important to take an iterative approach.

Companies may start with simple products and customers, but complexity grows as offerings diversify and customers return. The company works to better personalize advertising and lifecycle marketing, requiring them to better track each interaction at individual and household levels. The company expands into new channels, and works harder and harder to figure out the right way to solve customer attribution. A thriving company is ever-changing, and so its data needs continue to evolve as well.

Companies have access to better data technology than ever before. The emergence of the cloud data warehouse (Snowflake, Databricks, Google BigQuery, etc.) has enabled companies to store enormous amounts of data at minimal cost, paying only for the queries run. This simply wasn’t possible before. In reality, few companies have fully centralized every bit of data in a single data warehouse, but at the same time, the data warehouse is often the most complete source of truth in the organization. This is yet another example of the always “in progress” nature of data technologies.

I'll be honest with a very personal example: the data Hightouch uses for marketing and sales is far from perfect. We often run into questions we can't yet answer because the proper tracking isn't in place, and we still face gaps in connecting different data sources. Even a technology company with a mission focused on data activation has an ever-evolving internal data strategy. A little bit of messiness in data is the norm, not the exception.

Starting from scratch isn’t the best approach

Given the apparent mess of your current data, you may be tempted to clean house and find a new approach. Customer Data Platforms (CDPs) offer companies a fresh start for their customer data. CDPs aim to give marketers and other business users access to timely customer data in their tools like ad and email platforms. CDPs operate by collecting new customer data, transforming that data with operations like identity resolution, and activating that data to marketers' downstream tools.

There’s one wart: traditional CDPs can only operate on data they understand. Today, to onboard with a traditional CDP, you must re-collect your data in the CDP’s way. You'll spend months creating a standardized set of customer data, which sounds great: a clean start to get to clean data. However, this approach suffers from three key drawbacks:

  1. CDPs can only operate on a standard set of data. Your business is unique and so is your data. A one-size-fits-all framework rarely works. For example, CDPs are oriented around customers and behavioral events (e.g. “Page Viewed”). If you want to leverage other entities, such as households or accounts, or custom data like product inventory or your customers’ pets, you’re out of luck.
  2. CDPs will only ever have a fragment of your data. CDPs provide great tools to help you collect raw behavioral event data from your websites and mobile apps, but that’s only a portion of a business’s data. With only a fragmented view of customers, marketing teams miss out on the potential to deliver personalized experiences using valuable data from point-of-sales systems, backend systems, data science models, and other critical data sources.
  3. CDPs create multiple sources of truth. Marketers use CDPs for activation, but marketing data needs don’t end there. BI tools like Tableau or Looker that sit on top of a data warehouse are an integral part of every marketing stack for reporting and analytics. This means that marketing activation relies on a different set of data than marketing analytics (and the rest of the business!), which causes confusing discrepancies and more room for error.

Bundled CDP Architecture

CDPs weren’t wrong. They were built in a different time, when companies had access to different technology. The CDP category emerged in 2013, before cloud data warehouses like Snowflake, Databricks, Google BigQuery, and Amazon Redshift enabled companies to collect and prepare all their company’s data. CDPs served a critical need when they emerged by enabling marketing teams to activate clean customer data, but the modern data stack has since outgrown them.

The inflexibility of CDPs, coupled with the data silos they create, helps explain why only 58% of companies with a CDP say it drives significant value, according to the CDP Institute. Rather than spend months implementing a CDP as a separate band-aid data solution, you can begin activating the data you already have today and get immediate value from it.

How you can activate your data today (and improve it as you go)

More likely than not, there is existing data around your business that you know is valuable. By simply activating this, you can start powering your company’s operational and marketing use cases today, like syncing audiences of purchasers to ad platforms to suppress them from future advertising.
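To make the suppression example concrete, here is a minimal sketch of the audience logic itself, written against an in-memory SQLite database. The `customers` and `orders` tables, their columns, and the `build_purchaser_suppression_list` helper are all hypothetical stand-ins for whatever your warehouse actually holds; this is not Hightouch's API. The point is that the data can be messy (stray whitespace, mixed case, missing emails) and still be good enough to act on.

```python
import sqlite3


def build_purchaser_suppression_list(conn):
    """Return normalized emails for every customer with at least one order.

    Table and column names are assumptions made for this illustration.
    """
    rows = conn.execute(
        """
        SELECT DISTINCT LOWER(TRIM(c.email)) AS email
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        WHERE c.email IS NOT NULL AND TRIM(c.email) <> ''
        ORDER BY email
        """
    ).fetchall()
    return [row[0] for row in rows]


# Imperfect sample data: padded and mixed-case emails, a missing email,
# and a customer who has never purchased.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES
        (1, ' Ada@Example.com '),
        (2, 'bob@example.com'),
        (3, NULL),
        (4, 'cy@example.com');
    INSERT INTO orders VALUES
        (10, 1, 25.0),
        (11, 1, 40.0),
        (12, 2, 15.0),
        (13, 3, 9.0);
    """
)

suppression_list = build_purchaser_suppression_list(conn)
print(suppression_list)
```

The customer with no email is dropped rather than blocking the whole audience, and the repeat purchaser appears only once. Small normalizations like these are usually enough to get a first audience out the door.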

Rather than solving each use case as a one-off technical build or, worse, manually exporting data from your system and uploading CSVs elsewhere, you can use a Data Activation platform like Hightouch to automatically sync customer data, such as audiences, from your sources (most commonly your data warehouse) to 150+ destinations, without writing a line of code.

Activating Audiences with Hightouch

Ultimately you'll develop a virtuous cycle by using the data you have today, identifying the data that would further enhance your work, and improving your underlying data systems as you go to meet your new needs.

  1. Connect Hightouch to your data, wherever it’s stored. Hightouch can access data stored in 30+ sources, ranging from a true cloud warehouse like Snowflake to ad-hoc solutions like Google Sheets. Wherever your data is, we’ll make it painless for you to use it within just a few minutes.
  2. Define Customer Audiences. Hightouch’s Customer Studio allows anyone to explore, create, and analyze customer audiences based on values already in your data set, with no coding or SQL required. Alternatively, if you’ve already defined audiences in existing SQL queries or in BI tools like Sigma, Tableau, or Looker, you can import them with a single click.
  3. Activate Audiences to Tools. Once you’ve created an audience, you can sync it to 150+ AdTech and MarTech tools. Whether you want to use a customer list in Google Ad Manager or HubSpot, we’ll keep each audience up-to-date based on the criteria you define.
  4. Run Campaigns. Put those audiences to work in your downstream tools.
  5. Measure Success. What results were you able to drive with these campaigns?
  6. Think of New Use Cases. When you start to see the results of your initial audiences, you’re going to want to make more and try new approaches. For many of your potential use cases, you probably already have the data you need, so you can go right back into Hightouch to define a new audience.
  7. Improve Data Collection and Modeling. For use cases that your current data collection or modeling can’t support, you can collaborate across business and data teams to invest in your core internal company data. You improve and clean your data warehouse as needed, only when you will be able to immediately see a return from that investment.

The Data Activation Virtuous Cycle

Hightouch enables business users to extract as much value as possible from your company's current data while building more advanced use cases one step at a time. The gaps in data collection that emerge as you explore more sophisticated use cases will drive reinvestment in your underlying data ecosystem, creating a positive feedback loop of business-oriented data investments that drive immediate value.

Using Hightouch to Organize Data

Let’s be real: most data warehouses look like this, or even more complex:

Complex data structure

Source: https://docs.oracle.com/cd/E36434_01/ACI.10-1-2/ATGDataWarehouse/html/s0836searcherd01.html

It’s overwhelming for marketers, advertisers, and other business users to know which tables and fields to rely on for decision making.

Hightouch’s Customer Studio allows data teams to define which tables and relationships matter most for their business users. From this initial schema, business users can build hundreds of audiences, powered by data that is vetted and organized within the Hightouch interface.

Hightouch Schema Builder

This powerful schema builder allows data-savvy users who set up Hightouch to focus initially on exposing the most essential data. We have companies using Hightouch that have 10k+ tables in their warehouse. Those companies use the schema builder to select the handful of tables that matter for their specific use-cases and then create relationships between each entity to connect them for effective audience building.
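The idea of exposing a handful of vetted tables and the relationships between them can be sketched in a few lines. This is purely illustrative and is not how Hightouch's schema builder is implemented or configured; the `AudienceSchema` and `Relationship` classes and every table name below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """A join path between two exposed tables (all names are hypothetical)."""
    parent: str
    child: str
    parent_key: str
    child_key: str


@dataclass
class AudienceSchema:
    """The small, vetted slice of a warehouse that business users can see."""
    tables: set = field(default_factory=set)
    relationships: list = field(default_factory=list)

    def expose(self, table: str) -> None:
        self.tables.add(table)

    def relate(self, parent: str, child: str, parent_key: str, child_key: str) -> None:
        # Refuse joins to tables the data team has not explicitly vetted.
        if parent not in self.tables or child not in self.tables:
            raise ValueError("expose both tables before relating them")
        self.relationships.append(Relationship(parent, child, parent_key, child_key))

    def children_of(self, table: str) -> list:
        return sorted(r.child for r in self.relationships if r.parent == table)


# A warehouse with thousands of tables, most irrelevant to audience building.
warehouse_tables = [f"staging.tmp_{i}" for i in range(10_000)]
warehouse_tables += ["analytics.customers", "analytics.orders", "analytics.households"]

schema = AudienceSchema()
for name in ("analytics.customers", "analytics.orders", "analytics.households"):
    schema.expose(name)
schema.relate("analytics.customers", "analytics.orders", "id", "customer_id")
schema.relate("analytics.households", "analytics.customers", "id", "household_id")

print(len(schema.tables), "of", len(warehouse_tables), "tables exposed")
print(schema.children_of("analytics.customers"))
```

Keeping the exposed set explicit means the thousands of staging tables never reach business users, and nobody has to clean them up before audience building can start.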

The schema builder is just one more example of how Hightouch enables everyone to act on their data and extract value from it, regardless of the state of the underlying raw data tables.

Getting Started

Traditional CDPs won't fix your data. They'll force you to collect new data following their methods. You won't be able to act on that new data for months, and it will be separate from the already valuable and likely messy data you've been collecting for years. By activating your data as it is today with Hightouch, you'll start getting value immediately and see the most impactful reinvestments you can make in your existing data ecosystem to reap benefits as you go.

Hightouch is building a data future of Composable CDPs. Rather than buying a one-size-fits-all data solution, you accomplish the true goal of a CDP (actionable customer data in all of your tools) by activating directly from your existing stores of customer data. If you have any metric you can rely on today, you have data that’s worth acting on and monetizing in your downstream tools. Speak with one of our solutions engineers to see how we can help you untangle your data today.
