Understanding Data Enrichment: A Comprehensive Guide | Hightouch
Learn what data enrichment is, why it matters, and how you can implement it today.
July 8, 2022
Over the last five years, the technology landscape has changed dramatically. Today, nearly every organization is collecting data from numerous disparate systems and consolidating that information into a centralized data warehouse.
The rapid adoption of the modern data stack has completely changed the game when it comes to analytics and Business Intelligence (BI) tools, which have made it easier than ever to consume that data.
These days, you’ve got more data at your disposal than ever before. If you only ever consume this data through a dashboard, though, you’re missing out on huge potential: with data enrichment, you could be activating this data across your organization and empowering your business teams.
What is Data Enrichment?
The easiest way to understand data enrichment is to look at the various SaaS applications across your organization. At a basic level, you can use a customer relationship management (CRM) platform like Salesforce or Hubspot for both sales and marketing purposes. Most of the time, these platforms collect and house high-quality data on various objects like contacts, companies, and deals. Each object typically has its own set of fields. Here are a few examples.
- Contacts: first name, last name, email address, job title
- Deals: deal size, deal owner, deal stage
- Companies: revenue, domain, account owner
All SaaS tools operate similarly, with the only difference being in the data they store and the use cases they solve. In its simplest form, data enrichment is the process of enhancing existing datasets and tools with 1st and 3rd-party data. Data enrichment creates a golden customer record and enables your business teams to access customer data in near real-time in the tools they rely on.
Data enrichment democratizes the customer data sets that live in your warehouse and creates a single unified view across your entire organization, enabling your business teams to leverage it on a day-to-day basis and drive meaningful business goals. Emphasizing data enrichment processes improves data accuracy by adding additional context to your original dataset, and this translates into more customer personalization.
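To make the idea of a golden customer record concrete, here is a minimal sketch in Python. It assumes contacts are matched on a lowercased email address and that existing CRM values should never be overwritten. The function name, field names, and sample data are all hypothetical and only serve the illustration.

```python
def enrich_contact(crm_contact, first_party, third_party):
    """Return a copy of a CRM contact with extra fields added, matched by email."""
    email = crm_contact["email"].lower()
    enriched = dict(crm_contact)  # copy so the original record is untouched
    # Assumption: fields already present on the record win, so enrichment
    # only fills in gaps. First-party data is applied before third-party data.
    for source in (first_party, third_party):
        for field, value in source.get(email, {}).items():
            enriched.setdefault(field, value)
    return enriched


# Hypothetical sample data
crm_contact = {"email": "Ada@example.com", "first_name": "Ada", "job_title": "CTO"}
warehouse_rows = {"ada@example.com": {"last_login": "2022-07-01", "messages_sent": 42}}
vendor_rows = {
    "ada@example.com": {
        "company_revenue": 5_000_000,
        "job_title": "Chief Technology Officer",
    }
}

golden_record = enrich_contact(crm_contact, warehouse_rows, vendor_rows)
```

Whether existing values or incoming values should win on a conflict is a design choice. This sketch keeps the CRM value, but a real pipeline might prefer the warehouse as the source of truth.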
Why is Data Enrichment Important?
Reports and dashboards help identify trends and provide deeper insights so you can make data-informed business decisions, but they are far less useful when you need to act on your data. They only show a zoomed-out view of your customers; the data is not usually associated with individual users or customers.
For example, you might have a dashboard that shows the total products ordered in the last 30 days or recent signups, but if this information is not available at the individual level, what good is it? What happens if you want to answer the following questions?
- Who is the most active user in an account?
- Which customers have viewed X page?
- What songs did user X listen to in the last week?
- How many messages did X account send in the last 30 days?
- What is the income level of customers who purchased X product?
To answer these questions, your business teams are forced to hop back and forth between various tools because the data warehouse houses your key business logic, and it’s only accessible to the technical members of your analytics team who can write SQL. Data enrichment ensures that the single source of truth that lives in your warehouse is made readily available to every team in your business, giving them a complete profile and historical record of every customer or user that’s touched your business.
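As an illustration of the kind of SQL these questions require, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a data warehouse. The `messages` table, its columns, and the sample rows are hypothetical, and the reference date is fixed so the result is reproducible.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (account_id TEXT, user_id TEXT, sent_at TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        ("acme", "u1", "2022-07-01"),
        ("acme", "u1", "2022-07-03"),
        ("acme", "u2", "2022-06-20"),
        ("acme", "u2", "2022-05-01"),  # outside the 30-day window
        ("globex", "u9", "2022-07-02"),
    ],
)

AS_OF = "2022-07-08"  # fixed reference date instead of the current date

# How many messages did each account send in the last 30 days?
messages_per_account = conn.execute(
    """
    SELECT account_id, COUNT(*) AS messages_30d
    FROM messages
    WHERE sent_at >= date(?, '-30 days')
    GROUP BY account_id
    ORDER BY account_id
    """,
    (AS_OF,),
).fetchall()

# Who is the most active user in a given account?
most_active_user = conn.execute(
    """
    SELECT user_id, COUNT(*) AS messages_30d
    FROM messages
    WHERE account_id = ? AND sent_at >= date(?, '-30 days')
    GROUP BY user_id
    ORDER BY messages_30d DESC
    LIMIT 1
    """,
    ("acme", AS_OF),
).fetchone()
```

Queries like these are simple for an analyst, but their results only help a sales or support rep once they appear as fields in the tools that rep already uses.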
Types of Data Enrichment
Before adopting a data enrichment strategy, you need to understand the difference between first-party data and third-party data. First-party data is data that you own and collect yourself across your various internal and external sources, whether it’s from your website, app, SaaS tools, or an internal database.
Third-party data is data that is collected by an external party. In many cases, it is purchased from specific providers. A great example of this is the Snowflake Marketplace. Within these two categories, there are three core types of data enrichment:
- Behavioral Data Enrichment
- Demographic Data Enrichment
- Geographic Data Enrichment
Behavioral Data Enrichment
Behavioral data enrichment focuses on adding customer behavioral patterns to existing user profiles in your various SaaS applications. Behavioral data usually takes place on your website or in your app, representing the key events your customers or users are taking in the buying cycle.
For the most part, behavioral data has been collected via third-party cookies and web pixels, but recent changes in consumer privacy have pushed many companies toward behavioral data platforms like Snowplow.
Web events can include pages viewed, links clicked, session length, and items added to cart. App data or product usage data, on the other hand, can include anything from:
- Last login date
- Messages sent
- Signup date
- Workspaces created
- Subscription type
Behavioral data enrichment can also include data from your marketing tools or ad platforms (e.g. email open or ad click). These are just a few examples, but you can probably fill in the blank for what is most relevant to your business. Behavioral data enables you to target customers based on the unique actions they are taking in the customer journey.
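As a rough sketch of how raw events become profile fields such as last login date and messages sent, the Python example below rolls an event stream up into one record per user. The event shape, the event type names, and the output field names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical raw events, e.g. as collected from an app or website
events = [
    {"user_id": "u1", "type": "login", "at": "2022-07-01"},
    {"user_id": "u1", "type": "message_sent", "at": "2022-07-02"},
    {"user_id": "u1", "type": "message_sent", "at": "2022-07-03"},
    {"user_id": "u1", "type": "login", "at": "2022-07-05"},
    {"user_id": "u2", "type": "login", "at": "2022-06-15"},
]


def build_behavioral_traits(events):
    """Aggregate events into per-user traits that can be synced to a profile."""
    traits = defaultdict(lambda: {"last_login_date": None, "messages_sent": 0})
    for event in events:
        user = traits[event["user_id"]]
        if event["type"] == "login":
            # ISO 8601 date strings sort chronologically, so plain string
            # comparison is enough to keep the latest login.
            if user["last_login_date"] is None or event["at"] > user["last_login_date"]:
                user["last_login_date"] = event["at"]
        elif event["type"] == "message_sent":
            user["messages_sent"] += 1
    return dict(traits)


traits = build_behavioral_traits(events)
```

In practice this aggregation would more likely live in a SQL model in the warehouse. The logic is the same either way.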
Demographic Data Enrichment
Demographic data enrichment is much broader and focuses on enriching the data around your customer datasets. Demographic data can include anything from:
- Job title
- Deal stage
- Marital status
- Physical address
- Income level
- First meeting
- Lifetime value
- Average order value
- Number of orders
The easiest way to understand demographic data is to think of it as metadata about your customers or users. Demographic data gives your business users a full picture of the customer.
Geographic Data Enrichment
Geographic data enrichment is related to different customer addresses, countries, cities, zip codes, IP addresses, or even time zones. With geographic data enrichment, you can target specific geographic groups and build personalized experiences for specific locations based on the interest in that area.
Data Enrichment Use Cases
There is a near-limitless number of use cases you can solve with enriched customer data, but the core ones tend to center on business teams like sales, marketing, and support. Enriching the SaaS applications these teams rely on with data directly from your warehouse can have huge benefits.
Sales is probably one of the most popular use cases for data enrichment because there are so many different benefits. For example, by enriching your CRM with key behavioral data and events you can power your product-led-growth sales motion, giving your sales team the ability to monitor product usage data for upsell opportunities, identify red flags to reduce churn, and build meaningful customer relationships.
You can even simplify this process further by setting up a lead/account scoring model to filter for qualified leads. Since platforms like Hubspot give you the ability to automate different business processes through custom workflows, you can automatically route your leads to the appropriate reps and notify them when an action needs to be taken (Vendr uses Reverse ETL for this exact purpose). And since CRMs allow you to automate every touchpoint, you can build personalized actions for every step in the buying process.
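A lead scoring and routing model like the one described above can be sketched in a few lines of Python. The weights, the threshold, the field names, and the region-based routing rule are all hypothetical. A real model would be tuned to your own product and sales process.

```python
# Assumed weights: points awarded per unit of each product-usage signal
SCORE_WEIGHTS = {"workspaces_created": 10, "messages_sent": 1, "pricing_page_views": 5}
QUALIFIED_THRESHOLD = 50  # assumed cut-off for a qualified lead


def score_lead(lead):
    """Weighted sum of usage signals; signals missing from the lead count as 0."""
    return sum(weight * lead.get(field, 0) for field, weight in SCORE_WEIGHTS.items())


def route_lead(lead, reps_by_region):
    """Return the rep to notify, or None if the lead is not yet qualified."""
    if score_lead(lead) < QUALIFIED_THRESHOLD:
        return None
    return reps_by_region.get(lead["region"], reps_by_region["default"])


reps_by_region = {"emea": "rep_emea", "default": "rep_general"}
hot_lead = {"region": "emea", "workspaces_created": 3, "messages_sent": 20, "pricing_page_views": 2}
cold_lead = {"region": "apac", "workspaces_created": 1, "messages_sent": 5}

hot_owner = route_lead(hot_lead, reps_by_region)
cold_owner = route_lead(cold_lead, reps_by_region)
```

Once the score is synced to the CRM as a field, the routing step itself can be handled by the CRM's own workflow automation rather than by custom code.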
Ad targeting and ad spend are probably among the top use cases for data enrichment, especially as they relate to improving your return on ad spend (ROAS) and lowering your customer acquisition costs (CAC).
Since ad platforms are largely based on machine learning algorithms, you constantly have to feed information into them. When you're able to share more data about your target users and existing customers, your ad targeting ends up being more accurate and yields better results. For example, by enriching Google Ads with custom audiences and offline conversions, Lucid saw a 52% increase in ROAS and a 37% increase in new users.
Data enrichment can also have a huge impact on lifecycle marketing campaigns because it gives you the ability to manage and create customized campaigns across various touchpoints.
For example, if a user adds an item to their cart but never finishes the order, you might want to push an SMS notification to their phone or send them an email letting them know that they never completed their order. With data enrichment, all of this can be done in real-time using the marketing automation platform of your choice.
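The abandoned-cart logic can be sketched as follows. This is a simplified batch version in Python, not a real-time implementation: it assumes hypothetical `item_added_to_cart` and `order_completed` events and a one-hour threshold, and it only returns the users who should receive a reminder. Sending the SMS or email would be left to the marketing automation platform.

```python
from datetime import datetime, timedelta

ABANDON_AFTER = timedelta(hours=1)  # assumed waiting period before a reminder


def find_abandoned_carts(events, now):
    """Return user IDs whose latest cart addition has no later completed order."""
    last_add = {}
    last_order = {}
    for event in events:
        if event["type"] == "item_added_to_cart":
            bucket = last_add
        elif event["type"] == "order_completed":
            bucket = last_order
        else:
            continue
        user_id = event["user_id"]
        # Keep only the most recent timestamp per user and event type.
        bucket[user_id] = max(bucket.get(user_id, event["at"]), event["at"])

    abandoned = []
    for user_id, added_at in last_add.items():
        ordered_at = last_order.get(user_id)
        no_order_since = ordered_at is None or ordered_at < added_at
        if no_order_since and now - added_at >= ABANDON_AFTER:
            abandoned.append(user_id)
    return sorted(abandoned)


events = [
    {"user_id": "u1", "type": "item_added_to_cart", "at": datetime(2022, 7, 8, 10, 0)},
    {"user_id": "u2", "type": "item_added_to_cart", "at": datetime(2022, 7, 8, 10, 0)},
    {"user_id": "u2", "type": "order_completed", "at": datetime(2022, 7, 8, 10, 10)},
    {"user_id": "u3", "type": "item_added_to_cart", "at": datetime(2022, 7, 8, 11, 30)},
]

users_to_remind = find_abandoned_carts(events, now=datetime(2022, 7, 8, 12, 0))
```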
Customer support teams often have trouble prioritizing tickets, and being able to quickly troubleshoot, debug, and solve problems for your customers can greatly reduce churn.
By enriching your support tools with data directly from your warehouse, your support teams no longer have to wait for that data to be keyed in manually. Without this wait, they can improve ticket prioritization by leveraging historical data to ensure they are focusing their efforts on the tickets that have the highest impact on your underlying business.
For example, by associating company data with individual tickets, your support teams could prioritize tickets based on business tier (e.g. free, professional, enterprise) or even revenue.
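A minimal sketch of that prioritization in Python: each ticket is joined to its company record, then sorted by business tier and, within a tier, by revenue. The tier ordering, the field names, and the sample data are assumptions.

```python
# Lower rank means higher priority (assumed ordering of the tiers)
TIER_RANK = {"enterprise": 0, "professional": 1, "free": 2}


def prioritize(tickets, companies):
    """Sort tickets by company tier, then by revenue (highest first), then by ID."""

    def sort_key(ticket):
        company = companies.get(ticket["company_id"], {})
        # Unknown companies or tiers sort after every known tier.
        tier_rank = TIER_RANK.get(company.get("tier"), len(TIER_RANK))
        return (tier_rank, -company.get("revenue", 0), ticket["id"])

    return sorted(tickets, key=sort_key)


companies = {
    "c_free": {"tier": "free", "revenue": 0},
    "c_pro": {"tier": "professional", "revenue": 20_000},
    "c_ent": {"tier": "enterprise", "revenue": 100_000},
    "c_ent2": {"tier": "enterprise", "revenue": 250_000},
}
tickets = [
    {"id": 1, "company_id": "c_free"},
    {"id": 2, "company_id": "c_ent"},
    {"id": 3, "company_id": "c_pro"},
    {"id": 4, "company_id": "c_ent2"},
]

queue = [ticket["id"] for ticket in prioritize(tickets, companies)]
```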
Data Enrichment Tools
Now that you have a good understanding of why you should enrich your data, it’s time to look at the enrichment services and tools available in the market to help you do so. When it comes to moving data out of your warehouse and sending it to your downstream go-to-market tools, there are generally three main approaches used to tackle the enrichment process in the context of a modern data stack:
- Customer data platforms (CDPs)
- Integration Platform as a Service (iPaaS)
- Reverse ETL
Customer data platforms (CDPs) give you the ability to consolidate all of your customer data into a centralized marketing platform where you can easily build custom audiences and do identity resolution. The main advantage of CDPs is the fact that they automatically integrate with other 3rd-party APIs, making it very easy to push data into the hands of your business users in their preferred tools.
CDP Architecture Example
All CDPs have a few core problems though. Since you likely already have a data warehouse, adopting a CDP simply creates another source of truth. These platforms force you to pay for an additional storage layer outside your own cloud infrastructure.
They’re also built around rigid data models, which means you can only send data on specific objects like users and accounts. This can be quite problematic if you have unique data models (e.g. workspaces, subscriptions, playlists, or artists).
Integration Platform as a Service (iPaaS) platforms are point-to-point, moving your data from point “A” to “B” (e.g. sending Salesforce Data to Braze for lifecycle marketing.) Since your data is simply being moved from one system to another, it’s nearly impossible to create a 360-degree view of your customer.
All iPaaS tools are based on event triggers. A trigger represents an event that takes place in one of your individual systems (e.g. creating a new deal in Salesforce.) That event is then transmitted to the integration platform through an API call or Webhook that performs predefined actions set in place by the user.
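The trigger-and-action pattern can be sketched as a small dispatcher in Python. Everything here is hypothetical: the trigger name, the payload fields, and the `sent_to_braze` list that stands in for a real API call to the destination. It does not reflect any specific iPaaS product's API.

```python
from collections import defaultdict

handlers = defaultdict(list)  # trigger name -> list of registered actions
sent_to_braze = []  # stand-in for calls to a downstream API


def on(trigger):
    """Decorator that registers a function as an action for a trigger."""

    def register(action):
        handlers[trigger].append(action)
        return action

    return register


@on("salesforce.deal.created")
def push_deal_to_braze(payload):
    # A real workflow would call the destination's API here.
    sent_to_braze.append(
        {"external_id": payload["contact_email"], "deal_stage": payload["stage"]}
    )


def handle_webhook(event):
    """Run every action registered for the incoming event's trigger."""
    actions = handlers.get(event["trigger"], [])
    for action in actions:
        action(event["payload"])
    return len(actions)


actions_run = handle_webhook(
    {
        "trigger": "salesforce.deal.created",
        "payload": {"contact_email": "ada@example.com", "stage": "negotiation"},
    }
)
ignored = handle_webhook({"trigger": "unknown.event", "payload": {}})
```

Note that each action only sees the payload of a single event from a single system, which is why this point-to-point model struggles to produce a complete view of the customer.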
The main drawcard of iPaaS solutions is that they give you the ability to build intuitive workflows. However, these workflows often become an absolute nightmare to build and manage because they can get extremely complex as you add in various if/then clauses and different dependencies. In many cases, you have to write custom code just to get them operational.
Reverse ETL is a better option for data enrichment because it integrates natively with your data warehouse. With Reverse ETL you can leverage all of the existing data models that live in your data warehouse or write SQL to sync that to your end destination.
Reverse ETL Example
You don’t have to pay for another storage layer or build complicated workflows. With Reverse ETL you simply have to define your data and map it to the appropriate columns and fields in your end destination. Reverse ETL fits nicely alongside your existing data stack because it simply runs on top of your data warehouse, without actually storing your data.
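Conceptually, a sync boils down to a SQL model plus a column-to-field mapping. The Python sketch below illustrates that idea with an in-memory SQLite database standing in for the warehouse. The table, the model query, and the destination field names are hypothetical, and this is not how Hightouch itself is implemented.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (an assumption).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE users (email TEXT, plan TEXT, messages_sent INTEGER)")
warehouse.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [("ada@example.com", "enterprise", 42), ("bob@example.com", "free", 3)],
)

# 1. Define your data: a SQL model over existing warehouse tables.
MODEL_SQL = "SELECT email, plan, messages_sent FROM users ORDER BY email"

# 2. Map warehouse columns to destination fields (hypothetical CRM field names).
FIELD_MAPPING = {
    "email": "Email",
    "plan": "Subscription_Type__c",
    "messages_sent": "Messages_Sent__c",
}


def build_payloads(conn):
    """Run the model and rename each column to its destination field."""
    cursor = conn.execute(MODEL_SQL)
    columns = [description[0] for description in cursor.description]
    return [
        {FIELD_MAPPING[column]: value for column, value in zip(columns, row)}
        for row in cursor
    ]


payloads = build_payloads(warehouse)
```

A real sync would then send each payload to the destination's API. That step is omitted here because it depends entirely on the destination.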
Getting Started With Data Enrichment
While you could write a custom script or manually download and upload CSV files to tackle data enrichment, neither of these solutions is scalable. If you want to build a continuous enrichment loop between your data warehouse and your data sources, you’ll want to leverage a tool that runs in parallel with your warehouse.
As a Data Activation platform powered by Reverse ETL, Hightouch is the easiest way to move data from your warehouse to 100+ different destinations (e.g. Salesforce, Hubspot, Marketo, Google Ads, Facebook, Braze, Iterable, etc.)
How Hightouch Works
The first integration with Hightouch is completely free, so you can start enriching your data immediately.