Announcing Identity Resolution
The warehouse-native identity resolution feature that MarTech teams have been looking for.
July 19, 2023
Today, we’re thrilled to announce our new warehouse-native Identity Resolution feature. With the feature, data and technology teams can now stitch together disparate customer data housed in their data warehouse and create unified 360-degree views of their customers and accounts.
Thanks to investments in cloud data warehouses like Snowflake and Databricks, enterprises today have more data at their fingertips than ever before. Yet, unless they have significant data engineering resources, businesses are still struggling to unify this data into usable profiles. The data is often not ready for activation in marketing campaigns or for analytics and attribution.
As the flagship feature of our newly introduced Customer 360 Toolkit product line, Identity Resolution enables organizations to generate customizable identity graphs in their warehouse without code. The feature is a powerful addition to our "Composable" Customer Data Platform (CDP) offering, allowing enterprises to fully activate their customer data for marketing, analytics, and operations.
Diving Into the Problem
Hightouch is a Data Activation platform. That said, when speaking to customers, we often heard about their struggles to get their data in a state where it could be used for activation. Entity and identity resolution was a key blocker – duplicate records, incomplete profiles, and unattributed events all posed significant downstream business risk.
There was resounding demand for software to complement existing in-house data engineering work to clean up the data warehouse and resolve identities more effectively than inflexible Customer Data Platforms (CDPs). To solve this at scale, we needed to address each of the largest hurdles faced by teams today:
- Building for both event streams and offline data. Many CDPs only collect clickstream event data from web and mobile platforms. Essential information from other sources (SaaS tools, internal databases, and 3rd parties) is not truly supported. We felt strongly about filling this gap: rich information such as customer service tickets, campaign memberships, and partner transaction data, which is likely already stored in the data warehouse, should also be incorporated into the customer 360.
- Transparency and traceability. Identity is the foundation for personalization and customer analytics. Because of its importance, teams should be able to control and deeply understand how their identity resolution works. Results should be explainable, reproducible, and configurable. Many enterprises we’ve talked to prioritized business rules over black-box machine learning wherever possible.
- Complementing existing data investments. Much of the work of identity resolution is in cleaning, modeling, and transforming data. Most companies have already invested significant data engineering resources in these processes for analytics use cases. We wanted to complement and build on these investments instead of duplicating them.
We had our work cut out for us. How do we create an intuitive interface to express complex business logic that customers themselves can control? How do we let marketers achieve value out of the box while still scaling up for complex use cases? And most importantly, how do we do this directly within the data warehouse to provide the security and flexibility enterprises need?
Our Approach: Configurability Without Complexity
As a pioneer of the “Composable CDP,” we started with the data warehouse – the ultimate source of truth for event streams, transactions, SaaS data, internal databases, and 3rd party data. By directly querying the data warehouse, Hightouch Identity Resolution builds on a much more complete set of customer data than just web and mobile events.
From there, we drew inspiration from typical data engineering workflows. A data pipeline can achieve high configurability without unnecessary complexity by composing together smaller pieces of business logic. Translation: marketers can get their complex needs satisfied without being blocked by limited data engineering resources.
In Hightouch’s Identity Resolution, many components can be combined to scale up to serve enterprise-grade business requirements. Concepts that can be composed include:
- Business rules – multiple resolution rules can be used, from attribution-esque rules to resolve anonymous events to fuzzy algorithms for offline demographic data.
- Data cleaning – a growing library of preprocessing tools, like aliasing names, standardizing address conventions, or normalizing phone numbers.
- Identity graphs – different existing identity graphs – even from other tools – can be used as inputs for Hightouch Identity Resolution.
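To make the data cleaning idea concrete, here is a minimal Python sketch of two preprocessing steps of the kind listed above: aliasing names and normalizing phone numbers. The alias table, function names, and default country code are all hypothetical and chosen for illustration; they are not Hightouch's actual preprocessing library.

```python
import re

# Hypothetical alias table; a production table would be far larger.
NAME_ALIASES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}


def alias_name(full_name: str) -> str:
    """Lowercase a name and replace known nicknames with a canonical form."""
    parts = full_name.lower().split()
    return " ".join(NAME_ALIASES.get(part, part) for part in parts)


def normalize_phone(raw: str, default_country_code: str = "1") -> str:
    """Strip punctuation and return the digits with a leading '+'.

    Assumption: a 10-digit input is a national number that is missing
    its country code, so the default code is prepended.
    """
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 10:
        digits = default_country_code + digits
    return "+" + digits
```

With cleaning like this applied first, "Bob Smith" and "Robert Smith" compare as equal, and differently formatted copies of the same phone number collapse to one value.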
Under the hood, Hightouch Identity Resolution generates custom SQL for this identity pipeline. The generated code includes optimizations such as recursion, incrementality, and smart partitioning. We handle the nitty-gritty work so that your data engineering team doesn’t have to spend months on it.
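One reason this kind of pipeline is hard to write by hand is that matches are transitive: if record A matches B and B matches C, all three belong to one profile. The Python sketch below illustrates that grouping step with a union-find structure. It is an illustration of the general technique only; the generated pipeline is SQL, and nothing here should be read as Hightouch's actual implementation. The function name and input shape are assumptions.

```python
def resolve_components(pairs):
    """Group record IDs into profiles from a list of matched pairs.

    Returns a dict mapping every record ID to the ID of its profile root.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller ID as the root so results are reproducible.
            lo, hi = sorted((root_a, root_b))
            parent[hi] = lo

    return {record: find(record) for record in list(parent)}
```

For example, the pairs `("a", "b")` and `("b", "c")` place `a`, `b`, and `c` in the same profile even though `a` and `c` never matched directly.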
Ultimately, each marketer’s identity resolution use case is unique. The needs differ even between teams in the same company, such as paid media, lifecycle marketing, and customer operations. Hightouch’s approach results in an identity resolution process that can be quickly configured and customized without code.
How Hightouch Identity Resolution Works
The new Identity Resolution feature enables organizations to define logic to create unified customer profiles and associate events with those profiles using a powerful no-code UI. Hightouch writes an identity graph directly into a table in the organization’s data warehouse, which is then available to power downstream use cases. Organizations can set recurring schedules for when Hightouch updates their identity resolution models and monitor performance and outcomes at a glance.
Data Onboarding: Organizations can access different profile and event tables directly from their data warehouse in Hightouch and define key identifiers from within each table. Hightouch also enables teams to define custom models in their data warehouse with SQL or dbt. Whatever data exists in the warehouse is available for identity resolution.
Similarly, profiles are not limited to individual customer identity records. The Customer 360 Toolkit also enables entity resolution. Companies can use this product to unify profiles for any other entity record type, such as households or accounts.
Merge records when: logic to describe when two records may belong together. For example, users can merge two records when email is an exact match OR (full name is an exact match after aliasing AND zip code is a fuzzy match).
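The example rule above can be sketched as a simple predicate. This is a hedged illustration in Python, not the no-code configuration itself: the `Record` type, the assumption that names are already aliased, and the particular definition of a "fuzzy" zip code match (same length, at most one differing character) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    email: str
    full_name: str  # assumed to be already aliased and lowercased
    zip_code: str


def zip_fuzzy_match(a: str, b: str, max_diff: int = 1) -> bool:
    """Assumed fuzzy rule: same length, at most max_diff differing characters."""
    if not a or len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) <= max_diff


def should_merge(r1: Record, r2: Record) -> bool:
    """email exact match OR (full name exact match AND zip code fuzzy match)."""
    email_match = bool(r1.email) and r1.email == r2.email
    name_match = bool(r1.full_name) and r1.full_name == r2.full_name
    return email_match or (name_match and zip_fuzzy_match(r1.zip_code, r2.zip_code))
```

Empty emails and names are deliberately treated as non-matching so that two records with missing values are not merged by accident.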
Pause merging if: rules to pause merging if a single profile exceeds predefined limits. This prevents erroneous merges from occurring. For example, if a profile has more than ten emails, then pause the proposed merge.
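A guard like the ten-email example might look like the following Python sketch. The function name, the set-based inputs, and the choice to count emails after the proposed merge are assumptions made for illustration.

```python
MAX_EMAILS_PER_PROFILE = 10  # limit taken from the example above


def should_pause_merge(profile_emails, incoming_emails,
                       limit=MAX_EMAILS_PER_PROFILE):
    """Return True if the merged profile would exceed the email limit."""
    merged = set(profile_emails) | set(incoming_emails)
    return len(merged) > limit
```

Because the check uses a set union, an incoming record that only repeats emails already on the profile does not trigger a pause.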
Resolve conflicts by: methods for resolving conflicts if a merge is paused. For example, we can default to the earliest created profile or retry the merge with a strict priority of identifiers. We support a growing library of modules – by stacking different ways to resolve conflicts, users can express powerful custom logic in a way that’s easy to understand.
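The idea of stacking conflict resolvers can be sketched as a list of functions tried in order until one produces a winner. The Python below is a hypothetical illustration: the profile dictionaries, the field names, and the interpretation of "strict priority of identifiers" (the first identifier held by exactly one candidate decides) are assumptions, not the product's documented behavior.

```python
from datetime import date


def by_identifier_priority(priority):
    """Build a resolver that checks identifiers in strict priority order."""
    def resolver(profiles):
        for key in priority:
            holders = [p for p in profiles if p.get(key)]
            if len(holders) == 1:  # exactly one candidate has this identifier
                return holders[0]
        return None  # undecided; let the next resolver try
    return resolver


def earliest_created(profiles):
    """Fallback resolver: keep the profile that was created first."""
    return min(profiles, key=lambda p: p["created_at"])


def resolve_conflict(profiles, resolvers):
    """Try each resolver in order and return the first winner found."""
    for resolver in resolvers:
        winner = resolver(profiles)
        if winner is not None:
            return winner
    return None
```

Ordering the list is what expresses the business logic: placing `earliest_created` last makes it the default when no higher-priority rule can decide.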
Output: Finally, Hightouch writes the results of this unification – an identity graph – directly to a table within the data warehouse. Companies can leverage these lookup tables directly in Hightouch or for analytics, measurement, or other operational use cases. Since these tables are in the warehouse, users have full access and control to determine how to use the output of Identity Resolution.
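As a rough illustration of how such a lookup table might be used downstream, the Python sketch below attributes events to resolved profiles. The shape of the identity graph (a mapping from source record ID to profile ID) and the `source_id` field on events are assumptions; the actual output schema is not described here.

```python
from collections import Counter


def events_per_profile(identity_graph, events):
    """Count events per resolved profile using an identity lookup table.

    identity_graph: dict mapping a source record ID to a profile ID.
    events: iterable of dicts, each with a 'source_id' key.
    Events whose source ID is not in the graph are skipped.
    """
    counts = Counter()
    for event in events:
        profile_id = identity_graph.get(event["source_id"])
        if profile_id is not None:
            counts[profile_id] += 1
    return dict(counts)
```

In a warehouse, the same operation would typically be a join between the events table and the identity graph table followed by an aggregation.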
Interested in solving identity resolution, building customer profiles in your data warehouse, and activating your data to 200+ downstream destinations? Book a demo today to speak to our solutions engineers. We’re here to help you make the most of your data warehouse and put your customer data to work.