Identity resolution

Identity resolution is only available on Business tier plans.

Your customers engage with your business at multiple touchpoints: mobile browsing, online purchases, in-person purchases, and subscription sign-ups, to name a few. Each step can associate multiple identifiers with a single customer. These include their email, phone number, device IDs, anonymous IDs, user IDs, account IDs, etc. Each touchpoint can also attach multiple events to the user: Page view, Support ticket creation, etc. These disordered data points result in fragmented, messy views of your customers.

Disordered data is problematic for a few reasons:

Incomplete profiles make it hard to target ad spend accurately.
Duplicate data can skew your analytics so that you believe you have more customers with a particular property (for example, geographic location) or propensity (for example, for a particular product type) than you actually do.
Unattributed events make it challenging to create accurate audiences and correctly correlate events to your attribution model.

Hightouch's identity resolution capabilities help you connect and combine customer identifiers and events into a unified profile of each customer, so you can have a more accurate view of your customers.

Hightouch's identity resolution is warehouse-centric. You send all your cross-platform data to your warehouse, and Hightouch stitches it into one clear and complete customer view called an identity graph. The identity graph is a table that's computed and stored in your own data warehouse, making it more scalable and secure than computing and storing it with a third-party provider.

With identity resolution, you can avoid sending an irrelevant retargeting campaign for a product the user eventually purchased and focus your marketing spend on ads for the next relevant purchase or subscription. The lookup table identity resolution creates can also help you optimize investments made in third-party data.

Use case

Let's walk through a hypothetical e-commerce customer journey to understand some example identifiers and events.

User actions	Identifiers	Events
The user clicks on an ad for your e-commerce store while scrolling through Facebook on their mobile device.	Their `fbclid`, their mobile device's MAID, `device_id_1`	`Page view`
The user anonymously browses the store's catalog on their mobile device.	`anonymous_id_1`, `device_id_1`	`Page view`s for various product detail pages
On one product detail page, the user adds an item to their cart, but then drops off for some reason.	`anonymous_id_1`, `device_id_1`	`Add to cart`
Later, the user returns anonymously to the site on their laptop and adds the same item to their cart.	`anonymous_id_2`, `device_id_2`	`Add to cart`
This time, the user completes checkout and, in the process, provides email, address, and phone number, but doesn't create an account.	email, address, phone number	`Check out`
After some time, the user files a support ticket about their purchase from their mobile device.	email, `device_id_1`, Zendesk User ID	`Support ticket creation`
Later, the user makes another purchase, but this time decides to create an account.	email, Account ID	`Add to cart`, `Check out`, `Account creation`

Throughout the journey, the user has multiple anonymous IDs, user IDs, device IDs, and events associated with them. Each event also has essential information, including timestamps and other details, such as information about the products they're interested in.

All these data points need to be reconciled with the user's PII once the user completes an order and creates an account.

Identity resolution helps merge all these into a unified profile by writing an identity graph that maps them all together.
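To make the stitching concrete, here is a minimal sketch that groups rows into profiles whenever they share an identifier value. It uses a plain union-find for illustration and is not a description of Hightouch's actual algorithm. The row layout, the sample values, and the `stitch` helper are all hypothetical, and the sketch assumes the checkout event also carries the session's anonymous ID.

```python
from collections import defaultdict

# Hypothetical rows loosely based on the journey above. Each row holds the
# identifiers observed on one event, keyed by identifier type.
ROWS = [
    {"fbclid": "fb_1", "device_id": "device_id_1"},
    {"anonymous_id": "anonymous_id_1", "device_id": "device_id_1"},
    {"anonymous_id": "anonymous_id_2", "device_id": "device_id_2"},
    # Assumption: the checkout event also carries the session's anonymous ID.
    {"anonymous_id": "anonymous_id_2", "email": "user@example.com"},
    {"email": "user@example.com", "device_id": "device_id_1", "zendesk_id": "z_1"},
    {"email": "user@example.com", "account_id": "acct_1"},
    # An unrelated visitor, to show that separate people stay separate.
    {"anonymous_id": "anonymous_id_9", "device_id": "device_id_9"},
]


def stitch(rows):
    """Return lists of row indexes; rows sharing any identifier value are grouped."""
    parent = list(range(len(rows)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # (identifier type, value) -> first row index carrying it
    for i, row in enumerate(rows):
        for key in row.items():
            if key in first_seen:
                a, b = find(i), find(first_seen[key])
                parent[max(a, b)] = min(a, b)  # keep the lowest index as root
            else:
                first_seen[key] = i

    profiles = defaultdict(list)
    for i in range(len(rows)):
        profiles[find(i)].append(i)
    return sorted(profiles.values())


profiles = stitch(ROWS)
print(profiles)
```

In this toy data set, the six rows from the journey end up in one profile and the unrelated visitor stays in a profile of their own.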

Configuration overview

You can create multiple identity graphs within Hightouch.

This allows you to test new resolution models with different rules and create graphs for multiple entity types beyond identities, such as households or business accounts.

Identity resolution projects overview page in Hightouch

When creating a new graph, you begin by selecting a source to pull data from.

Note that the source needs to have Lightning sync engine enabled. See this section for more detail.

Then, you enter the schema and table names you want to write the identity graph to.

You can configure specific resolution rules differently in each project. There are two main phases to configure these rules:

Model configuration
Identifier rules

Model configuration

Hightouch uses columns from your models to search for matching records across data sets. The process of connecting different columns in different models to a central identity is called identifier selection.

Since different columns may refer to the same identifier but have different names (for example, anon_id vs. anonymous_id), you'll perform identifier selection to explicitly define each column's identifier.
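As a rough illustration of why this mapping matters, the sketch below renames model columns to shared identifier types so that values from different models become comparable. The model names, column names, and the `normalize` helper are hypothetical; in Hightouch you make these selections in the UI rather than in code.

```python
# Hypothetical identifier selection: for each model, map a column name to the
# identifier type it contains. Columns that aren't listed are not identifiers.
IDENTIFIER_SELECTION = {
    "web_events": {"anon_id": "anonymous_id", "device": "device_id"},
    "orders": {"anonymous_id": "anonymous_id", "customer_email": "email"},
}


def normalize(model, row):
    """Keep only the selected columns, renamed to their identifier type."""
    mapping = IDENTIFIER_SELECTION[model]
    return {
        mapping[column]: value
        for column, value in row.items()
        if column in mapping and value is not None
    }


web = normalize("web_events", {"anon_id": "a1", "device": "d1", "url": "/p/42"})
order = normalize(
    "orders",
    {"anonymous_id": "a1", "customer_email": "user@example.com", "total": 30},
)

# Once both rows use the same identifier types, a shared value is easy to spot.
shared = set(web.items()) & set(order.items())
print(shared)
```

Without the mapping, `anon_id` and `anonymous_id` would look like unrelated columns and the two rows would never be compared on that value.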

To configure your identifiers, you need to:

Select which models Hightouch should reference and whether they contain user objects or events
- Incremental identity resolution (IDR) runs on event models only match new events since the last run, but each of those new events gets compared to all historical events.
Select which model columns to use and what types of identifiers they contain

identifier selection in the Hightouch UI

Check out the model configuration page for implementation details.

Identifier rules

Hightouch lets you control how different rows should be stitched together based on identifier rules.

There are two types of rules you need to create:

Merge rules tell Hightouch which rows should belong to the same profile.
Limit rules specify how many unique identifier values are allowed per profile. What happens when the maximum is exceeded depends on the priority of the limited identifier: for a higher-priority identifier, the rows or profiles aren't merged; for a lower-priority identifier, the extra identifier values are discarded.
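The interaction between the two rule types can be sketched as follows. This is a simplified illustration that covers only the case where a limit blocks a merge; it doesn't model identifier priority or the discarding of extra values. The rule values and the `can_merge` helper are hypothetical and are not Hightouch's implementation.

```python
# Hypothetical rules: merge profiles that share an email or a phone number,
# but allow at most one unique email per profile.
MERGE_ON = ("email", "phone")
LIMITS = {"email": 1}


def can_merge(a, b):
    """a and b are profiles: dicts of identifier type -> set of values."""
    # Merge rule: the profiles must share a value for at least one identifier.
    shares_value = any(a.get(k, set()) & b.get(k, set()) for k in MERGE_ON)
    if not shares_value:
        return False
    # Limit rule: the merged profile must stay within every configured limit.
    for identifier, limit in LIMITS.items():
        combined = a.get(identifier, set()) | b.get(identifier, set())
        if len(combined) > limit:
            return False
    return True


p1 = {"email": {"a@example.com"}, "phone": {"555-0100"}}
p2 = {"phone": {"555-0100"}}
p3 = {"email": {"b@example.com"}, "phone": {"555-0100"}}

print(can_merge(p1, p2))  # shared phone, one email in total
print(can_merge(p1, p3))  # shared phone, but two different emails
```

Here a shared phone number is enough to merge `p1` and `p2`, while the email limit keeps `p1` and `p3` apart even though they share the same phone number.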

Check out the identifier rules page for implementation details.

Running your graph

After creating your graph, you can run it in two ways:

manually, via the Hightouch UI
via the API

Using your identity graph

Whenever an identity resolution project updates, it writes to one or more output tables that you specified within your data warehouse. You can run SQL queries on these tables to generate parent or related models to use in Customer Studio. To learn more, check out the usage page.
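As a sketch of what such a query might look like, the example below builds one row per profile from a small in-memory SQLite table. The table name `resolved_identifiers_example` and its columns are hypothetical stand-ins; the real output tables live in your warehouse and their layout may differ, so treat this only as an illustration of grouping by `ht_id`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical output layout: one row per identifier value per profile.
conn.execute(
    "CREATE TABLE resolved_identifiers_example ("
    "ht_id INTEGER, identifier_type TEXT, identifier_value TEXT)"
)
conn.executemany(
    "INSERT INTO resolved_identifiers_example VALUES (?, ?, ?)",
    [
        (1, "email", "user@example.com"),
        (1, "anonymous_id", "a1"),
        (1, "anonymous_id", "a2"),
        (7, "anonymous_id", "a9"),
    ],
)

# One row per profile: an email if one is known, plus a count of anonymous IDs.
PARENT_MODEL_SQL = """
SELECT
    ht_id,
    MAX(CASE WHEN identifier_type = 'email' THEN identifier_value END) AS email,
    SUM(CASE WHEN identifier_type = 'anonymous_id' THEN 1 ELSE 0 END)
        AS anonymous_id_count
FROM resolved_identifiers_example
GROUP BY ht_id
ORDER BY ht_id
"""

parent_rows = conn.execute(PARENT_MODEL_SQL).fetchall()
print(parent_rows)
```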

FAQs

Can I create an identity graph from multiple data sources?

No, each identity graph can only pull from one data source at a time.

What is the "Hightouch ID" (`ht_id`)?

The Hightouch ID is a unique ID generated for each identity during identity resolution. For example, a user and all the events they performed are assigned the same Hightouch ID. If multiple rows from your models are assigned to the same identity, they all have the same Hightouch ID.

The `ht_id` has no inherent meaning in relation to the match rules or identifier values associated with the identity. Technically, we generate these by assigning a unique, auto-incrementing number to each input row that goes into identity resolution. The lowest numbered record for each identity (a cluster of records) is used as the `ht_id` for that identity. This way, when new records are added to the identity, the `ht_id` does not change.
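The rule described above can be sketched in a few lines. The `assign_ht_ids` helper and the row numbers are hypothetical; the sketch only shows why taking the lowest row number in each cluster keeps the ID stable when later rows join an existing identity.

```python
def assign_ht_ids(clusters):
    """Map each row number to the lowest row number in its cluster.

    clusters: iterable of sets of row numbers, one set per identity.
    """
    ht_ids = {}
    for cluster in clusters:
        ht_id = min(cluster)
        for row_number in cluster:
            ht_ids[row_number] = ht_id
    return ht_ids


# First run: rows 1 to 5 form two identities.
first_run = assign_ht_ids([{1, 2, 5}, {3, 4}])

# Later run: rows 6 and 7 arrive with higher numbers and join those identities.
later_run = assign_ht_ids([{1, 2, 5, 6}, {3, 4, 7}])

print(first_run)
print(later_run)
```

Because new rows always receive higher numbers than existing ones, they can't become the lowest record in a cluster, so the rows from the first run keep their original `ht_id`.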

Is the Hightouch ID consistent every time the graph runs?

The Hightouch ID is consistent between incremental runs except in cases where two or more existing profiles get matched due to new data showing that they share identifier values. In that case, only one of those Hightouch IDs will remain and the others will collapse onto it.

Between full re-runs, Hightouch IDs are not guaranteed to be consistent because full re-runs reprocess all data from an empty state.
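The collapsing case on incremental runs can be illustrated with a small sketch. The `collapse` helper and the sample data are hypothetical. The sketch assumes the surviving ID is the lowest one, which is an inference from the lowest-numbered-record rule described for `ht_id` rather than something stated for this case.

```python
def collapse(ht_ids, bridged):
    """Re-point every row whose ht_id is in `bridged` to the surviving ht_id.

    ht_ids: {row number: ht_id} from the previous run.
    bridged: set of ht_ids that new data has shown to be the same identity.
    Assumption: the lowest ht_id in `bridged` is the one that survives.
    """
    survivor = min(bridged)
    return {
        row: (survivor if ht_id in bridged else ht_id)
        for row, ht_id in ht_ids.items()
    }


# Previous run: three profiles with ht_ids 1, 3, and 5.
previous = {1: 1, 2: 1, 3: 3, 4: 3, 5: 5}

# New row 6 shares identifier values with profiles 1 and 3, bridging them.
updated = collapse(previous, {1, 3})
updated[6] = min({1, 3})

print(updated)
```

In this sketch, rows that previously carried `ht_id` 3 now carry 1, so anything downstream that stored the old value would need to be remapped, while the unrelated profile 5 is untouched.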

What permissions does identity resolution require?

Hightouch performs identity resolution in your warehouse, meaning your data never leaves your infrastructure. To do this, Hightouch requires permission to create tables in your warehouse. Identity resolution uses the same schema as the Lightning sync engine, so follow those instructions to ensure Hightouch has permissions to access the `hightouch_planner` schema.

How long does the identity resolution process take?

How long identity resolution takes depends on how much data you have, how complex your rules are, and how much compute is available in your warehouse. It's difficult to estimate, but incremental runs are much faster than the first run or a full re-run.

This is because Hightouch doesn't recompute the entire graph on incremental runs. Instead, we only recompute newly added data, which makes incremental runs faster and more efficient.

If you wish to re-run the entire graph, you may do so by clicking "Rerun from start."

What happens if I delete a row from an input model?

Identity graph runs do not detect when rows are deleted. Runs pull in new or updated rows and match them against profiles in the identity graph, so data from historical runs that was incorporated into the graph remains in the graph on normal/incremental runs.

To remove data from the graph, such as a deleted row, you must manually rerun the identity graph from start.
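A toy model makes the difference between the two run types easier to see. The `incremental_run` and `full_rerun` helpers are hypothetical simplifications that treat the graph as a dictionary of rows; they are not how Hightouch stores or computes the graph.

```python
def incremental_run(graph, input_rows):
    """Add new or updated rows; rows missing from the input are left in place."""
    updated = dict(graph)
    updated.update(input_rows)
    return updated


def full_rerun(input_rows):
    """Rebuild the graph from an empty state using only the current input."""
    return dict(input_rows)


inputs = {"r1": "a@example.com", "r2": "b@example.com"}
graph = incremental_run({}, inputs)

# Row r2 is deleted from the input model and a new row r3 appears.
del inputs["r2"]
inputs["r3"] = "c@example.com"

after_incremental = incremental_run(graph, inputs)
after_full = full_rerun(inputs)

print(sorted(after_incremental))  # r2 is still present
print(sorted(after_full))         # r2 is gone
```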

If you make any changes to your merge rules, limit rules or input models, Hightouch automatically performs a full re-run of your graph.