Array expansion

Overview

Array expansion lets you take a column that contains an array of values and expand each element into its own row during a sync. This is useful when your source data stores multiple identifiers or values in a single array column, but your destination expects one record per value.

For example, if a user has multiple device IDs stored in an array column, array expansion creates a separate sync row for each device ID, allowing you to sync each one individually to your destination.

Array expansion is currently supported with the lightning sync engine on Postgres, Snowflake, BigQuery, and Databricks sources.

How it works

When you map an array-type column as the identifier in your sync configuration, Hightouch automatically enables array expansion. Each element of the array becomes a separate row, and a generated column holding that element's value is used as the identifier sent to the destination.

Example

Consider this source table with an array column device_ids:

user_id  name     device_ids
1        Alice    ["ios_a1", "android_b2"]
2        Bob      ["ios_c3"]
3        Charlie  []
4        Dana     ["web_d4", "web_d4"]

When you map device_ids as the identifier for your sync, Hightouch expands each row so that every array element becomes its own record:

_ht_device_ids_element  user_id  name
ios_a1                  1        Alice
android_b2              1        Alice
ios_c3                  2        Bob
web_d4                  4        Dana

The generated column _ht_device_ids_element contains the individual element value and is used as the identifier sent to the destination.
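
The expansion can be approximated locally to reason about what a sync will produce. The following is a minimal Python sketch, not Hightouch's implementation (the real expansion runs in the warehouse). The function name expand_rows is hypothetical, and dropping the original array column from the output is an assumption made to mirror the table above. Duplicate elements are not collapsed at this stage, so Dana's value appears twice in the sketch's output.

```python
from typing import Any


def expand_rows(rows: list[dict[str, Any]], column: str) -> list[dict[str, Any]]:
    """Return one output row per array element (hypothetical helper)."""
    element_column = f"_ht_{column}_element"
    expanded = []
    for row in rows:
        # None (NULL) and empty arrays contribute no output rows.
        for element in row.get(column) or []:
            out = {element_column: element}
            # Assumption: carry over every other column, drop the array itself.
            out.update({k: v for k, v in row.items() if k != column})
            expanded.append(out)
    return expanded


source = [
    {"user_id": 1, "name": "Alice", "device_ids": ["ios_a1", "android_b2"]},
    {"user_id": 2, "name": "Bob", "device_ids": ["ios_c3"]},
    {"user_id": 3, "name": "Charlie", "device_ids": []},
    {"user_id": 4, "name": "Dana", "device_ids": ["web_d4", "web_d4"]},
]

expanded = expand_rows(source, "device_ids")
for row in expanded:
    print(row)
```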

Empty arrays

Rows where the array column is empty ([]) or NULL produce zero output rows. In the example above, Charlie's row is omitted entirely because device_ids is an empty array.

Duplicate elements

If an array contains duplicate values (like Dana's two "web_d4" entries), the expanded rows share the same identifier, which causes a primary key collision. The sync completes with a "Report" status, and the duplicate rows are flagged as rejected. Only one row per unique element value is synced.
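
To anticipate which rows would be rejected before running a sync, you could check the expanded identifier values for repeats. This is a rough Python sketch under two assumptions that this page does not confirm: that the first occurrence of a value is the one that syncs, and that a helper like split_duplicates (a hypothetical name) is applied to the full list of expanded identifiers.

```python
def split_duplicates(identifiers: list[str]) -> tuple[list[str], list[str]]:
    """Split identifiers into first occurrences and later repeats."""
    seen: set[str] = set()
    accepted: list[str] = []
    rejected: list[str] = []
    for value in identifiers:
        if value in seen:
            # Assumption: any repeat after the first occurrence is rejected.
            rejected.append(value)
        else:
            seen.add(value)
            accepted.append(value)
    return accepted, rejected


# Identifier values produced by expanding the example table.
accepted, rejected = split_duplicates(
    ["ios_a1", "android_b2", "ios_c3", "web_d4", "web_d4"]
)
print("synced:", accepted)
print("rejected:", rejected)
```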

Configuration

Array expansion is configured automatically when you map an array-type column as the identifier in the "How should rows be matched?" section of your sync configuration.

When an array column is selected:

  1. Hightouch sets the arrayExpansionSourceColumn in your sync config
  2. The identifier mapping is automatically translated to use the generated element column (_ht_{column}_element)
  3. An info banner confirms that array expansion is active

To disable array expansion, change the identifier mapping to a non-array column.
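
The naming pattern for the generated column can be illustrated with a short Python sketch. Only the arrayExpansionSourceColumn key and the _ht_{column}_element pattern come from this page. The dictionary shape, the identifierColumn key, and both function names are hypothetical and are not the actual sync config schema.

```python
def element_column_name(source_column: str) -> str:
    """Build the generated element column name from the array column name."""
    return f"_ht_{source_column}_element"


def apply_array_expansion(sync_config: dict, source_column: str) -> dict:
    """Return a copy of an illustrative config with array expansion applied."""
    updated = dict(sync_config)
    # Key named on this page.
    updated["arrayExpansionSourceColumn"] = source_column
    # Hypothetical key standing in for the identifier mapping.
    updated["identifierColumn"] = element_column_name(source_column)
    return updated


config = apply_array_expansion({"identifierColumn": "device_ids"}, "device_ids")
print(config)
```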

Supported column types

Array expansion works with columns of the following types:

  • Native arrays: Postgres text[], Snowflake ARRAY, BigQuery ARRAY<STRING>, Databricks ARRAY<STRING>
  • JSON arrays: Postgres jsonb, Snowflake VARIANT, BigQuery JSON
  • JSON string arrays: VARCHAR columns containing JSON-encoded arrays like '["a","b"]'

Hightouch automatically detects the column type and applies the appropriate expansion strategy.
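
When previewing source data outside the warehouse, the three shapes above can be normalized into a plain list before expansion. The sketch below is a Python approximation, not how Hightouch detects column types. The function name to_elements is hypothetical, and it assumes native and JSON arrays arrive from your database driver as Python lists while JSON string arrays arrive as str.

```python
import json
from typing import Any


def to_elements(value: Any) -> list[Any]:
    """Normalize a native array, JSON array, or JSON-encoded string to a list."""
    if value is None:
        # NULL behaves like an empty array: no elements.
        return []
    if isinstance(value, (list, tuple)):
        # Assumption: the driver already decoded the array.
        return list(value)
    if isinstance(value, str):
        # JSON string array such as '["a","b"]'.
        parsed = json.loads(value)
        if not isinstance(parsed, list):
            raise ValueError("expected a JSON-encoded array")
        return parsed
    raise TypeError(f"unsupported array value type: {type(value).__name__}")


print(to_elements(["ios_a1", "android_b2"]))
print(to_elements('["a","b"]'))
print(to_elements(None))
```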

Limitations

  • Only one array column can be expanded per sync
  • Array expansion requires the lightning sync engine (in-warehouse planner)
  • Only supported on Postgres, Snowflake, BigQuery, and Databricks sources
  • Duplicate values within an array cause row-level rejections
  • Array expansion is not available on syncs that use Match Booster, as Match Booster applies its own identity resolution logic that is incompatible with row expansion

Last updated: Mar 23, 2026
