Lightning sync engine

The Lightning sync engine is only available on Business Tier plans.

The Lightning sync engine was previously known as warehouse planning. Both terms describe the same feature.

Overview

Hightouch identifies the incremental changes in your data model so that syncs send only the necessary updates to your destinations. For every sync, Hightouch computes which rows to send by pulling all rows within your data model and comparing them with the rows from the last sync. This process is called change data capture (CDC) or diffing. You can find more details in the core concepts.
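As a rough illustration of diffing, the sketch below compares two snapshots of a model keyed by primary key and reports which keys were added, changed, or removed. It is a minimal sketch and not Hightouch's actual implementation; the names diff_rows, Diff, last_sync, and this_query are hypothetical.

```python
from typing import Dict, Hashable, List, NamedTuple


class Diff(NamedTuple):
    added: List[Hashable]
    changed: List[Hashable]
    removed: List[Hashable]


def diff_rows(previous: Dict[Hashable, dict], current: Dict[Hashable, dict]) -> Diff:
    """Compare two snapshots of a model, each keyed by primary key."""
    # Keys present now but absent from the last sync are new rows.
    added = [pk for pk in current if pk not in previous]
    # Keys present in both snapshots with different column values are changed rows.
    changed = [pk for pk in current if pk in previous and current[pk] != previous[pk]]
    # Keys present in the last sync but absent now are removed rows.
    removed = [pk for pk in previous if pk not in current]
    return Diff(added, changed, removed)


# Hypothetical snapshots: the rows sent by the last sync and the rows queried now.
last_sync = {1: {"email": "a@example.com"}, 2: {"email": "b@example.com"}}
this_query = {2: {"email": "b@new.example.com"}, 3: {"email": "c@example.com"}}

result = diff_rows(last_sync, this_query)
print(result)
```

Only the keys in the resulting lists need to be sent to the destination; unchanged rows are skipped.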

By default, the change data capture computation and file storage happen on Hightouch-managed infrastructure. This can be slow for large models that take a long time to query.

Standard sync architecture diagram

Lightning sync engine architecture diagram

With the Lightning sync engine, Hightouch computes change data capture directly in your warehouse. It stores previously synced data in a special schema managed by Hightouch. Customers with extensive datasets that require high-performance syncs typically use the Lightning sync engine.
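To make the idea of in-warehouse diffing concrete, the sketch below runs the comparison as a single SQL query, so that only the changed keys leave the database. It uses an in-memory SQLite database as a stand-in for a warehouse. The table names model_current and last_synced and the query itself are assumptions for illustration; they don't reflect the schemas or queries Hightouch actually uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: the current model results and the rows stored after the last sync.
    CREATE TABLE model_current (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE last_synced (id INTEGER PRIMARY KEY, email TEXT);
    INSERT INTO last_synced VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO model_current VALUES (2, 'b@new.example.com'), (3, 'c@example.com');
""")

# The diff runs inside the database. "IS NOT" is SQLite's null-safe inequality;
# many warehouses spell this IS DISTINCT FROM.
DIFF_SQL = """
    SELECT c.id, 'added' AS change_type
    FROM model_current c LEFT JOIN last_synced p ON p.id = c.id
    WHERE p.id IS NULL
    UNION ALL
    SELECT c.id, 'changed'
    FROM model_current c JOIN last_synced p ON p.id = c.id
    WHERE c.email IS NOT p.email
    UNION ALL
    SELECT p.id, 'removed'
    FROM last_synced p LEFT JOIN model_current c ON c.id = p.id
    WHERE c.id IS NULL
    ORDER BY 1
"""

changes = conn.execute(DIFF_SQL).fetchall()
print(changes)
```

Because the join happens where the data lives, the full model doesn't need to be pulled out of the warehouse for comparison.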

Hightouch recommends using the Lightning sync engine for supported sources when syncing more than one million rows of data.

Engine comparison

While the Lightning engine is more performant, its setup is more involved than that of the default standard sync engine because it requires granting Hightouch write access to your data warehouse.

Feature | Standard sync engine | Lightning sync engine
Location of change data capture | Hightouch infrastructure | Warehouse schemas managed by Hightouch
Performance | Slower | Quicker
Ease of setup | Simpler | More involved
Required warehouse permissions | Read-only | Read and write
Ability to switch | You can move to the Lightning engine at any time | You can't move to the standard engine once Lightning is configured

Supported sources

The Lightning sync engine works with the following sources:

  • Snowflake
  • Google BigQuery
  • Amazon Redshift
  • PostgreSQL
  • Databricks

Enable the Lightning sync engine

When first setting up one of the supported sources, Hightouch prompts you to Choose your sync engine.

Choosing your sync engine in the Hightouch UI

You can set up the Lightning sync engine at this point or continue with the standard sync engine and upgrade to the Lightning engine later. Once you've configured the Lightning engine, you can't switch back to the standard engine.

In addition to enabling the Lightning sync engine in the Hightouch UI, you need to:

  1. Grant Hightouch write permission to your data warehouse.
  2. Create the warehouse schemas for Hightouch to use.

Refer to the source-specific in-app instructions on how to complete these steps.

Migrate from standard sync engine

You can migrate to the Lightning sync engine from the Configuration tab of any supported source that you've already set up.

  1. From the source's Configuration tab, select Lightning sync engine under Choose your sync engine.
  2. (Optional) Enter the override location of the Lightning sync engine cache. You only need to do this if you require the Lightning sync engine to store previously synced records in a different location than the database or project used to run queries. The vast majority of users don't require this.
  3. Follow the source-specific in-app instructions to grant Hightouch write permissions to your warehouse and create the schemas for Hightouch to use.
  4. Click Save changes.
  5. Test your connection to ensure setup is correct.

If you enable the Lightning sync engine on an existing source, Hightouch migrates any syncs on the standard engine to the Lightning sync engine on their next run. Hightouch avoids resyncing previously synced data by writing the sync history into the warehouse.

Migration tips

Follow these tips when performing the migration:

  • The Lightning sync engine requires your primary key to be unique for every model using the source. This is a general requirement for syncs to work well, but it's strictly enforced in the Lightning sync engine. If you're unsure if your primary key is unique, check before migrating.
  • If you have a large number of syncs to migrate, please contact us so that we can help you migrate incrementally.
  • If you hit any errors, you can often fix them with a full resync. This works if your sync is idempotent and there aren't any removals to execute.
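One way to check whether a primary key is unique before migrating is to group by the key and look for values that appear more than once. The sketch below runs such a query against an in-memory SQLite database standing in for a warehouse. The helper find_duplicate_keys, the table users_model, and the column user_id are hypothetical; substitute your own model and primary key.

```python
import sqlite3


def find_duplicate_keys(conn, table, primary_key):
    """Return (key, occurrences) pairs for primary key values that appear more than once."""
    # Identifiers are interpolated directly, so pass only trusted table and column names.
    query = f"""
        SELECT {primary_key}, COUNT(*) AS occurrences
        FROM {table}
        GROUP BY {primary_key}
        HAVING COUNT(*) > 1
        ORDER BY {primary_key}
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical model output where user_id 2 appears twice.
    CREATE TABLE users_model (user_id INTEGER, email TEXT);
    INSERT INTO users_model VALUES
        (1, 'a@example.com'),
        (2, 'b@example.com'),
        (2, 'b2@example.com');
""")

duplicates = find_duplicate_keys(conn, "users_model", "user_id")
print(duplicates)
```

An empty result means every primary key value is unique; any returned rows identify the keys to deduplicate first.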

Common errors

Primary key is not unique

The Lightning sync engine requires that every row in your model has a unique value for its primary key. This is because Hightouch uses the primary key to identify each row and to track its state, for example, whether it failed in a previous sync.

Models with duplicate primary keys can be deduplicated in SQL with the ROW_NUMBER() function.

For example:

WITH your_data_model AS (
  -- copy/paste your model here
),
your_data_model_with_rank AS (
  SELECT
    *,
    -- number the rows that share a primary key value, in arbitrary order
    ROW_NUMBER() OVER (PARTITION BY your_primary_key) AS row_num
  FROM your_data_model
)
-- keep one row per primary key value
SELECT * FROM your_data_model_with_rank WHERE row_num = 1

The preceding example de-duplicates your model by partitioning on the primary key. It chooses an arbitrary record from your duplicates. You can use ORDER BY or WHERE to filter and select which data from the duplicates you want to keep.
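If you want the choice among duplicates to be deterministic, you can add an ORDER BY inside the window. The sketch below keeps the most recently updated row per key, using an in-memory SQLite database (3.25 or later, for window functions) as a stand-in for a warehouse. The table users_model and the columns user_id and updated_at are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical model output where user_id 2 has an older and a newer row.
    CREATE TABLE users_model (user_id INTEGER, email TEXT, updated_at TEXT);
    INSERT INTO users_model VALUES
        (1, 'a@example.com', '2024-01-01'),
        (2, 'old@example.com', '2024-01-01'),
        (2, 'new@example.com', '2024-03-01');
""")

DEDUPE_SQL = """
    WITH ranked AS (
        SELECT
            *,
            -- Number rows per key, newest first, so row_num = 1 is the latest row.
            ROW_NUMBER() OVER (
                PARTITION BY user_id
                ORDER BY updated_at DESC
            ) AS row_num
        FROM users_model
    )
    SELECT user_id, email FROM ranked WHERE row_num = 1 ORDER BY user_id
"""

deduped = conn.execute(DEDUPE_SQL).fetchall()
print(deduped)
```

Ordering by a timestamp is only one option; any column that expresses which duplicate you prefer works the same way.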
