After each sync, Hightouch stores query results and execution plans. On the next sync, Hightouch uses these files from the previous sync to determine which incremental changes to send downstream. Each change falls into one of three operation categories: added, changed, or removed.
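To make the three categories concrete, here is a minimal sketch of a row-level diff between two query results. It illustrates the general idea only and is not Hightouch's implementation; the function name `diff_rows`, the `id` primary key, and the in-memory list-of-dicts representation are all assumptions made for this example.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def diff_rows(previous: List[Row], current: List[Row], key: str = "id") -> Dict[str, List[Row]]:
    """Classify rows as added, changed, or removed between two query results.

    `key` is a hypothetical primary-key column used to match rows across runs.
    """
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}

    # Present now, absent in the previous run.
    added = [row for k, row in curr_by_key.items() if k not in prev_by_key]
    # Present in both runs, but at least one column value differs.
    changed = [
        row for k, row in curr_by_key.items()
        if k in prev_by_key and prev_by_key[k] != row
    ]
    # Present in the previous run, absent now.
    removed = [row for k, row in prev_by_key.items() if k not in curr_by_key]

    return {"added": added, "changed": changed, "removed": removed}
```

Only the rows in these three lists would need to be sent downstream; rows that are identical across both runs are skipped.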
Note: Hightouch also offers warehouse planning (on all tiers), which calculates these diffs directly within your data warehouse. Whether to choose this option depends on your performance requirements (especially for larger syncs) and where you want the compute performed. Because both approaches are available, Hightouch can work with either write access or read-only access to your warehouse with no loss in diffing functionality.
In addition to storing previous query results, Hightouch also stores row-level log metadata, including successes and failures, operations performed, and API request and response payloads. This data powers the in-app debugger and can be stored either in your VPC or in Hightouch’s encrypted bucket.
If you’re on the Free, Starter, or Pro tier, the query data-at-rest used to power Hightouch is stored in a secure, encrypted bucket managed by Hightouch. Data-at-rest used for Change Data Capture & Diffing can instead be kept in your warehouse via warehouse planning; data powering the in-app debugger still rests in Hightouch’s infrastructure. If you require data-at-rest to live entirely in your VPC, see Managed by customer.
Data is automatically expired from Hightouch-managed buckets after 30 days.
If Change Data Capture & Diffing is done in Hightouch-managed buckets, syncs that have not run in over 30 days will require a Full Resync since Hightouch depends on diffing files to detect changes in the data model.
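The rule above can be sketched as a simple date check. This is an illustration only: the helper name `needs_full_resync` is hypothetical, and treating the 30-day window as measured from the last run's timestamp is an assumption, since the text does not state exactly when expiry is evaluated.

```python
from datetime import datetime, timedelta, timezone

# Expiry window stated above for Hightouch-managed buckets.
RETENTION = timedelta(days=30)


def needs_full_resync(last_run: datetime, now: datetime) -> bool:
    """Return True when the previous run's diffing files would have expired.

    Assumes expiry is counted from the time of the last sync run.
    """
    return now - last_run > RETENTION


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(needs_full_resync(now - timedelta(days=45), now))
```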
If you're on the Business tier, you can configure Hightouch to store all customer data-at-rest in your own external storage bucket, hosted in your Amazon S3 or Google Cloud Storage account. Hightouch then processes only data-in-transit and uses this bucket to power its core functionality.
When you use a customer-managed storage bucket, Hightouch places full control over object lifecycle, security, and expiration in your hands. Hightouch will not automatically expire objects or modify your object encryption settings. Ensure that you've configured object expiration, encryption, and access control settings according to your needs.
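As one example of configuring expiration yourself, the sketch below builds an S3 lifecycle configuration that expires objects after a fixed number of days. The structure follows the shape S3 accepts for bucket lifecycle rules; the rule ID, the empty prefix (which matches every object), and the 30-day value (chosen to mirror the Hightouch-managed default mentioned earlier) are assumptions for illustration, not recommendations from Hightouch.

```python
import json


def build_expiration_lifecycle(days: int, rule_id: str = "expire-hightouch-objects") -> dict:
    """Build an S3 lifecycle configuration that expires objects after `days`.

    `rule_id` is a hypothetical name; the empty prefix applies the rule to
    every object in the bucket.
    """
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": days},
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_expiration_lifecycle(30), indent=2))
```

A dictionary of this shape could be applied with the AWS SDK or pasted into the S3 console as JSON. Pick a retention period longer than the gap between your least frequent syncs, since Hightouch relies on files in this bucket.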
If you've already run a sync after setting up a custom storage bucket, you will be unable to make further changes to your storage config. This is because changing your external storage configuration is disruptive to Hightouch syncs. If you need to make such a change, please reach out to customer support.
Hightouch supports authenticating with AWS using cross-account roles (via STS AssumeRole) or with an Access Key ID and Secret Access Key that you provide. We strongly encourage you to use cross-account roles, because they do not require Hightouch to hold any of your secrets.
To set up your Hightouch AWS credential, follow the documentation here.
Hightouch needs the following IAM actions to store and retrieve items from your bucket:
Grants permission to retrieve objects from Amazon S3
Grants permission to add an object to a bucket
Grants permission to list some or all of the objects in an Amazon S3 bucket (up to 1000)
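The three descriptions above match AWS's reference wording for the `s3:GetObject`, `s3:PutObject`, and `s3:ListBucket` actions. The action names themselves do not appear in this text, so treat that mapping as an assumption and confirm it against the Hightouch setup instructions. Under that assumption, a minimal policy could be generated as follows; the bucket name is a placeholder.

```python
import json


def build_bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document granting read, write, and list access.

    Object-level actions apply to objects (bucket ARN plus "/*"), while
    s3:ListBucket applies to the bucket ARN itself.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
        ],
    }


if __name__ == "__main__":
    # "my-hightouch-bucket" is a hypothetical bucket name.
    print(json.dumps(build_bucket_policy("my-hightouch-bucket"), indent=2))
```

Splitting the statements this way keeps each action scoped to the resource type it operates on, which avoids granting object-level permissions on the bucket resource or vice versa.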
After you've saved your Google Cloud bucket settings in the external storage area in Hightouch, run a few syncs and visit your Google Cloud bucket to see the files that are saved there. Please contact us if you have any trouble.