Storage

Learn more about what data Hightouch stores, whether with Hightouch or your own infrastructure, to power your syncs.


What data does Hightouch store?

Hightouch stores data at rest for two purposes:

  1. Change Data Capture & Diffing
  2. In-app Observability & Debugging

Depending on your privacy and compliance needs, Hightouch can be configured to store all data-at-rest within your Virtual Private Cloud (VPC), or in a secure, encrypted bucket hosted by Hightouch.

  • If you’re on the Free, Starter, or Pro tier, this bucket will be hosted by Hightouch. For more info, see Managed by Hightouch.
  • If you’re on the Business Tier, you can self-host in your own infrastructure (Amazon S3 or Google Cloud Storage). For more info, see Managed by customer.

Change Data Capture & Diffing

After each sync, Hightouch stores query results and execution plans. When the next sync runs, Hightouch will use these previous sync files to determine incremental changes that should be sent downstream. These changes fall within three operation categories: added, changed, or removed.
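The comparison described above can be sketched as a keyed diff between the previous and current query results. This is only an illustration, not Hightouch's implementation: the function name `diff_rows`, the use of a primary key as the dictionary key, and the sample rows are all assumptions made for the example.

```python
from typing import Any, Dict, Hashable, List, Mapping

Row = Mapping[str, Any]


def diff_rows(
    previous: Mapping[Hashable, Row],
    current: Mapping[Hashable, Row],
) -> Dict[str, List[Hashable]]:
    """Classify primary keys as added, changed, or removed (hypothetical helper)."""
    # Keys present now but absent from the previous sync's results.
    added = [key for key in current if key not in previous]
    # Keys present in both runs whose row contents differ.
    changed = [
        key for key in current
        if key in previous and current[key] != previous[key]
    ]
    # Keys present in the previous sync's results but missing now.
    removed = [key for key in previous if key not in current]
    return {"added": added, "changed": changed, "removed": removed}


# Illustrative data only: rows keyed by an assumed primary key.
previous = {1: {"email": "a@example.com"}, 2: {"email": "b@example.com"}}
current = {2: {"email": "b@new.example.com"}, 3: {"email": "c@example.com"}}

print(diff_rows(previous, current))
# → {'added': [3], 'changed': [2], 'removed': [1]}
```

Only the rows in these three categories would need to be sent downstream; unchanged rows are skipped.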

Note: Hightouch also offers warehouse planning (on all tiers), which calculates these diffs directly within your data warehouse. Whether to choose this option depends on your performance requirements (especially for larger syncs) and where you want the compute performed. Thanks to warehouse planning, Hightouch can have either write or read-only access to your warehouse with no loss in diffing functionality.

Learn more about how Change Data Capture & Diffing works in the Hightouch Core Concepts.


Observability & Debugger

In addition to storing previous query results, Hightouch also stores row-level log metadata, including successes and failures, operations performed, and API request and response payloads. This data powers the in-app debugger and can be stored either in your VPC or in Hightouch’s encrypted bucket.


Managed by Hightouch

If you’re on the Free, Starter, or Pro tier, query data-at-rest used to power Hightouch is stored in a secure, encrypted bucket managed by Hightouch. Data at-rest used for Change Data Capture & Diffing can be configured in your warehouse via warehouse planning; data powering the in-app debugger will rest in Hightouch’s infrastructure. If you require data-at-rest to live entirely in your VPC, see Managed by customer.

Data Retention

Data is automatically expired from Hightouch-managed buckets after 30 days.

If Change Data Capture & Diffing is done in Hightouch-managed buckets, syncs that have not run in over 30 days will require a Full Resync since Hightouch depends on diffing files to detect changes in the data model.
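As a rough illustration of the retention rule above, the check below flags a sync whose last run is more than 30 days old. The function name `needs_full_resync` and the timestamps are hypothetical; how Hightouch actually detects expired diffing files is not described here.

```python
from datetime import datetime, timedelta, timezone

# Retention window for Hightouch-managed buckets, as stated above.
RETENTION = timedelta(days=30)


def needs_full_resync(last_run: datetime, now: datetime) -> bool:
    """Return True when the previous sync files would already have expired."""
    return now - last_run > RETENTION


# Illustrative timestamps only.
now = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(needs_full_resync(datetime(2024, 1, 15, tzinfo=timezone.utc), now))  # 46 days ago
print(needs_full_resync(datetime(2024, 2, 20, tzinfo=timezone.utc), now))  # 10 days ago
```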


Managed by customer

If you’re on the Business Tier, you can configure Hightouch to store all customer data-at-rest within your own external storage bucket, hosted in your Amazon S3 or Google Cloud Storage account. Hightouch then only processes data-in-transit and uses this bucket to power its core functionality.

When using a customer-managed storage bucket, Hightouch places full control over object lifecycle, security, and expiration into your hands. We will not expire objects automatically, or modify your object encryption settings. Ensure that you've configured object expiration, encryption, and access control settings according to your needs.

If you've already run a sync after setting up a custom storage bucket, you will be unable to make further changes to your storage config. This is because changing your external storage configuration is disruptive to Hightouch syncs. If you need to make such a change, please reach out to customer support.

Amazon S3

Before getting started, connect Hightouch with your Amazon Web Services account.

Create your S3 bucket

In Amazon S3, create your bucket. We recommend the name <company>-hightouch.

Make sure to:

  • Block all public access to the bucket.
  • Enable Amazon S3 key encryption (SSE-S3).
  • Disable bucket versioning.
  • Configure your bucket object lifecycle to enhance security and cut down on costs.
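The checklist above maps onto three S3 configuration documents. The sketch below only builds and prints them; it does not call AWS. The function name, the rule ID, and the 30-day expiry are assumptions for illustration (30 days mirrors the Hightouch-managed retention period, but you should pick a value that suits your own sync cadence). New S3 buckets are unversioned by default, so no versioning document is needed.

```python
import json


def recommended_bucket_settings(expire_after_days: int) -> dict:
    """Build S3 configuration documents for the checklist (hypothetical helper)."""
    return {
        # Block all public access to the bucket.
        "public_access_block": {
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
        # Default encryption with Amazon S3 managed keys (SSE-S3).
        "encryption": {
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
        # Expire every object after the chosen number of days.
        "lifecycle": {
            "Rules": [
                {
                    "ID": "expire-hightouch-objects",  # assumed rule name
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},  # empty prefix matches all objects
                    "Expiration": {"Days": expire_after_days},
                }
            ]
        },
    }


settings = recommended_bucket_settings(expire_after_days=30)
print(json.dumps(settings, indent=2))
```

Each document could then be applied through the S3 console, the AWS CLI, or an SDK. Because diffing relies on files from the previous sync, an expiry window shorter than the longest gap between sync runs would likely force a full resync.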

Authenticate Hightouch with AWS

Hightouch supports authenticating with AWS using cross-account roles (via STS AssumeRole), or with an Access Key ID / Secret Access Key that you provide. We strongly encourage you to use cross-account roles, as they do not require Hightouch to hold any of your secrets.

To set up your Hightouch AWS credential, follow the documentation here.

Hightouch needs the following IAM actions to store and retrieve items from your bucket:

Action          Details
s3:GetObject    Grants permission to retrieve objects from Amazon S3
s3:PutObject    Grants permission to add an object to a bucket
s3:ListBucket   Grants permission to list some or all of the objects in an Amazon S3 bucket (up to 1000)
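One way to express the three actions above is an IAM policy document scoped to a single bucket. The sketch below builds such a policy; the function name, the statement IDs, and the example bucket name `acme-hightouch` are assumptions. Note that s3:GetObject and s3:PutObject apply to object ARNs (the bucket ARN followed by `/*`), while s3:ListBucket applies to the bucket ARN itself.

```python
import json


def hightouch_bucket_policy(bucket_name: str) -> dict:
    """Build a least-privilege IAM policy for one bucket (hypothetical helper)."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadWriteObjects",  # assumed statement ID
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",  # object-level actions
            },
            {
                "Sid": "ListBucket",  # assumed statement ID
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": bucket_arn,  # bucket-level action
            },
        ],
    }


# "acme-hightouch" follows the recommended <company>-hightouch naming pattern.
print(json.dumps(hightouch_bucket_policy("acme-hightouch"), indent=2))
```

The printed JSON can be attached to the role or user that Hightouch authenticates as.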

Configure your bucket in Hightouch

Access the external bucket settings under Settings > Storage.

Select your AWS region, enter your bucket name, and select the AWS credentials you set up in the Authenticate Hightouch with AWS step.

Once you save your settings, your new syncs will automatically start using your bucket.

Run a sync to test it out!


Google Cloud Storage

Before getting started, connect Hightouch with your Google Cloud account.

Create a bucket

We recommend the name <company>-hightouch-bucket. Copy the bucket name and save it for later. Configure your bucket object lifecycle to enhance security and cut down on costs.

Authenticate Hightouch with Google Cloud

Hightouch supports authenticating with GCP using Hightouch-managed service accounts, or by using a service account that you control.

To set up your Hightouch GCP credential, follow the documentation here.

Hightouch needs the following IAM permissions to store and retrieve items from your bucket:

Permission               Details
storage.objects.list     Grants access to view objects and their metadata, excluding ACLs. Can also list the objects in a bucket.
storage.objects.create   Grants permission to create, replace, and delete objects; list objects in a bucket; read object metadata when listing (excluding IAM policies); and read bucket metadata, excluding IAM policies.
storage.objects.get*     Grants access to view objects and their metadata, excluding ACLs. Can also list the objects in a bucket.
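If you prefer a dedicated custom role over a broader predefined one, the three permissions above could be bundled as sketched below. The dictionary follows the shape of an IAM custom role definition (title, description, stage, includedPermissions); the function name, title, and description are assumptions, and this is not an official Hightouch role.

```python
import json


def hightouch_storage_role() -> dict:
    """Build a custom role definition with the listed permissions (hypothetical helper)."""
    return {
        "title": "Hightouch Storage Access",  # assumed title
        "description": "Lets Hightouch list, read, and write objects in its bucket.",
        "stage": "GA",
        "includedPermissions": [
            "storage.objects.list",
            "storage.objects.create",
            "storage.objects.get",
        ],
    }


print(json.dumps(hightouch_storage_role(), indent=2))
```

Granting the role on the bucket itself, rather than on the whole project, keeps the service account's access limited to the Hightouch bucket.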

Enter configuration details in Hightouch

Back in Hightouch, under Settings > Storage, enter the project name and bucket name. Select the GCP credentials you set up in the Authenticate Hightouch with Google Cloud step.

Don't forget to click 'save'.

Testing

After you've saved your Google Cloud bucket settings in the external storage area in Hightouch, run a few syncs and visit your Google Cloud bucket to see the files that are saved there. Please contact us if you have any trouble.
