Storage

Hightouch needs to store some data to power your syncs and to power features like the sync debugger. You have full control over where this data is stored.

If you prefer more direct control or have specific security concerns, you can configure Hightouch to store this data in your own Amazon S3, Google Cloud Storage, or Azure Blob Storage account. If you prefer convenience, Hightouch can store this data in our own secure, encrypted infrastructure.

What data does Hightouch store?

Hightouch stores data at-rest for two purposes:

  1. Change data capture
  2. In-app observability and debugging

Depending on your privacy and compliance needs, you can configure Hightouch to store all data-at-rest within your Virtual Private Cloud (VPC), or in a secure, encrypted bucket.

Change data capture

To avoid making excessive API requests and to send only necessary updates to your destinations, Hightouch uses a process called change data capture (CDC), also known as diffing.

In this process, Hightouch stores query results and execution plans after each sync run. When the next sync run occurs, Hightouch uses these diff files to determine incremental changes that should be sent downstream.
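The idea behind diffing can be illustrated with a minimal sketch. It is not Hightouch's implementation: the function name, the row shape, and the use of a single primary key column are all assumptions made for illustration. It compares the stored result of the previous run with the current result and returns only the rows that were added, changed, or removed.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def diff_snapshots(
    previous: List[Row], current: List[Row], primary_key: str
) -> Tuple[List[Row], List[Row], List[Row]]:
    """Return (added, changed, removed) rows between two query results."""
    # Index both snapshots by primary key so rows can be matched up.
    prev_by_key = {row[primary_key]: row for row in previous}
    curr_by_key = {row[primary_key]: row for row in current}

    # Rows whose key only exists in the current snapshot.
    added = [row for key, row in curr_by_key.items() if key not in prev_by_key]
    # Rows present in both snapshots whose contents differ.
    changed = [
        row
        for key, row in curr_by_key.items()
        if key in prev_by_key and prev_by_key[key] != row
    ]
    # Rows whose key only exists in the previous snapshot.
    removed = [row for key, row in prev_by_key.items() if key not in curr_by_key]
    return added, changed, removed


# Hypothetical query results from two consecutive sync runs.
previous_run = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
]
current_run = [
    {"id": 2, "email": "b+new@example.com"},
    {"id": 3, "email": "c@example.com"},
]

added, changed, removed = diff_snapshots(previous_run, current_run, "id")
```

In this sketch only three rows would need to be sent downstream, one per category, instead of the full query result.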

By default, Hightouch stores the diff files in a secure, encrypted bucket hosted by Hightouch. Business tier accounts can use the Lightning sync engine to compute and store diffs directly within their data warehouse.

To learn more about CDC, check out the core concepts docs.

Observability and debugging

In addition to storing previous query results, Hightouch stores row-level log metadata including successes and failures, operations performed, and API request and response payloads. This data powers the in-app debugger and can be stored either in your VPC or Hightouch's encrypted bucket.

Managed storage

If you're on a Free, Starter, or Pro plan, Hightouch stores data-at-rest in a secure, encrypted, Hightouch-managed bucket. For workspaces running in a Hightouch AWS region, this is an Amazon S3 bucket. For workspaces running in a Hightouch Google Cloud region, this is a Google Cloud Storage bucket. No matter your region, all data in Hightouch-managed buckets is encrypted at rest. If you require data-at-rest to live entirely in your VPC, see self-hosted storage.

Data retention

Data automatically expires from Hightouch-managed buckets after 30 days. If change data capture is done in Hightouch-managed buckets, syncs that have not run in over 30 days will require a Full Resync or Reset CDC sync since Hightouch depends on diffing files to detect changes in the data model.
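The retention rule can be expressed as a small check. This is a sketch for reasoning about the rule, not a Hightouch API: the function name and the way the last run time is obtained are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Retention window for Hightouch-managed buckets, as stated above.
RETENTION = timedelta(days=30)


def diff_files_expired(last_run_at: datetime, now: datetime) -> bool:
    """Return True if the last sync run is older than the retention window."""
    return now - last_run_at > RETENTION


# Hypothetical timestamps for illustration.
now = datetime(2024, 7, 22, tzinfo=timezone.utc)
recent_run = datetime(2024, 7, 1, tzinfo=timezone.utc)
stale_run = datetime(2024, 5, 1, tzinfo=timezone.utc)

needs_resync_recent = diff_files_expired(recent_run, now)
needs_resync_stale = diff_files_expired(stale_run, now)
```

A sync for which this check is true would fall into the case described above and need a Full Resync or Reset CDC sync.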

Self-hosted storage

Business tier customers can configure Hightouch to store all customer data-at-rest within their own external storage bucket or blob. Hightouch integrates with these cloud storage providers:

  • Amazon S3
  • Google Cloud Storage (GCS)
  • Microsoft Azure Blob Storage

If you choose to self-host your storage, Hightouch only processes data-in-transit. You can select any supported storage provider to store your data, regardless of your Hightouch region.

When hosting your own storage, Hightouch places full control over object lifecycle, security, and expiration into your hands. We don't expire objects automatically or change your object encryption settings. Ensure that you've configured object expiration, encryption, and access control settings according to your needs.

Setting up self-hosted storage disrupts the change data capture process for active syncs. To reset it, after you've configured self-hosted storage, you need to trigger a Full Resync or Reset CDC sync for all existing syncs that previously ran with Hightouch-managed storage. Make sure that all your syncs satisfy the full resync prerequisites before setting up self-hosted storage. Don't hesitate to reach out if you have any doubts or concerns.

Once you've run a sync after setting up a custom storage bucket, you can't make further changes to your storage configuration, including disabling it. This is because changing your storage configuration is disruptive to Hightouch syncs. If you need to make such a change, please reach out.

Amazon S3

Before getting started, connect Hightouch with your Amazon Web Services account.

Create your S3 bucket

In Amazon S3, create your bucket. We recommend the name <company>-hightouch.

Make sure to:

  • Block all public access to the bucket.
  • Enable server-side encryption with Amazon S3 managed keys (SSE-S3). If using SSE-KMS for encryption, you may need to update your IAM policies to grant Hightouch access.
  • Disable bucket versioning.
  • Configure your bucket object lifecycle to enhance security and cut down on costs.
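As one way to approach the lifecycle bullet, the sketch below builds an S3 lifecycle configuration that expires every object after a fixed number of days. The rule ID and the 90-day value are assumptions, not Hightouch recommendations. Choose the expiration with the retention behavior described earlier in mind, since diff files are what change data capture relies on.

```python
import json


def lifecycle_configuration(expire_after_days: int) -> dict:
    """Build an S3 lifecycle configuration that expires all objects."""
    if expire_after_days < 1:
        raise ValueError("expire_after_days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "expire-hightouch-objects",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches every object
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# 90 days is an illustrative value only.
config = lifecycle_configuration(90)
print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file such as lifecycle.json and applied with `aws s3api put-bucket-lifecycle-configuration --bucket <bucket> --lifecycle-configuration file://lifecycle.json`.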

Authenticate Hightouch with AWS

Hightouch supports authenticating with AWS using cross-account roles (via STS AssumeRole), or with an Access Key ID / Secret Access Key that you provide. We strongly encourage you to use cross-account roles, because they don't require Hightouch to hold any of your secrets.

To set up your Hightouch AWS credential, follow our connection instructions.

Hightouch needs the following IAM actions to store and retrieve items from your bucket:

Action          Details
s3:GetObject    Grants permission to retrieve objects from Amazon S3
s3:PutObject    Grants permission to add an object to a bucket
s3:ListBucket   Grants permission to list some or all the objects in an Amazon S3 bucket (up to 1000)

You can use the following JSON sample to create your IAM policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Sample",
            "Effect": "Allow",
            "Action": [
              "s3:PutObject",
              "s3:GetObject",
              "s3:ListBucket"
            ],
            "Resource": [
              "arn:aws:s3:::${bucketName}/*",
              "arn:aws:s3:::${bucketName}"
            ]
        }
    ]
}
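If you'd rather generate the policy than edit the ${bucketName} placeholder by hand, a small helper along these lines could fill it in. The function name and the example bucket name are hypothetical. The policy content mirrors the sample above.

```python
import json


def hightouch_bucket_policy(bucket_name: str) -> str:
    """Render the sample IAM policy for a specific bucket name."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Sample",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
                # Object-level actions use the /* resource;
                # s3:ListBucket applies to the bucket ARN itself.
                "Resource": [f"{bucket_arn}/*", bucket_arn],
            }
        ],
    }
    return json.dumps(policy, indent=4)


# "acme-hightouch" follows the <company>-hightouch naming suggestion.
policy_json = hightouch_bucket_policy("acme-hightouch")
print(policy_json)
```

Both resource entries are needed because s3:GetObject and s3:PutObject act on objects, while s3:ListBucket acts on the bucket.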

Configure your bucket in Hightouch

  1. In Hightouch, on the Storage tab of the Settings page, select Amazon S3 as the Cloud provider.
  2. Select your AWS region, enter your Bucket name, and select the AWS credentials you previously set up.
  3. Click Save.

Once you save your settings, your new syncs automatically start using your bucket. Run a few syncs and visit your S3 bucket to check that files are being saved there. Don't hesitate to reach out if you have any questions.

Common errors

If you receive a CredentialsError: Missing credentials in configuration error, refer to the troubleshooting tips in the AWS integration docs.

Google Cloud Storage

Before getting started, connect Hightouch with your Google Cloud account.

Create a bucket

In the Google Cloud console, create a new bucket. We recommend the name <company>-hightouch-bucket. Copy the bucket name and save it for later. Configure your bucket object lifecycle to enhance security and cut down on costs.

Authenticate with Google Cloud

Hightouch supports authenticating with GCP using Hightouch-managed service accounts, or by using a service account that you control.

To set up your Hightouch GCP credential, follow our connection instructions.

Hightouch needs the following IAM permissions to store and retrieve items from your bucket:

Permission               Details
storage.objects.list     Grants access to view objects and their metadata, excluding ACLs. Can also list the objects in a bucket.
storage.objects.create   Grants permission to create, replace, and delete objects; list objects in a bucket; read object metadata when listing (excluding IAM policies); and read bucket metadata, excluding IAM policies.
storage.objects.get      Grants access to view objects and their metadata, excluding ACLs. Can also list the objects in a bucket.
storage.objects.delete   Grants permission to overwrite existing objects.
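Before saving your configuration, it can help to compare the permissions your service account actually holds against the four listed above. The sketch below does only the comparison. How you obtain the granted list is up to you (for example, from the bucket's testIamPermissions API method), and the function name is an assumption.

```python
from typing import Iterable, List

# The permissions listed in the table above.
REQUIRED_PERMISSIONS = frozenset(
    {
        "storage.objects.list",
        "storage.objects.create",
        "storage.objects.get",
        "storage.objects.delete",
    }
)


def missing_permissions(granted: Iterable[str]) -> List[str]:
    """Return the required permissions absent from the granted set."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Hypothetical result for a read-only service account.
missing = missing_permissions(["storage.objects.get", "storage.objects.list"])
print(missing)
```

An empty result means every permission in the table is present.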

Configure your bucket in Hightouch

  1. In Hightouch, on the Storage tab of the Settings page, select Google Cloud Storage as the Cloud provider.
  2. Enter the Project ID and Bucket name and select the Google Cloud credentials you previously set up.
  3. Click Save.

Once you save your settings, your new syncs automatically start using your bucket. Run a few syncs and visit your Google Cloud bucket to check that files are being saved there. Don't hesitate to reach out if you have any questions.

Microsoft Azure Blob Storage

Before getting started, connect Hightouch with your Azure account.

Create a container

In your Azure portal, create a container. We recommend the name <company>-hightouch-container. Copy the container name and save it for later. Configure your storage lifecycle to enhance security and cut down on costs.

Authenticate with Azure

The easiest way to grant Hightouch access is to grant the app the Storage Blob Data Contributor role for the storage account. Alternatively, you may grant only the Storage Blob Delegator role at the account level, and Storage Blob Data Contributor for the storage container.

If you want to create a custom role with more granular permissions, Hightouch needs the following IAM permissions to store and retrieve items from your Blob Storage Container:

  • "actions"."Microsoft.Storage/storageAccounts/blobServices/containers/read"
  • "dataActions"."Microsoft.Storage/storageAccounts/blobServices/containers/blobs/add/action"
  • "dataActions"."Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write"

Some functionality may require delegation to be granted on the Blob Storage account:

  • "actions"."Microsoft.Storage/storageAccounts/blobServices/generateUserDelegationKey/action"
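The permissions above could be assembled into an Azure custom role definition roughly as follows. This is a sketch under assumptions: the role name, description, and subscription-level scope are hypothetical, and only the permission strings come from the lists above.

```python
import json

BLOB_SERVICES = "Microsoft.Storage/storageAccounts/blobServices"


def custom_role_definition(
    subscription_id: str,
    role_name: str = "Hightouch Storage Access",  # hypothetical name
    include_delegation: bool = True,
) -> dict:
    """Build an Azure custom role definition from the listed permissions."""
    actions = [f"{BLOB_SERVICES}/containers/read"]
    if include_delegation:
        # Optional: only needed for functionality that requires delegation.
        actions.append(f"{BLOB_SERVICES}/generateUserDelegationKey/action")
    return {
        "Name": role_name,
        "IsCustom": True,
        "Description": "Lets Hightouch store and retrieve blobs.",
        "Actions": actions,
        "NotActions": [],
        "DataActions": [
            f"{BLOB_SERVICES}/containers/blobs/add/action",
            f"{BLOB_SERVICES}/containers/blobs/write",
        ],
        "NotDataActions": [],
        # Assumed scope; narrow it to a resource group or account if needed.
        "AssignableScopes": [f"/subscriptions/{subscription_id}"],
    }


# Placeholder subscription ID for illustration.
role = custom_role_definition("00000000-0000-0000-0000-000000000000")
print(json.dumps(role, indent=2))
```

The printed JSON could be saved to a file and passed to `az role definition create --role-definition @role.json`.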

Configure your container in Hightouch

  1. In Hightouch, on the Storage tab of the Settings page, select Microsoft Azure as the Cloud provider.
  2. Enter the Account name and Container name from the Azure portal (see connecting Hightouch with your Azure account), and select the Azure credentials you previously set up.
  3. Click Save.

Once you save your settings, your new syncs automatically start using your container. Run a few syncs and visit your Blob Storage container to check that files are being saved there. Don't hesitate to reach out if you have any questions.

Common errors

Your Azure storage configuration may fail to save for a few reasons:

  • invalid_client: Check that you've provided your client secret value, not client secret ID. See the configuration docs for more information.

  • This request is not authorized to perform this operation: Check that Hightouch has adequate access to your storage account. If you've configured your storage account's Network access to be from selected virtual networks and IP addresses, be sure to add Hightouch's IP addresses to your network rules.

Last updated: Jul 22, 2024
