Hightouch needs to store some data to power your syncs and to power features like the sync debugger. You have full control over where this data is stored.
If you prefer more direct control or have specific security concerns, you can configure Hightouch to store this data in your own Amazon S3, Google Cloud Storage, or Azure Blob Storage account. If you prefer convenience, Hightouch can store this data in our own secure, encrypted infrastructure.
What data does Hightouch store?
Hightouch stores data-at-rest for two purposes: change data capture, and observability and debugging.
Depending on your privacy and compliance needs, you can configure Hightouch to store all data-at-rest within your Virtual Private Cloud (VPC), or in a secure, encrypted bucket.
- If you're on the Free, Starter, or Pro tier, Hightouch hosts and manages a secure encrypted bucket on your behalf. For more information, see the managed storage section.
- If you're on the Business tier, you can self-host storage in your own infrastructure. For more information, see the self-hosted storage section.
Change data capture
To avoid excessive API requests and send only necessary updates to your destinations, Hightouch uses a process called change data capture (CDC), or diffing.
In this process, Hightouch stores query results and execution plans after each sync run. When the next sync run occurs, Hightouch uses these diff files to determine incremental changes that should be sent downstream.
By default, Hightouch stores the diff files in a secure encrypted bucket hosted by Hightouch. Business tier accounts can use the Lightning sync engine to compute and store diffs directly within their data warehouse.
To learn more about CDC, check out the core concepts docs.
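As a rough illustration of the diffing idea described above, the following minimal sketch compares the stored results of a previous run with the current query results, keyed by a primary key. It is not Hightouch's implementation; the function name `diff_rows`, the row shape, and the sample data are all assumptions made for this example.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def diff_rows(
    previous: List[Row], current: List[Row], primary_key: str
) -> Tuple[List[Row], List[Row], List[Row]]:
    """Return (added, changed, removed) rows between two query results."""
    prev_by_key = {row[primary_key]: row for row in previous}
    curr_by_key = {row[primary_key]: row for row in current}

    # Rows whose key did not exist in the previous run.
    added = [row for key, row in curr_by_key.items() if key not in prev_by_key]
    # Rows whose key existed before but whose contents differ now.
    changed = [
        row
        for key, row in curr_by_key.items()
        if key in prev_by_key and prev_by_key[key] != row
    ]
    # Rows that were present in the previous run but are gone now.
    removed = [row for key, row in prev_by_key.items() if key not in curr_by_key]
    return added, changed, removed


# Hypothetical sample data standing in for two consecutive sync runs.
previous_run = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
]
current_run = [
    {"id": 2, "email": "b+new@example.com"},
    {"id": 3, "email": "c@example.com"},
]
added, changed, removed = diff_rows(previous_run, current_run, "id")
```

Only the rows in `added`, `changed`, and `removed` would need to be sent downstream, which is why the previous run's results have to be stored somewhere between runs.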
Observability and debugging
In addition to storing previous query results, Hightouch stores row-level log metadata including successes and failures, operations performed, and API request and response payloads. This data powers the in-app debugger and can be stored either in your VPC or Hightouch's encrypted bucket.
Managed storage
If you're on a Free, Starter, or Pro plan, Hightouch stores data-at-rest in a secure, encrypted, Hightouch-managed bucket. For workspaces running in a Hightouch AWS region, this is an Amazon S3 bucket. For workspaces running in a Hightouch Google Cloud region, this is a Google Cloud Storage bucket. No matter your region, all data in Hightouch-managed buckets is encrypted at rest. If you require data-at-rest to live entirely in your VPC, see self-hosted storage.
Data retention
Data automatically expires from Hightouch-managed buckets after 30 days. If change data capture is done in Hightouch-managed buckets, syncs that have not run in over 30 days will require a Full Resync or Reset CDC sync since Hightouch depends on diffing files to detect changes in the data model.
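The retention rule above can be expressed as a small check. This is only a sketch: the helper name `needs_full_resync` is hypothetical, and it assumes you track each sync's last run time yourself.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Retention window for Hightouch-managed buckets, as stated above.
RETENTION = timedelta(days=30)


def needs_full_resync(last_run_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True if the last run is older than the retention window.

    In that case the diff files have expired, so incremental changes
    can no longer be computed from them.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_run_at > RETENTION
```

For example, a sync that last ran 60 days ago would return `True`, while one that ran 15 days ago would return `False`.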
Self-hosted storage
Business tier customers can configure Hightouch to store all customer data-at-rest within their own external storage bucket or blob. Hightouch integrates with these cloud storage providers:
- Amazon S3
- Google Cloud Storage (GCS)
- Microsoft Azure Blob Storage
If you choose to self-host your storage, Hightouch only processes data-in-transit. You can select any supported storage provider to store your data, regardless of your Hightouch region.
When hosting your own storage, Hightouch places full control over object lifecycle, security, and expiration into your hands. We don't expire objects automatically or change your object encryption settings. Ensure that you've configured object expiration, encryption, and access control settings according to your needs.
Setting up self-hosted storage disrupts the change data capture process for active syncs. To reset it, after you've configured self-hosted storage, you need to trigger a Full Resync or Reset CDC sync for all existing syncs that previously ran with Hightouch-managed storage. Make sure that all your syncs satisfy the full resync prerequisites before setting up self-hosted storage. Don't hesitate to reach out if you have any doubts or concerns.
Once you've run a sync after setting up a custom storage bucket, you can't make further changes to your storage configuration, including disabling it. This is because changing your storage configuration is disruptive to Hightouch syncs. If you need to make such a change, please reach out to Hightouch.
Amazon S3
Before getting started, connect Hightouch with your Amazon Web Services account.
Create your S3 bucket
In Amazon S3, create your bucket. We recommend the name <company>-hightouch.
Make sure to:
- Block all public access to the bucket.
- Enable Amazon S3 key encryption (SSE-S3). If using SSE-KMS for encryption, you may need to update your IAM policies to grant Hightouch access.
- Disable bucket versioning.
- Configure your bucket object lifecycle to enhance security and cut down on costs.
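As one possible starting point for the lifecycle step above, this sketch builds an S3 lifecycle configuration that expires objects after a fixed number of days. The 30-day value simply mirrors the retention used by Hightouch-managed buckets and is an assumption, as are the helper name and rule ID. Choose a window that suits your sync schedule, since expired diff files can't be used for change data capture.

```python
import json


def expiration_lifecycle(days: int, prefix: str = "") -> dict:
    """Build an S3 lifecycle configuration that expires objects after `days`."""
    if days <= 0:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix matches every object
                "Expiration": {"Days": days},
            }
        ]
    }


config = expiration_lifecycle(30)
print(json.dumps(config, indent=2))

# Assuming boto3 is installed and AWS credentials are configured,
# the configuration could then be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="<company>-hightouch", LifecycleConfiguration=config
#   )
```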
Authenticate Hightouch with AWS
Hightouch supports authenticating with AWS using cross-account roles (via STS AssumeRole), or with an Access Key ID and Secret Access Key that you provide. We strongly encourage you to use cross-account roles, since they don't require Hightouch to hold any of your secrets.
To set up your Hightouch AWS credential, follow our connection instructions.
Hightouch needs the following IAM actions to store and retrieve items from your bucket:
Action | Details |
---|---|
s3:GetObject | Grants permission to retrieve objects from Amazon S3 |
s3:PutObject | Grants permission to add an object to a bucket |
s3:ListBucket | Grants permission to list some or all the objects in an Amazon S3 bucket (up to 1000) |
You can use the following JSON sample to create your IAM policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Sample",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::${bucketName}/*",
        "arn:aws:s3:::${bucketName}"
      ]
    }
  ]
}
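If you prefer to generate the policy rather than edit the placeholder by hand, a minimal sketch like the following fills in the bucket name. The function name `hightouch_bucket_policy` and the example bucket name are hypothetical; the policy body matches the sample above.

```python
import json


def hightouch_bucket_policy(bucket_name: str) -> str:
    """Render the sample IAM policy above for a specific bucket."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Sample",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}/*",  # objects in the bucket
                    f"arn:aws:s3:::{bucket_name}",  # the bucket itself
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)


# "acme-hightouch" is a placeholder bucket name.
print(hightouch_bucket_policy("acme-hightouch"))
```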
Configure your bucket in Hightouch
- In Hightouch, on the Storage tab of the Settings page, select Amazon S3 as the Cloud provider.
- Select your AWS region, enter your Bucket name, and select the AWS credentials you previously set up.
- Click Save.
Once you save your settings, your new syncs automatically start using your bucket. Run a few syncs and visit your S3 bucket to check that files are being saved there. Don't hesitate to reach out if you have any questions.
Common errors
If you receive a CredentialsError: Missing credentials in configuration error, refer to the troubleshooting tips in the AWS integration docs.
Google Cloud Storage
Before getting started, connect Hightouch with your Google Cloud account.
Create a bucket
In the Google Cloud console, create a new bucket. We recommend the name <company>-hightouch-bucket. Copy the bucket name and save it for later.
Configure your bucket object lifecycle to enhance security and cut down on costs.
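For the lifecycle step above, the following sketch writes a Google Cloud Storage lifecycle file that deletes objects after a given age. The 30-day value, the helper name, and the output file name are assumptions for illustration only.

```python
import json


def gcs_delete_after(days: int) -> dict:
    """Build a GCS lifecycle configuration that deletes objects older than `days`."""
    if days <= 0:
        raise ValueError("days must be a positive integer")
    return {"rule": [{"action": {"type": "Delete"}, "condition": {"age": days}}]}


lifecycle = gcs_delete_after(30)

# Write the configuration to a local file (hypothetical file name).
with open("lifecycle.json", "w", encoding="utf-8") as handle:
    json.dump(lifecycle, handle, indent=2)

# Assuming the gcloud CLI is installed, the file could then be applied with:
#   gcloud storage buckets update gs://<company>-hightouch-bucket \
#       --lifecycle-file=lifecycle.json
```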
Authenticate with Google Cloud
Hightouch supports authenticating with GCP using Hightouch-managed service accounts, or by using a service account that you control.
To set up your Hightouch GCP credential, follow our connection instructions.
Hightouch needs the following IAM permissions to store and retrieve items from your bucket:
Permission | Details |
---|---|
storage.objects.list | Grants permission to list the objects in a bucket and to read object metadata, excluding ACLs, when listing. |
storage.objects.create | Grants permission to add new objects to a bucket. |
storage.objects.get | Grants permission to read object data and metadata, excluding ACLs. |
storage.objects.delete | Grants permission to delete objects. Google Cloud Storage also requires it to overwrite existing objects. |
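Before saving the configuration, it can help to confirm that the credential holds every permission in the table. The sketch below only compares permission names; the helper `missing_permissions` is hypothetical, and how you obtain the granted set is up to you.

```python
from typing import Iterable, List

# The permissions listed in the table above.
REQUIRED_PERMISSIONS = frozenset(
    {
        "storage.objects.list",
        "storage.objects.create",
        "storage.objects.get",
        "storage.objects.delete",
    }
)


def missing_permissions(granted: Iterable[str]) -> List[str]:
    """Return the required permissions absent from `granted`, sorted by name."""
    return sorted(REQUIRED_PERMISSIONS - set(granted))


# Assuming the google-cloud-storage package is installed, the granted set
# could come from Bucket.test_iam_permissions, for example:
#   bucket = storage.Client().bucket("<company>-hightouch-bucket")
#   granted = bucket.test_iam_permissions(list(REQUIRED_PERMISSIONS))
granted_example = ["storage.objects.get", "storage.objects.list"]
print(missing_permissions(granted_example))
```

An empty list means the credential has everything Hightouch needs for the bucket.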
Configure your bucket in Hightouch
- In Hightouch, on the Storage tab of the Settings page, select Google Cloud Storage as the Cloud provider.
- Enter the Project ID and Bucket name and select the Google Cloud credentials you previously set up.
- Click Save.
Once you save your settings, your new syncs automatically start using your bucket. Run a few syncs and visit your Google Cloud bucket to check that files are being saved there. Don't hesitate to reach out if you have any questions.
Microsoft Azure Blob Storage
Before getting started, connect Hightouch with your Azure account.
Create a container
In your Azure portal, create a container. We recommend the name <company>-hightouch-container. Copy the container name and save it for later.
Configure your storage lifecycle to enhance security and cut down on costs.
Authenticate with Azure
The easiest way to grant Hightouch access is to grant the app the Storage Blob Data Contributor role for the storage account. Alternatively, you may grant only the Storage Blob Delegator role at the account level, and Storage Blob Data Contributor for the storage container.
If you want to create a custom role with more granular permissions, Hightouch needs the following IAM permissions to store and retrieve items from your Blob Storage Container:
"actions"."Microsoft.Storage/storageAccounts/blobServices/containers/read"
"dataActions"."Microsoft.Storage/storageAccounts/blobServices/containers/blobs/add/action"
"dataActions"."Microsoft.Storage/storageAccounts/blobServices/containers/blobs/write"
Some functionality may require delegation to be granted on the Blob Storage account:
"actions"."Microsoft.Storage/storageAccounts/blobServices/generateUserDelegationKey/action"
Configure your container in Hightouch
- In Hightouch, on the Storage tab of the Settings page, select Microsoft Azure as the Cloud provider.
- Enter the Account name and Container name from the Azure portal (see connecting Hightouch with your Azure account), and select the Azure credentials you previously set up.
- Click Save.
Once you save your settings, your new syncs automatically start using your container. Run a few syncs and visit your Blob Storage container to check that files are being saved there. Don't hesitate to reach out if you have any questions.
Common errors
Your Azure storage configuration may fail to save for a few reasons:
- invalid_client: Check that you've provided your client secret value, not the client secret ID. See the configuration docs for more information.
- This request is not authorized to perform this operation: Check that Hightouch has adequate access to your storage account. If you've configured your storage account's Network access to be from selected virtual networks and IP addresses, be sure to add Hightouch's IP addresses to your network rules.