Search documentation...

K

Google Cloud Functions

Build your own custom destination using a serverless function

Overview

This destination invokes a Google Cloud Function whenever data changes in your model. It makes it possible to build your own custom integrations that execute code written in Javascrip, Java, Python, Go, Ruby and PHP. See how to write HTTP function in Google Cloud Functions for more details.

This destination was designed to be as flexible as possible. You can exercise granular control over function triggers, batching, rate limits, concurrency limits, and even error handling. Together, these features let you integrate Hightouch with any internal tool or third-party API.

Under the hood, the Google Cloud Functions destination gives you Hightouch's powerful sync engine, so you continue to benefit from the security and observability features available in our native SaaS destinations.

Example use cases include:

  • Syncing data to web APIs not yet natively supported
  • Enriching data using external sources like Clearbit and ZoomInfo
  • Transforming and filtering data using code instead of SQL

Getting started

Connect to your Google Cloud Functions

When setting up the Google Cloud Functions destination for the first time, you need to fill out the following:

  • GCP credentials for authenticating you function invocations.
  • ProjectID of the GCP project where your function lives.
  • Region for the location of your function.
  • The function name of the function you want to invoke when sync runs.

Hightouch recommends that you follow the principle of least privilege when providing permissions with your service accounts. This destination only requires a cloudfunctions.invoker role for the resource pointing to the function.

Syncing data

Once you've completed setup for the Google Cloud Functions destination in your Hightouch workspace, the next step is to configure a sync that invokes your function whenever rows are added, changed, or removed in your model.

Choose your function triggers

Hightouch monitors your data model for added, changed, and/or removed rows. In this step, you specify which of these events should invoke your function.

Declaring which events should invoke the Google Cloud Function in Hightouch

Suppose you want to use Google Cloud Functions to send a confirmation email whenever a customer places a new order or updates an existing order. Your model might look something like SELECT * FROM orders. The Google Cloud Functions should be triggered whenever rows are added—for example, when customers place new orders—or whenever rows change—when orders are updated. Therefore, you would want to enable the Rows added and Rows changed triggers. When invoking your function, Hightouch passes along metadata about why the function was invoked.

All function invocations are synchronous, meaning that Hightouch invokes your Google Cloud Function and waits for its response.

Configure initial sync behavior

In this step, you tell Hightouch how to handle rows present in your model results during the first sync run.

Certain workflows, such as hydrating a CRM for the first time, may require performing a backfill of all rows during the initial sync. For other use cases, such as sending confirmation emails, you might only want to invoke your function in response to future data changes.

Declaring how to handle initial sync behavior in Hightouch

Configure batch size

By default, Hightouch separately invokes your Google Cloud Function for each added, changed, or removed row.

Certain high-throughput use cases may require batching together multiple rows in the payload. Hightouch supports batches of up to 1000 rows per function invocation.

Declaring batch size in Hightouch

Batching isn't recommended unless absolutely necessary for performance reasons. Enabling this feature requires your Google Cloud Function to be either idempotent or significantly more fault tolerant in the event of a partial batch failure.

Configure rate and concurrency limits

In this step, you declare whether function invocations should be throttled.

To determine the ideal limits, you should consider whether your function interacts with any downstream services that have limits of their own.

Most modern web APIs enforce rate limits, which set a maximum allowed number of requests per second, minute, or hour. Occasionally, APIs may also have concurrency limits, which set a maximum allowed number of requests that can be processed simultaneously. Rate limits and concurrency limits both affect overall sync speed.

Configuring rate and concurrency limits in Hightouch

Configure error handling

Google Cloud Functions can fail for many reasons. Hightouch's retry logic for this destination doesn't discriminate between invocation errors, for example, missing permissions, runtime errors, for example, due to syntax issues, or function errors, for example, due to uncaught exceptions. Hightouch retries all errors eventually.

In this step, you decide whether errors should be retried immediately or during the next sync run. If you choose to retry immediately, you can specify how many retries should be attempted during the sync run. If all these retries fail, Hightouch retires the request during subsequent sync runs until it succeeds.

If an error is gracefully caught in your function, you can flag it as a rejected row and provide a custom error message using the response format specified in the error handling section.

Error handling declaration

You can also include retry logic inside of your Google Cloud Function. After exhausting all retry attempts during a function invocation, respond with an error so that Hightouch knows to reinvoke the function later.

Invocation payload schema

The payload, also known as the "event document," contains your row data in JSON format, along with additional metadata. When writing your Google Cloud Function, you can assume that all function invocations include a payload following this schema:

{
    "operation": XXXXX,
    "primary_key_column": XXXXX,
    "rows": [
        {
            XXXXXX
        }
    ],
    "metadata": {
        "api_version": XXXXX,
        "sync_id": XXXXX,
        "sync_run_id": XXXXX
    }
}

Invocation payload properties

operation

The row operation, also known as the trigger, explains why your function was invoked.

Its possible values are these strings: "add", "change", and "remove".

primary_key_column

This string refers to the column name of the primary key for your Hightouch model.

rows

This is an array of objects, with each object containing row data from your Hightouch model.

metadata

This is an object containing three key/value pairs:

  • api_version (integer): currently set to 1, this identifier only changes if the payload schema is modified in the future
  • sync_id (integer): this is the unique identifier for the Hightouch sync configuration associated with the function invocation
  • sync_run_id (integer): this is the unique identifier for the specific sync run associated with the function invocation

Example invocation payload

Suppose a new row is added to your model:

customer_idemail_addressfirst_namelast_name
928713john.doe@example.comJohnDoe

The payload for the function invocation would look like this:

{
    "operation": "add",
    "primary_key_column": "customer_id",
    "rows": [
        {
            "customer_id": "928713",
            "email": "john.doe@example.com",
            "first_name": "John",
            "last_name": "Doe"
        }
    ],
    "metadata": {
        "api_version": 1,
        "sync_id": XXXXX,
        "sync_run_id": XXXXX
    }
}

Note that rows is an array containing a single element.

Example invocation payload for batching

Suppose three new rows are added to your model:

customer_idemail_addressfirst_namelast_name
928713alice.doe@example.comAliceDoe
283743bob.doe@example.comBobDoe
162352carol.doe@example.comCarolDoe

The payload for the function invocation would look like this:

{
    "operation": "add",
    "primary_key_column": "customer_id",
    "rows": [
        {
            "customer_id": "928713",
            "email": "alice.doe@example.com",
            "first_name": "Alice",
            "last_name": "Doe"
        },
        {
            "customer_id": "283743",
            "email": "bob.doe@example.com",
            "first_name": "Bob",
            "last_name": "Doe"
        },
        {
            "customer_id": "162352",
            "email": "carol.doe@example.com",
            "first_name": "Carol",
            "last_name": "Doe"
        }
    ],
    "metadata": {
        "api_version": 1,
        "sync_id": XXXXX,
        "sync_run_id": XXXXX
    }
}

When batching is enabled, rows sharing a common operation type—add, change, remove—are batched together. Function invocations always represent exactly one operation type.

Response payload schema

If your Google Cloud Function fails to process any rows, it can respond with an array of "rejected rows" that encountered errors. Each rejected row should be identified by its primary key/value and may be associated with an optional error message. Hightouch retries these rows.

If there is an error, the response payload looks like this:

{
    "errors": [
        {
            "primary_key_value": XXXXX,
            "reason": XXXXX
        }
    ]
}

Response payload properties

Errors in the errors array have these properties:

primary_key_value

This identifies the rejected row that needs to be retried.

reason (optional)

This string may represent an error message or any other information that would be helpful for debugging and monitoring sync health.

Example success response payload

If all rows are processed successfully, your function responds with an empty payload.

{}

Example response payload for rejected rows

Suppose three new rows are added to your model:

customer_idemail_addressfirst_namelast_name
928713alice.doe@example.comAliceDoe
283743bob.doe@example.comBobDoe
162352carol.doe@example.comCarolDoe

While processing these rows, your Google Cloud Function encounters two errors:

  • Alice's email can't be found in the downstream service
  • Bob's email already exists in the downstream service

To surface these errors in Hightouch, your function should respond like this:

{
    "errors": [
        {
            "primary_key_value": "928713",
            "reason": "Email not found"
        },
        {
            "primary_key_value": "283743",
            "reason": "Email already exists"
        }
    ]
}

The primary key values (928713 and 283743) refer to customer_id values. The customer_id column was designated the primary_key_column in the function invocation payload.

Example response payload for other errors

If your function experiences an invocation error, for example, exceeding Google Cloud Functions rate limits, runtime error, for example, due to syntax issues, or function error, for example due to uncaught exceptions, all rows in the batch are automatically marked as rejected rows.

Tips and troubleshooting

Common errors

To date, our customers haven't experienced any errors while using this destination. If you run into any issues, . We are here to help.

Live debugger

Hightouch provides complete visibility into the API calls made during each of your sync runs. We recommend reading our article on debugging tips and tricks to learn more.

Sync alerts

Hightouch can alert you of sync issues via Slack, PagerDuty, SMS, or email. For details, please visit our article on alerting.

Ready to get started?

Jump right in or a book a demo. Your first destination is always free.

Book a demoSign upBook a demo

Need help?

Our team is relentlessly focused on your success. Don't hesitate to reach out!

Feature requests?

We'd love to hear your suggestions for integrations and other features.

On this page

OverviewGetting startedConnect to your Google Cloud FunctionsSyncing dataChoose your function triggersConfigure initial sync behaviorConfigure batch sizeConfigure rate and concurrency limitsConfigure error handlingInvocation payload schemaResponse payload schemaTips and troubleshootingCommon errorsLive debuggerSync alerts

Was this page helpful?