A Guide for Evaluating Data Activation Platforms
A simple framework for evaluating and testing Data Activation platforms to ensure a successful implementation.
Leonie van der Sleen
May 22, 2023
According to Forrester, as much as 73% of all data within enterprise organizations is never used. The reality is that most of this data just stagnates, unused in a data warehouse or in an analytics dashboard, without ever being leveraged to its full potential.
Inevitably, this problem has led many companies to evaluate Data Activation platforms to push valuable data out of their warehouse into operational tools so their business teams can turn insights into outcomes. In this blog post, I’ll show you exactly how you should go about evaluating a new Data Activation platform and the core factors you need to consider before purchasing a tool.
Why Are You Evaluating Data Activation Platforms?
As a sales engineer at Hightouch, I spend countless hours every day talking to customers, trying to understand their current challenges and accelerate how quickly they can get value out of their data.
Inevitably, every company I talk to has the same problem: they know they want to take action on their data, but they don’t know how. More specifically, every organization has data readily available in their data warehouse, but it’s very difficult to move these insights to business tools across marketing, support, sales, finance, ops, etc.
In my experience, companies adopt Data Activation platforms because they want to align the data team with the center of the business and link their work to tangible business outcomes. Generally, there are three reasons to adopt a Data Activation platform:
- You have valuable customer data in your warehouse that you want to take action on.
- You need a scalable solution to reliably and automatically move data out of your warehouse and sync it to your operational tools.
- You want to enable your business and marketing teams to have direct access to your data warehouse so they can self-serve and build personalized experiences for your customers.
Prior to beginning a POC with any vendor, you need to answer three questions:
1. Where are you coming from? What does your current data stack look like?
2. How will you be using the platform? What use cases are you trying to address?
3. Who will be using the platform? What teams will the platform be serving?
In response, I’ve created a 3-phase evaluation framework you can use to plan your evaluation.
What Should an Evaluation Look Like?
Before you even start evaluating Data Activation platforms, you need to outline some clear parameters to define your timeline, configure your tool, and identify potential use cases.
Defining Your Timeline
Many companies think they need to have long drawn-out trials of the product (also called “proof of concepts” or POCs) to prove value. The reality is Data Activation platforms are simple to use and very fast to set up for one use case at a time, because they simply leverage your existing data infrastructure.
Ultimately, this means you can usually test everything you need within 1-2 weeks. The advantage of data activation is that you can start exactly where you are with your current data. Sure, your data may not be perfect. You may not have a pristine and comprehensive customer 360 established within your warehouse. There’s a common misconception that this must be completed before pursuing any activation use case; in reality, it’s quite the opposite. If you have any insight or useful dashboard within your warehouse, you are ready to activate data and drive outcomes. The key question you should ask is “what is your business problem?”
Outlining Use Cases
It’s impossible to test all of your use cases within a POC, and trying to do so is quite unnecessary. POCs are all about proving technical capability and potential value with as few resources as possible. This is the best way to get buy-in from upper management to tackle your most complex use cases.
When identifying what use case you should tackle in your POC, it’s best to identify one with minimal disruption but high impact. For example, imagine your data team is manually pulling product usage data for your sales team or building target audiences for your marketing team every week. Using a Data Activation platform to automatically sync this data from your warehouse to your CRM on a daily basis will generate huge cost savings across all teams, enabling everyone to focus on more revenue-generating work.
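That manual workflow is essentially a scheduled query plus an API upsert. The sketch below shows the core translation step in Python: mapping warehouse columns to CRM fields before sending them downstream. The field mapping, column names, and record shapes are illustrative assumptions, not any real platform's API.

```python
# Minimal reverse ETL sketch: translate warehouse rows into CRM upsert
# payloads. The warehouse columns and CRM field names are hypothetical.

FIELD_MAPPING = {  # warehouse column -> CRM field
    "account_id": "External_Id__c",
    "weekly_active_users": "Usage_WAU__c",
    "last_login_at": "Last_Login__c",
}

def to_crm_payload(row: dict) -> dict:
    """Translate one warehouse row into a CRM upsert payload."""
    return {crm_field: row[col]
            for col, crm_field in FIELD_MAPPING.items() if col in row}

def build_batch(rows: list) -> list:
    """Build the payloads a daily scheduled sync would send to the CRM API."""
    return [to_crm_payload(r) for r in rows]

rows = [{"account_id": "a-1", "weekly_active_users": 42,
         "last_login_at": "2023-05-20"}]
print(build_batch(rows))
```

In a real deployment the query, schedule, and API calls are all managed by the platform; the point of the sketch is that the column-to-field mapping is the part you configure once instead of re-pulling by hand every week.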
Even the best Data Activation platform can’t solve all of your use cases overnight. When prioritizing use cases, you should identify low-hanging fruit that provides immediate and quantifiable value with minimal effort, using the following questions:
- What’s the number one blocker for your data team?
- What’s the number one blocker for your business teams?
- What could Team X do if they had the right data in hand?
- What stakeholders will you be serving?
- How many destinations will you be supporting?
- Does the data needed to power your downstream use cases exist?
Creating Testing Criteria
Since all Data Activation platforms address the same problem (sending data from a source to a destination), it’s very hard to do an “apples-to-apples” comparison, because every platform looks and acts very similar at a high level.
I’ve had countless conversations with companies in every industry, and each one goes something like this:
“We set up a free trial with platforms A, B, and C, and they support our sources and destinations. We just need to move data from our warehouse and sync it to Salesforce, so we’re just going to go with the cheapest solution.”
This way of thinking is fundamentally flawed because Data Activation is so much more than simply syncing data and mapping columns to fields. It’s only after the actual implementation that companies begin to identify all of the valuable outcomes a mature Data Activation platform can provide. This is one of the core reasons that many companies choose Hightouch.
When testing a Data Activation platform, it’s important to identify your success criteria, and rather than solely basing it on a platform’s ability to sync data, your success should be linked to how efficiently the platform can dynamically serve your teams.
- How can you enable your non-technical users to self-serve?
- How will you address logging, sync failures, and alerting?
- What is your plan for version control?
- Are extensibility and custom connectors important to you?
Extensibility and Flexibility
On the surface, Data Activation platforms are very similar in terms of what integrations they support when it comes to sources and destinations, so your testing criteria should be thorough and apply directly to your use cases. When activating data at scale, the devil is often in the details, so we highly recommend having a framework to assess the flexibility and power of a given integration.
Since Data Activation platforms are designed to sync your warehouse data to downstream destinations, they’re relatively technical in nature, and there are a number of quality-of-life features that data teams realize they need only after implementing and using the platform.
- Debugging: Inevitably, your syncs will fail, and when they do, you need to quickly and easily be able to identify when, where, and why something went wrong. Ensuring your provider has a live debugger built-in to showcase API responses and requests while also providing error messages will eliminate this problem.
- Audit Logs: As more and more users begin to use the platform, it’s important to have complete visibility into user behavior so you can monitor changes as they’re made in the platform, whether that’s at the sync or the model level. Make sure your provider offers audit logs so you can capture this information and monitor it in real time.
- Sync Logs: Your sync data is just as important as your customer data, so you should treat it as such. You should have the ability to store historical metadata on your syncs directly in your warehouse so you can run analytics on it.
- Users: When running a POC, it’s important to consider who will be using the platform. Depending on your scale, you’ll want to ensure your provider facilitates RBAC (role-based access controls) and LBAC (label-based access controls) so you can easily automate and provision user access.
- Version Control: Version control is a must for any data team. When testing a platform, ensure that it offers bi-directional support for syncs and models via Git so you can approve/monitor changes and also roll back your models and syncs to a previous state when something goes wrong.
- Tooling Support: Many Data Activation platforms don’t actually support or integrate with other modern data stack tools, so when you test a platform, you should also see if it integrates with other tools.
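The “Sync Logs” point above is worth prototyping during a POC: if run metadata lands back in your warehouse, health analysis becomes a trivial query. Here is a minimal Python sketch against a hypothetical runs table (the table and column names are assumptions, not a real schema):

```python
# Hypothetical sync-run metadata, as it might be written back to a
# warehouse table by a Data Activation platform (columns are assumed).
sync_runs = [
    {"sync_id": "users-to-sfdc",  "status": "success", "rows_synced": 1200},
    {"sync_id": "users-to-sfdc",  "status": "failed",  "rows_synced": 0},
    {"sync_id": "events-to-braze", "status": "success", "rows_synced": 5400},
]

def failure_rate(runs: list) -> dict:
    """Per-sync failure rate: the kind of health metric you'd alert on."""
    totals, failures = {}, {}
    for run in runs:
        totals[run["sync_id"]] = totals.get(run["sync_id"], 0) + 1
        if run["status"] == "failed":
            failures[run["sync_id"]] = failures.get(run["sync_id"], 0) + 1
    return {sid: failures.get(sid, 0) / n for sid, n in totals.items()}

print(failure_rate(sync_runs))
```

In practice you would run this as SQL over the warehouse table the platform writes to; the point is that storing sync metadata alongside your customer data makes reliability measurable rather than anecdotal.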
At some point, you’ll reach a certain maturity level where it becomes impossible for your data teams to efficiently scale and build syncs for all your marketing needs. Down the line, you’re going to need a platform that offers a no-code suite of tools so your marketing teams can self-serve using the models your data team has built.
With that in mind, one of the key features you should test for when implementing a new tool is the platform’s ability to provide easy-to-use tooling for your non-technical users who don’t understand SQL.
More specifically, you need to evaluate several factors:
- Audiences: Is there a no-code audience builder for your non-technical users so they can visually define audiences as they need them?
- Traits: How easy is it to create and save new computed fields like last page viewed, abandoned cart, likelihood to purchase, etc.?
- Splits: Can you centrally manage and orchestrate multivariate testing across marketing channels by creating multiple splits in audiences?
- Performance: Does the platform provide insights into how your audiences are performing across marketing channels?
- Overlaps: Can you analyze user characteristics to understand who you’re targeting across audiences?
- Match Rates: Can you enrich your first-party data with additional third-party data from leading providers to improve your match rates?
All too often, I see companies make the mistake of including too many aspects that should be left out of a POC’s scope. These considerations are, however, relevant if you’re actually undergoing an implementation.
Architecturally, all Data Activation platforms use Reverse ETL to read from your warehouse and write the results of those queries to an end destination via a third-party API. Unfortunately, many companies get bogged down by the details of this process, thinking that one platform will be faster than another.
When it comes to syncing speeds, there are only two levers that impact performance: your warehouse configuration and the API limitations of the SaaS tool your provider is writing to, and both of these factors are largely out of a Data Activation vendor’s control.
Data Activation platforms rely on your existing warehouse compute configuration, so sync speeds are directly proportional to the resources you’ve allocated to run your workloads. For example, if you’re trying to sync millions of records from Snowflake to Salesforce, but you’re only using an X-Small warehouse to query those records, your sync will be substantially slower than if you used an X-Large warehouse. Sync speeds are only ever an issue if you’re running massive queries and your warehouse is under-provisioned.
Additionally, since Data Activation is focused on syncing data to downstream SaaS applications via third-party APIs, the platforms are designed to prevent excessive API requests, only sending the most necessary data to your destination. This process is referred to as diffing or change data capture (CDC).
A special diffing file is used to verify which records have already been sent so you don’t have to re-sync or overwrite fields unnecessarily. This greatly improves sync speeds because instead of sending all 1M records each time, you might only need to send the small fraction that actually changed on each scheduled sync.
Data Activation platforms refer to this file to identify what has changed in the source data since the last run. By default, most Data Activation platforms store the diff file in their own managed infrastructure. However, you can further optimize for faster throughput and speed by storing this file in your own cloud storage.
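The diffing process described above can be sketched in a few lines: fingerprint each row of the current query result, compare it against the state stored from the last run, and send only what changed. This is an illustrative simplification, not any vendor’s actual implementation:

```python
import hashlib
import json

def row_hash(row: dict) -> str:
    """Stable fingerprint of a row's synced fields."""
    return hashlib.sha256(json.dumps(row, sort_keys=True).encode()).hexdigest()

def diff(previous_state: dict, current_rows: list, key: str = "id"):
    """Compare the current query result against the stored 'diff file'.

    previous_state maps primary key -> row hash from the last run.
    Returns (rows_to_send, new_state) so only new or changed rows
    ever reach the destination's API."""
    new_state, rows_to_send = {}, []
    for row in current_rows:
        h = row_hash(row)
        new_state[row[key]] = h
        if previous_state.get(row[key]) != h:  # new or changed since last sync
            rows_to_send.append(row)
    return rows_to_send, new_state

# First run sends everything; the second run sends only the changed record.
rows = [{"id": 1, "plan": "free"}, {"id": 2, "plan": "pro"}]
to_send, state = diff({}, rows)
rows[0]["plan"] = "pro"
to_send_2, _ = diff(state, rows)
print(len(to_send), len(to_send_2))  # 2 1
```

Storing `new_state` (the diff file) in your own cloud bucket, as the text suggests, simply moves where this lookup table lives; the comparison logic stays the same.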
Ultimately, you shouldn’t see much difference in performance when it comes to activation. If you do, it either means your database isn’t optimized or your vendor is using an outdated API to sync data.
There are many interesting use cases that come up in a POC because Data Activation directly impacts downstream use cases across business teams. However, it’s important to remember that Data Activation platforms enable use cases; they don’t entirely control the results of a marketing campaign. We recommend that an evaluation of a Data Activation platform validate the technology itself, not the additional ~5 steps that may occur downstream for a given use case. For example, Hightouch should be evaluated on the efficient and accurate movement of data to Iterable, but not on the actual outcomes of the email campaign that data is powering.
Pricing is always one of those things that’s difficult to talk about because companies want to pay as little as possible while also maximizing their return on investment. When looking at pricing, there are only two factors that matter: transparency and scalability. Many Data Activation platforms charge based on destination fields or customer records.
It can be difficult to estimate and scale costs with either of these models, and in many cases it’s not clear what you’re actually paying for: it’s impossible to know in advance how many fields you’ll need to sync or how many records you’ll need to store (especially since you’re already paying to store that data in your warehouse). Integration-based pricing is a much clearer model that gives you full transparency and control over how you manage your costs, ensuring you’re only paying for the destinations you’re using. Ultimately, pricing should be future-proof, scalable, and transparent.
Support is an often under-discussed element of an evaluation. Throughout a guided POC, you usually have a dedicated Solutions Engineer, such as myself, who can help you get up to speed. However, many companies fail to consider continued support after the POC is finished. The support you get from your provider is often a direct reflection of the product. For example, here at Hightouch, we offer local support, which means you have access to relevant support within your time zone.
Security is non-negotiable. Most Data Activation platforms are very similar in this area, meeting the standard requirements around GDPR, HIPAA, and CCPA. All Data Activation platforms work in a very similar manner, storing both diff files and sync logs when you sign up for a self-service plan, so if you want to optimize fully for security, you’ll likely need a business tier. This will allow you to perform Change Data Capture within your own VPC so you can write back all sync logs to your own cloud bucket. A POC should focus on identifying whether or not you can facilitate your downstream use cases (e.g., is data flowing to your destinations reliably and consistently?).
Data Activation is the underpinning technology that powers all of your customer-facing use cases. While no provider is going to magically make all of your problems disappear, your decision should be based on a platform’s ability to address your immediate use cases and scale as your needs change and your organization grows.
This is the core reason I joined Hightouch. Previously, I worked as a Sales Engineer at Snowflake, and I witnessed countless companies make the same mistakes over and over again. While there are a variety of Data Activation companies, Hightouch is uniquely built for both data and marketing teams which means it’s incredibly flexible and scalable. If you’re interested in learning more, you can schedule a demo or create a workspace to get started on your own.