
Metadata matters: Building data trust with Databricks Unity Catalog

Business teams need to understand their data to use it effectively. Hightouch’s integration with Databricks Unity Catalog bridges the data-business divide.

Scott Bailey / Dec 3, 2024 / 5 minutes


In a high-functioning company, data teams curate data assets that are used by disparate teams across marketing, sales, operations, and more. How can they ensure that this data gets used correctly and that the rest of the organization can trust it?

One solution is quite simple: metadata. More specifically, up-to-date metadata, curated by experts, and made available in the environments where business teams work. Metadata like this can create bridges of trust between teams and across purposes, as every team works from the shared foundation of their data team's expertise.

Hightouch's new integration with Databricks Unity Catalog builds this shared knowledge. Data teams annotate data with rich descriptions through Catalog Explorer, and Hightouch makes those descriptions available to marketers in every Customer Studio interface where they act on data. Each team works where they're most comfortable, while benefiting from metadata that gives them clarity and confidence.

"Unity Catalog is a critical feature that bridges the gap between data teams that curate and govern the data and business teams that use this data via Hightouch. This is yet another example of how Hightouch and Databricks together combine best-in-class governance with unparalleled ease of use so businesses can unlock more value from their data."

Dan Morris, Global Head of Marketing Solutions at Databricks

Why good metadata matters

Every team needs context to use data correctly, even though that context varies with their goals. Let's see how marketing, sales, and operations teams might use lifetime value (LTV) data.

  • Marketing needs to understand whether ltv_score_v2 represents a prediction or historical data to build accurate segments.
  • Sales uses the same score to prioritize accounts, but they need to know how current the data is and what factors influence it.
  • Operations needs to understand the score's thresholds and update frequency to automate processes like account tiering.

Clear column descriptions serve as a bridge between the data team’s domain expertise and the business teams’ needs. A well-documented ltv_score_v2 column might explain: "Predicted 12-month customer value using our latest model (updated daily). Includes both subscription and add-on revenue. Scores above 100 indicate enterprise-tier accounts." This single description helps marketing target high-value segments, enables sales to prioritize promising accounts, and allows operations to automate account management—all while working from the same trusted definition.

High-quality metadata doesn't just help human teams; it also provides essential semantic context for AI systems and intelligent agents working with your data. For example, applications or agents that use retrieval-augmented generation (RAG) can use metadata to decide which data to include in prompts, or include the metadata itself in prompts so the model can account for data semantics. When AI solutions understand your data, they can make more informed decisions and generate more value for users across every department.
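As a rough illustration of the RAG idea, the sketch below selects the columns whose descriptions overlap with a user's question and places those descriptions in the prompt. Everything here is an assumption made for illustration: the `ColumnMeta` structure, the `select_relevant` and `build_prompt` helpers, and the keyword-overlap ranking (a real system would more likely use embeddings) are hypothetical and are not part of Hightouch or Databricks.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMeta:
    """Hypothetical pairing of a column name with its curated description."""
    name: str
    description: str


def _tokens(text):
    """Lowercase the text and split it into alphanumeric/underscore tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def select_relevant(columns, question, limit=3):
    """Keep the columns whose name or description shares words with the question.

    Naive keyword overlap stands in for a real retrieval step.
    """
    question_tokens = _tokens(question)

    def score(col):
        return len(question_tokens & _tokens(f"{col.name} {col.description}"))

    ranked = sorted(columns, key=score, reverse=True)
    return [col for col in ranked if score(col) > 0][:limit]


def build_prompt(columns, question):
    """Embed the selected column descriptions in the prompt as semantic context."""
    context = "\n".join(f"- {col.name}: {col.description}" for col in columns)
    return f"Column definitions:\n{context}\n\nQuestion: {question}"
```

Calling `build_prompt(select_relevant(columns, question), question)` yields a prompt that carries the data team's definitions alongside the question, so the model does not have to guess what a name like ltv_score_v2 means.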

Why Unity Catalog: Building a foundation of trusted data

Databricks Unity Catalog provides a unified governance layer that centralizes how organizations manage, control, and monitor their data and AI assets. It serves as a single source of truth and enables teams to explore documented data assets confidently, understand data lineage, and access resources through a self-service model.

Unity Catalog’s unified and warehouse/lakehouse-centric approach allows teams to work with trusted data in their own ways. Marketing teams benefit from well-documented customer attributes, developers trace data lineage to ensure proper transformations, and all teams leverage centralized data knowing that access controls protect sensitive information. This unified approach eliminates the risks of outdated or ungoverned data while simplifying the experience for business users.

How data teams set up metadata in Unity Catalog

Data teams are the initial trustees of data quality, responsible for creating reliable, documented data assets that serve multiple downstream needs. Through Unity Catalog's column descriptions (called comments), they can embed their expertise directly into the data infrastructure.
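Comments can also be set with SQL rather than through the UI. The sketch below builds an `ALTER TABLE ... ALTER COLUMN ... COMMENT` statement for a column description. The helper names and the example table `main.analytics.customers` are hypothetical, and the backslash escaping assumes Databricks SQL's default string-literal handling; treat this as a starting point rather than a definitive implementation.

```python
def quote_identifier(name):
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + name.replace("`", "``") + "`"


def column_comment_sql(table, column, description):
    """Build a statement that sets a column comment.

    `table` is expected as a three-level name: catalog.schema.table.
    """
    parts = table.split(".")
    if len(parts) != 3:
        raise ValueError("expected a name of the form catalog.schema.table")
    qualified = ".".join(quote_identifier(part) for part in parts)
    # Escape backslashes first, then single quotes (assumes default escaping rules).
    escaped = description.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"ALTER TABLE {qualified} "
        f"ALTER COLUMN {quote_identifier(column)} COMMENT '{escaped}'"
    )


# Hypothetical table name; the description text comes from the example above.
statement = column_comment_sql(
    "main.analytics.customers",
    "ltv_score_v2",
    "Predicted 12-month customer value using our latest model (updated daily).",
)
```

The resulting string can be run through whichever SQL client the data team already uses. Generating statements this way makes it easier to keep descriptions in version control next to the models that produce the columns.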

Figure: Unity Catalog AI comments. Databricks Catalog Explorer enables table- and column-level descriptions.

Data documentation can be time-consuming, though, and hard to scale as an organization's data velocity and volume grow. Databricks released AI-generated documentation in Unity Catalog to augment a data team's capacity and allow a broader range of data to be described. This broader coverage reduces data ambiguity and helps downstream teams use data more effectively.
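One way to make "coverage" concrete is to measure the share of columns that have a non-empty description. The sketch below assumes the column comments have already been fetched into a plain dictionary; the function names and the input shape are hypothetical, chosen only for illustration.

```python
def documentation_coverage(column_comments):
    """Return the fraction of columns whose comment is non-empty.

    `column_comments` maps column name -> comment text, or None when undocumented.
    """
    if not column_comments:
        return 0.0
    documented = sum(
        1 for comment in column_comments.values() if comment and comment.strip()
    )
    return documented / len(column_comments)


def undocumented_columns(column_comments):
    """List the columns still missing a description, sorted by name."""
    return sorted(
        name
        for name, comment in column_comments.items()
        if not (comment and comment.strip())
    )
```

Tracking this number over time shows whether documentation is keeping pace with new columns, and the list of undocumented columns gives the data team a queue to work through, with or without AI-generated suggestions.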

How Hightouch's Unity Catalog integration bridges knowledge between teams

Hightouch's integration with Unity Catalog preserves and extends this chain of trust. When marketers use Customer Studio, they see the same column descriptions their data team created in Unity Catalog. This creates a seamless bridge between data team expertise and marketing team needs.

Figure: Metadata descriptions in Customer Studio. Column descriptions are visible during audience creation, enabling marketers to use any column confidently.

A marketer who encounters a column name like ltv_score_v2 while creating an audience doesn't have to guess whether it represents predicted or historical data, or whether it's updated every day or every week. The column description from Unity Catalog appears right in the Customer Studio interface, providing the information the marketer needs to use the data confidently.

The virtuous circle: Building a culture of trusted data

This integration creates a virtuous circle of trust and documentation. As business teams successfully use well-documented data, they provide feedback that helps data teams improve their documentation further. Marketing teams might request clarification on specific metrics, leading to more precise descriptions that benefit all users.

Looking ahead, this foundation of shared understanding enables new forms of trust and usage. Teams can build more sophisticated audiences, automate more complex processes, and make more confident decisions—all because they deeply understand the data they're working with.

Through the combination of Unity Catalog's governance capabilities and Hightouch's business-friendly interfaces, organizations can build a culture where data is not just trusted but truly understood and activated across all teams.

