What is mParticle?
Learn all about mParticle's core products, capabilities, and features.
January 22, 2024
If you work in Martech, then chances are you’re either looking to purchase a Customer Data Platform (CDP) or migrate to a new solution. In either case, there’s a high likelihood that you’ve stumbled across mParticle in your evaluation.
This blog post will teach you everything you need to know about mParticle, including:
- What is mParticle?
- Core Products and Capabilities
- Data Collection
- Data Storage
- Data Modeling
- Audience Management
- Real-time Capabilities
- Reverse ETL
- Pros and Cons
What is mParticle?
mParticle is a traditional Customer Data Platform (CDP) that specializes in mobile event collection, so you can easily capture behavioral events and sync that data to hundreds of destinations without having to manage and maintain custom integrations and APIs.
As a CDP, mParticle offers all of the bells and whistles of many other solutions in the market, including data collection, audience management, personalization, and some other real-time and even AI-based capabilities. The company was founded in 2013 with the goal of helping companies collect and manage data more effectively to drive personalization across channels.
mParticle rose to popularity as one of the earliest CDPs to specialize in mobile apps. In fact, if you look at mParticle’s customer page, you’ll notice a common denominator: many of the use cases focus on in-app personalization. The platform was initially created as an alternative to Segment for enterprise companies who wanted more security, better support, and flexibility. If you’re curious, there’s actually a really interesting thread on Quora from 2016 between the CEO of mParticle and the CEO of Segment comparing and contrasting their solutions.
Core Products and Capabilities
While the mParticle platform is pretty versatile, you can break the company’s product offerings into five core categories.
- Data Connections is the backbone of mParticle’s CDP. This product lets you capture behavioral events from your mobile apps, websites, and servers and then push events directly to your marketing tools.
- Data Quality is the suite of tools that mParticle provides to help you manage what data you collect before you share events with your downstream systems. This feature set enables you to manage the schema structure of your data so you can build a cohesive data strategy.
- Data-Driven Personalization encompasses a variety of self-serve audience management features to help you build audience segments, create calculated attributes, and even orchestrate customer journeys across your tools.
- Indicative is an audience analytics tool that mParticle acquired in 2022. This platform lets you perform cohort analysis across different audiences and also measure other key metrics like conversions over time. You can also further segment your users by specific events they performed to better understand exactly where they are in the customer journey.
- Cortex is an AI prediction tool that mParticle acquired through a company called Vidora. This feature enables you to build prediction-based models for your audiences, like churn risk, next best action, or even purchase propensity.
Data Collection
mParticle provides out-of-the-box software development kits (SDKs) to capture both web and mobile data. You can deploy these code snippets directly on your website or in your app to capture the behavioral events that you define. The platform supports server-side SDKs for languages such as Node, Go, Python, Ruby, and Java, as well as client SDKs for platforms such as Android, iOS, and web.
The actual data flow of mParticle is managed by input and output connections. Input connections ingest data into mParticle; output connections push that data to your downstream destinations. From a tracking standpoint, the platform supports both server-side and client-side tracking. With server-side tracking, your data is sent from your web or client app to mParticle's servers, which then forward it to your destinations. Client-side tracking is the opposite: your data is forwarded directly from the point of origin to the output destination that you define. For example, mParticle can sync data to Amplitude from both the client side and the server side.
Within mParticle, you can collect two types of data: event data and user data.
- Event data refers to individual actions taken by a user in your app.
- User data refers to unique identifiers (e.g., userID, email, etc.) or attributes to store custom values about your users.
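To make the distinction concrete, here is a minimal Python sketch of the two data types. The class and field names (`UserData`, `Event`, `identities`, `properties`) are illustrative assumptions, not mParticle's actual payload format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class UserData:
    # Unique identifiers (e.g. user ID, email) that say who the user is.
    identities: dict[str, str]
    # Custom attribute values stored about the user.
    attributes: dict[str, object] = field(default_factory=dict)


@dataclass
class Event:
    # An individual action taken by a user in the app.
    name: str
    timestamp: datetime
    properties: dict[str, object] = field(default_factory=dict)


# Hypothetical example values.
user = UserData(
    identities={"user_id": "u-123", "email": "ada@example.com"},
    attributes={"plan": "premium"},
)
event = Event(
    name="add_to_cart",
    timestamp=datetime(2024, 1, 22, 12, 0, tzinfo=timezone.utc),
    properties={"sku": "SKU-1", "price": 19.99},
)
```

User data describes who someone is, while each event records one thing they did at a point in time.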
You can also view six categories of events directly within the platform, including the following:
- Custom Events
- Screen Views
- User Information
- Application Lifecycle
One of the downsides to this overall architecture is that you’re limited to behavioral data, which means you’re often missing the critical context that only lives in your cloud data warehouse. If you have custom data science models or enriched events for your customers living in your data warehouse, this problem only compounds. Other data sources are not natively available, and trying to ingest this data into mParticle can create engineering challenges. Many companies are choosing to implement Composable CDPs on top of their existing data infrastructure for this exact reason.
Data Storage
By default, all of your data is stored natively within mParticle. Because mParticle is a traditional CDP that operates on its own separate infrastructure outside of your technology stack, there are limits on how long you can store that data. This time duration is governed by your long-term data retention policy with mParticle.
For example, if your data retention for events is set to two years, you can view specific events that occurred up to two years ago. However, if the retention period for profiles is set to one week, any user who hasn’t interacted in the last seven days will be unavailable for your marketing team. mParticle offers unlimited lookback, but this premium feature is not available out of the box.
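The retention arithmetic in this example can be sketched in a few lines of Python. The constants and the `is_within_retention` helper are hypothetical and only mirror the two-year and one-week figures used here; they are not mParticle settings.

```python
from datetime import datetime, timedelta

# Hypothetical retention settings mirroring the example in the text.
EVENT_RETENTION = timedelta(days=2 * 365)
PROFILE_RETENTION = timedelta(days=7)


def is_within_retention(last_seen: datetime, now: datetime, retention: timedelta) -> bool:
    """Return True if `last_seen` is still inside the retention window."""
    return now - last_seen <= retention


now = datetime(2024, 1, 22)
event_time = datetime(2022, 6, 1)         # an event from roughly 20 months ago
last_interaction = datetime(2024, 1, 10)  # user last active 12 days ago

event_visible = is_within_retention(event_time, now, EVENT_RETENTION)
profile_available = is_within_retention(last_interaction, now, PROFILE_RETENTION)
```

Under these assumed settings the older event is still viewable, but the user who was last active 12 days ago falls outside the one-week profile window.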
With a Composable CDP, data lookback isn’t an issue because you have access to all of your data, not just clickstream data, and you don’t have to incur duplicative storage costs passed on by a CDP vendor.
Data Modeling
mParticle provides a number of capabilities for managing your data schema and performing identity resolution, but both of these features come with their own challenges.
For event collection data, mParticle has six industry-specific templates and one generic data plan you can use to define your data model. mParticle’s JSON schema defines the structure of all the data you send to the platform, and the plans let you select what user and event data you want to capture.
One important factor to remember is that mParticle’s default tracking plans are limited to 1,000 data points. Your data also has to conform to the schema structure that mParticle supports if you want to use it to build audiences and sync them to your downstream destinations. Any data that does not conform to mParticle’s schema structure cannot reliably be delivered to your destination.
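As a rough illustration of what schema conformance means in practice, the Python sketch below checks events against a tiny plan. The `PLAN` structure and the `violations` helper are hypothetical simplifications and do not reflect mParticle's actual data plan format or validation logic.

```python
# Hypothetical, simplified plan: event name -> required property names and types.
# This is NOT mParticle's actual data plan format.
PLAN = {
    "add_to_cart": {"sku": str, "price": float},
    "screen_view": {"screen_name": str},
}


def violations(event: dict) -> list[str]:
    """Return the reasons an event does not conform to PLAN (empty if it conforms)."""
    spec = PLAN.get(event.get("name"))
    if spec is None:
        return [f"unplanned event: {event.get('name')!r}"]
    problems = []
    props = event.get("properties", {})
    for key, expected in spec.items():
        if key not in props:
            problems.append(f"missing property: {key}")
        elif not isinstance(props[key], expected):
            problems.append(f"wrong type for {key}: expected {expected.__name__}")
    return problems


ok = violations({"name": "add_to_cart", "properties": {"sku": "SKU-1", "price": 19.99}})
bad = violations({"name": "add_to_cart", "properties": {"sku": 42}})
unknown = violations({"name": "checkout", "properties": {}})
```

An event with no violations conforms to the plan; anything else would need to be fixed at the source before it could be used downstream.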
Traditional CDPs like mParticle require your data to conform to either a user or account object hierarchy. Depending on the complexity of your business, this can be a problem, especially if you need to support other custom entities or related models. This challenge is another reason companies prefer to leverage their existing infrastructure. Implementing a Composable CDP on top of your data warehouse gives you access to all of your existing data models because you own the schema structure.
Underneath the data quality suite, mParticle offers two products for identity resolution: IDSync and ComposeID.
IDSync is mParticle’s core identity framework for building unified customer profiles. This feature is powered by an Identity API that pulls in all the known identifiers of a current user and then maps those identifiers to individual profiles. With IDSync, you can merge and deduplicate your known and anonymous user identities so you can deliver consistent experiences and identify users at specific points in their journey.
Currently, IDSync only supports deterministic matching to unify your profiles, which means you have no ability to leverage probabilistic matching. You also don’t have much flexibility to adapt this identity resolution algorithm to your specific needs because the API acts as a black box that offers little to no visibility. Another downside to IDSync is the fact that you don’t own the identity graph, which makes it very difficult to leverage in other use cases.
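For readers unfamiliar with the term, deterministic matching means records are merged only when they share an exact identifier value. The sketch below shows the general idea using a union-find structure in Python; it is a generic illustration under that assumption, not IDSync's actual algorithm, and the record fields are made up.

```python
from collections import defaultdict


def resolve_profiles(records: list[dict[str, str]]) -> list[set[int]]:
    """Group record indexes that share any exact identifier value."""
    parent = list(range(len(records)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(a: int, b: int) -> None:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Remember the first record that carried each (identifier type, value) pair.
    first_seen: dict[tuple[str, str], int] = {}
    for idx, record in enumerate(records):
        for id_type, value in record.items():
            key = (id_type, value)
            if key in first_seen:
                union(first_seen[key], idx)  # exact match: same profile
            else:
                first_seen[key] = idx

    groups: dict[int, set[int]] = defaultdict(set)
    for idx in range(len(records)):
        groups[find(idx)].add(idx)
    return sorted(groups.values(), key=min)


# Hypothetical records: an anonymous session, a login on the same device,
# a record linking the email to a user ID, and an unrelated device.
records = [
    {"device_id": "d-1"},
    {"device_id": "d-1", "email": "ada@example.com"},
    {"email": "ada@example.com", "user_id": "u-123"},
    {"device_id": "d-9"},
]
profiles = resolve_profiles(records)
```

The first three records collapse into one profile because they are chained together by exact matches on `device_id` and `email`; a probabilistic approach would also consider near matches, which this sketch deliberately does not.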
On the other hand, ComposeID lets you resolve unidentified user data and profiles directly within your data warehouse. However, to use this feature, you must enable Warehouse Sync (more on this in the Reverse ETL section below). While this feature is more flexible than IDSync, it’s very underbaked and was built solely to appease the pent-up demand around “composability” as more and more companies want to own their identity graph and manage their identity resolution algorithms directly in their data warehouse.
Audience Management
Once your event data is available within mParticle, the platform’s audience management tools let you create segments of users or customers based on criteria that you define and then sync those audiences to your operational tools. You can aggregate behavioral data and store it as calculated attributes like lifetime value, last page viewed, average order value, last purchase, etc. mParticle also supports journeys, so you can deliver personalized experiences across channels and route customers down different paths based on their behavioral tendencies. When building these workflows, you can also split journeys based on milestones or actions that you define. For example, you may want to deliver a different experience to a shopping cart abandoner compared to a recent purchaser.
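Calculated attributes are essentially aggregations over a user's events. A minimal Python sketch, assuming a hypothetical list of purchase events with `timestamp` and `amount` fields, might look like this:

```python
from datetime import datetime

# Hypothetical purchase events for a single user.
purchases = [
    {"timestamp": datetime(2023, 11, 2), "amount": 40.0},
    {"timestamp": datetime(2023, 12, 15), "amount": 25.0},
    {"timestamp": datetime(2024, 1, 9), "amount": 55.0},
]


def calculated_attributes(events: list[dict]) -> dict:
    """Aggregate purchase events into per-user attributes."""
    if not events:
        return {"lifetime_value": 0.0, "average_order_value": None, "last_purchase": None}
    total = sum(e["amount"] for e in events)
    return {
        "lifetime_value": total,
        "average_order_value": total / len(events),
        "last_purchase": max(e["timestamp"] for e in events),
    }


attrs = calculated_attributes(purchases)
```

The attribute names here are illustrative; the point is that each one is derived from raw behavioral events rather than stored directly.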
For audience building, mParticle allows you to build two types of audiences: Real-time Audiences and Standard Audiences.
- Real-time Audiences are available to all mParticle accounts. This feature calculates audiences using your last 30-90 days of data, and they’re updated on an ongoing basis. However, these audiences are very limited because you don’t have access to the full range of data that lives in your data warehouse.
- Standard Audiences are not available by default. This is a premium feature that lets you build and define audiences using long-term historical data. With standard audiences, you can build segments using any data stored in mParticle. However, this feature limits you to a set number of calculations because the size is much larger. It can also take substantial time for audience calculations to be completed with mParticle, which is another reason why companies prefer Composable CDPs: they can leverage their own computing resources.
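The practical difference between the two audience types comes down to how far back the calculation looks. The sketch below is a simplified Python illustration, not mParticle's implementation; `build_audience`, the event fields, and the lookback values are all assumptions.

```python
from datetime import datetime, timedelta


def build_audience(events: list[dict], event_name: str, now: datetime, lookback_days: int) -> set[str]:
    """Return the user IDs who performed `event_name` within the lookback window."""
    cutoff = now - timedelta(days=lookback_days)
    return {
        e["user_id"]
        for e in events
        if e["name"] == event_name and cutoff <= e["timestamp"] <= now
    }


# Hypothetical events.
events = [
    {"user_id": "u-1", "name": "purchase", "timestamp": datetime(2024, 1, 15)},
    {"user_id": "u-2", "name": "purchase", "timestamp": datetime(2023, 9, 1)},
    {"user_id": "u-3", "name": "screen_view", "timestamp": datetime(2024, 1, 20)},
]
now = datetime(2024, 1, 22)

# A short window, similar in spirit to a real-time audience.
recent_buyers = build_audience(events, "purchase", now, lookback_days=30)
# A long window, similar in spirit to an audience built on historical data.
all_time_buyers = build_audience(events, "purchase", now, lookback_days=3650)
```

With the short window, the customer who purchased in September drops out of the segment; the longer window keeps them, at the cost of scanning far more data.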
Real-time Capabilities
Real-time is an interesting topic because many companies have a different definition of what real-time actually means. mParticle offers two main solutions: Event Forwarding and the Profile API.
- Event Forwarding allows you to sync events directly to your output destinations as they’re generated on the client side. This feature is useful if you need to leverage your events as soon as possible. However, one of the downsides to this approach is that the data is raw, and you don’t have the ability to enrich it or transform it before it arrives at your destination. With a Composable CDP, you can join and transform data directly in your warehouse before ingesting it into your downstream tools.
- Profile API is a low-latency API that lets you query mParticle so you can pull in rich customer profiles, attributes, and other data available within the platform via an HTTP request. This feature enables you to power one-to-one personalization on an ad-hoc basis as you need the data. It supports unlimited queries per second with an average response time of 20 milliseconds. Keep in mind this feature is standard across all CDP and Composable CDP vendors, so it’s not exactly a unique selling point.
The real-time audiences within mParticle are “real-time” solely in the fact that they’re re-calculated as you generate new events, but syncing those audiences does not take place in real-time. Another important factor to note is that mParticle currently does not support the ability to stream from tables in your warehouse via Streaming Reverse ETL.
Reverse ETL
If you want to sync data from your warehouse directly to destinations, mParticle offers a feature called Warehouse Sync. This product enables you to leverage your own data infrastructure as a source and route data directly from your data warehouse or data lake to your chosen destination. However, this feature is very immature compared to other Reverse ETL tools, and it lacks a lot of critical features around observability and version control.
Additionally, given how new this product is to the mParticle arsenal, it’s very unclear how Warehouse Sync integrates with all the other features the platform provides. Given that mParticle was originally designed to operate as a packaged CDP, it’s probably safe to assume that there are some compatibility issues pairing Reverse ETL with other features like audiences or analytics.
Since mParticle is a traditional CDP, your data will always be stored in the platform unless you rely solely on the Warehouse Sync feature. The good news is that this data is encrypted both in transit and at rest. The platform also offers a number of features to manage user access and permissions via role-based access control (RBAC). Features like single sign-on (SSO) and multi-factor authentication (MFA) are also available within the platform. While the platform can be GDPR, CCPA, and HIPAA compliant, you must do some heavy engineering work in the actual implementation to be compliant.
Pros and Cons
If you’re looking for a traditional and packaged CDP, mParticle can be a great option. However, if you want to own your data structure, unlock greater flexibility, and leverage all of the customer data in your warehouse, only a Composable CDP can do that for you. With that in mind, here is a short list of pros and cons for mParticle:
Pros
- Simple & easy to use
- Good with mobile apps
- Supports event forwarding
- Many SDK options
Cons
- Data is stored in mParticle outside of your infrastructure
- Identity graph is owned by mParticle
- Data must conform to mParticle’s schema structure
If you have no data footprint, a traditional CDP like mParticle is a great way to start powering personalization. However, if you already have customer data living in your warehouse, adopting mParticle will simply introduce duplicative storage costs and added complexity as your data team will be forced to manage two sources of truth.
The Composable CDP is the fastest and easiest way to start powering your marketing use cases. If you’re interested in learning more, book a demo with one of our solution engineers today to learn how Hightouch can help!