A Closer Look at Cross-Device Data: Deterministic vs Probabilistic

As online users bounce from desktop to mobile, cross-device tracking is now, more than ever, a crucial component of any digital marketing strategy. But how can you accurately pinpoint where your audience is? That is where deterministic and probabilistic data come in.

In previous posts, we explained the difference between first-, second-, and third-party data and how each can be used to optimize your marketing campaigns. But just as there are multiple ways to collect data, there are multiple ways to track data across devices.

So today, we are going to dive deeper into two subsets of data: deterministic and probabilistic. These are often used as the foundation for cross-device tracking, and it’s important to know that each comes with its own set of pros and cons.

Read on to learn what deterministic and probabilistic data are and which type is best for your campaign goals.

Deterministic Data  


Deterministic data, also known as declared data, is personal information that consumers share by filling out a form, taking an online survey, or taking another purposeful action. It’s considered high-quality data because it is verified and comes directly from consumers. This data can be used as the base for content personalization and product recommendations, a tactic most e-commerce sites use. It also plays a key role in cross-device targeting.

For example, if you log into Facebook on desktop and later log into your account on a mobile device, marketers can use your login data to link your devices. This means marketers can deliver personalized ads to you no matter what device you’re on. And because this information is factual, marketers can be sure they are hitting the right audience.
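The login example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular platform does it: the function name `link_devices_by_login`, the event format, and the user and device IDs are all made up for this example.

```python
from collections import defaultdict


def link_devices_by_login(login_events):
    """Group device IDs under the account that logged in on them.

    login_events is an iterable of (user_id, device_id) pairs.
    Both the format and the IDs are hypothetical.
    """
    devices_by_user = defaultdict(set)
    for user_id, device_id in login_events:
        devices_by_user[user_id].add(device_id)
    return dict(devices_by_user)


# Hypothetical login events: the same account appears on two devices.
login_events = [
    ("user_123", "desktop_A"),
    ("user_123", "phone_B"),
    ("user_456", "tablet_C"),
]

linked = link_devices_by_login(login_events)
print(linked["user_123"])  # both devices that user_123 logged in on
```

Because the link comes from a declared login rather than a guess, there is no score or threshold involved: a device either shares an account with another device or it does not.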

The downside is that deterministic data can be tough to scale. And the two most scalable deterministic data sets are trapped within Facebook and Google’s walled gardens, where they are not available for external use.

Probabilistic Data


Probabilistic data, also known as inferred or modeled data, makes assumptions about users based on their online activities and behavior, which means device matches are probable rather than confirmed. Marketers can assign individuals to specific categories depending on what they searched, read, watched, or bought. This method is far more complex because it requires algorithmic calculations. However, advertisers can build larger, broader campaigns with highly probable device matches.

For example, a marketer serves an ad for a family vacation package to a desktop connected to the WiFi network at a certain residential address. If a mobile device later connects to that same WiFi network, it becomes highly probable that the device belongs to the same household. The same ad then gets served to that device to reach another household member who may influence the buying decision.
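To make the household example concrete, here is a minimal Python sketch of a probabilistic match. It assumes a deliberately simple scoring rule, the overlap between the sets of networks two devices have connected to, and an arbitrary cutoff. Real vendors use far richer signals and models; the names `match_score` and `THRESHOLD` and the network labels are hypothetical.

```python
def match_score(networks_a, networks_b):
    """Share of networks two devices have in common (Jaccard similarity).

    Returns a value from 0.0 (no overlap) to 1.0 (identical networks).
    """
    a, b = set(networks_a), set(networks_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical observations of which networks each device connected to.
desktop = ["home_wifi"]
phone = ["home_wifi", "office_wifi"]
visitor = ["cafe_wifi", "airport_wifi"]

THRESHOLD = 0.5  # assumed cutoff for calling two devices a probable match

phone_score = match_score(desktop, phone)      # 1 shared of 2 total networks
visitor_score = match_score(desktop, visitor)  # nothing shared

print(phone_score >= THRESHOLD)    # phone is treated as a probable match
print(visitor_score >= THRESHOLD)  # visitor device is not
```

The output is a score, not a fact. Moving the cutoff up trades scale for accuracy, and moving it down does the opposite.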

The downside to using probabilistic data is that there is a greater likelihood of missing intended audiences. For instance, if a husband buys a pair of earrings online for his wife’s birthday, probabilistic data will show he’s probably interested in other styles of earrings. And marketers will serve him ads displaying other earrings and even matching necklaces – all items that he’s unlikely to buy.

Which one is better?


Deterministic data may seem like the best choice because it’s more accurate and you want to get as close to your audience as possible. However, accuracy without scale won’t get you very far. It’s difficult to obtain deterministic data sets large enough to target audiences at scale. And probabilistic data provides that element of scale.

The key is to align your data sets with your objectives. If your goal is to target actual buyers of your product, then a deterministic data set would be the option of choice. But if your goal is to target people who might be interested in your product, then probabilistic data will offer greater scale and potentially better conversions too.

Or if you’re a CPG brand, you might lean towards deterministic data because your consumers follow predictable purchase patterns. But if you’re a car brand, your customers will not follow predictable purchase patterns. In this case, use data that reveals a highly probable path to purchase based on the behavior of customers in the same position.

Marketers are also taking a hybrid approach. A report by market research firm The Relevancy Group (TRG) found that 30% of surveyed marketers planned to use a combination of deterministic and probabilistic datasets in 2016. Layering factual data with inferred data gives you the best of both worlds, accuracy and scale, and thus strengthens your cross-device tracking efforts.
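One way to picture this layering is a lookup that trusts a login-based link first and falls back to an inferred link only when the match score is high enough. The Python sketch below is an assumption-laden illustration, not a description of any vendor's product: `resolve_owner`, the `min_score` default of 0.8, and all IDs and scores are hypothetical.

```python
def resolve_owner(device_id, login_links, inferred_links, min_score=0.8):
    """Return (owner, source) for a device, preferring deterministic links.

    login_links:    device_id -> user_id, taken from declared logins
    inferred_links: device_id -> (user_id, score), taken from a model
    """
    if device_id in login_links:
        return login_links[device_id], "deterministic"
    candidate = inferred_links.get(device_id)
    if candidate is not None:
        owner, score = candidate
        if score >= min_score:
            return owner, "probabilistic"
    return None, "unmatched"


# Hypothetical data: one declared link and two inferred links.
login_links = {"desktop_A": "user_123"}
inferred_links = {
    "phone_B": ("user_123", 0.9),   # strong inferred match
    "tablet_C": ("user_456", 0.4),  # weak inferred match, below the cutoff
}

for device in ("desktop_A", "phone_B", "tablet_C"):
    print(device, resolve_owner(device, login_links, inferred_links))
```

Keeping the source label next to each match lets you report how much of your reach rests on declared data and how much on inference.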


In the end, there is not a right or wrong model. Savvy data-driven marketers will test and choose the data sets that best match users across devices and drive the most impact.



Which types of data do you find most effective? Leave a comment to share your thoughts.


Published on May 25, 2017
