Ever onboard your data into Resonate or any other consumer insights platform and wonder why your data in that platform does not conform to your understanding of the origin data?
For instance, you may onboard a file composed of anonymous older affluent male buyers. But when you onboard the data into a third-party platform and analyze it, you see a split of men and women. Wonder what happened? Most likely, this is a side effect of the onboarding process itself, which results in a loss of data fidelity.
Householding and the resulting expansion when onboarding and matching first-party offline data to anonymous online data is an industry-wide challenge that you may experience when onboarding your data into Resonate. This challenge is not unique to Resonate – it is an issue that all digital marketers should be aware of so they can better understand and combat the effects of scaled onboarding.
This loss of data fidelity occurs as you expand your audience, and it starts to look more like the general population – in the case of Resonate, the online adult population. Think of it like this: you (the customer), the matching partner, and Resonate all wear different prescription glasses, so your data looks a little different through each pair. The more lenses you pass your data through, the fuzzier it will get. However, this expanded audience is still actionable. It is simply enriched or expanded – and you can still market to these people the same way as you would to your original audience.
Let’s walk through the example of our CRM list of older affluent male buyers.
When this file is onboarded for matching, it will typically be put through a waterfall match process that may lead to household-level matching. This means that the resulting matched file for your onboarded data may include other members of that household, such as the wife of the older affluent male buyer. This matching process brings in demographics that differ from your original target, like gender and age, which changes the insights from what you would expect them to be at an individual level.
As an example, a starting audience that is known to be predominantly male may end up reporting as 50% male and 50% female once matching is used to onboard the data. This can cause insight expansion. For instance, the wife in the household may have an affinity for organic products, and that insight will be shown for, and then used to target, your older affluent male buyer audience even though it may only partially apply.
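The effect described above can be sketched in a few lines of Python. This is a toy illustration only: the records, the `household_id` key, and the `household_match` and `gender_share` helpers are all hypothetical, and they do not represent how Resonate or any matching partner actually stores or matches data.

```python
from collections import Counter

# Hypothetical CRM list: every onboarded record is a male buyer.
crm_buyers = [
    {"person_id": "p1", "household_id": "h1", "gender": "M"},
    {"person_id": "p3", "household_id": "h2", "gender": "M"},
    {"person_id": "p5", "household_id": "h3", "gender": "M"},
]

# Hypothetical online identity data, keyed by household rather than person.
online_households = {
    "h1": [{"person_id": "p1", "gender": "M"}, {"person_id": "p2", "gender": "F"}],
    "h2": [{"person_id": "p3", "gender": "M"}, {"person_id": "p4", "gender": "F"}],
    "h3": [{"person_id": "p5", "gender": "M"}],
}


def household_match(buyers, households):
    """Return every online profile in each buyer's household."""
    matched = []
    for buyer in buyers:
        matched.extend(households.get(buyer["household_id"], []))
    return matched


def gender_share(records):
    """Return the fraction of records per gender."""
    counts = Counter(record["gender"] for record in records)
    total = sum(counts.values())
    return {gender: count / total for gender, count in counts.items()}


matched = household_match(crm_buyers, online_households)
source_share = gender_share(crm_buyers)   # all male
matched_share = gender_share(matched)     # 3 of 5 profiles are male
print(source_share, matched_share)
```

In this toy data, three onboarded buyers become five matched profiles, and the audience shifts from entirely male to a mix of men and women without any change to the source file.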
Now that we have defined how matching processes may work and cause data expansion, let’s dig deeper into why this happens.
Why does insight expansion occur?
Onboarding offline consumer data with individual-level precision is an industry-wide challenge that is not as simple as achieving maximum reach across your first-party data sets. While reach is certainly important, accuracy is the real challenge when onboarding first-party data sets for insights and digital activation. Accuracy can become “compromised” (or expanded) depending on the reliability of the matching methodology.
When onboarding and mapping data from a third-party data source, there will always be some loss of data fidelity. This means that the data can start to look like the Adult Online Population (AOP) rather than the source records that you onboarded – just like our example of onboarding a list of 100% male consumers and receiving results with a 50% male and 50% female split. This problem is exacerbated by the nature of onboarding offline data for a few reasons:
1. Reliability of Matching Partners: To maintain privacy, marketers must rely on intermediary safe-harbor matching partners to remove Personally Identifiable Information (PII) when doing offline-to-online matching. This puts the onus of matching and scale on the matching partner rather than the marketer.
Marketers often perform reach tests when engaging a new safe-harbor matching partner to test their reliability. However, it is important to note that reach tests miss the real challenge of accuracy and precision and can add raw scale without improving return on ad spend.
2. Probabilistic Matching & Waterfall Method: Deterministic matching is the "gold standard" approach for online identification as it is based on knowing a match is viable versus guessing at a match. However, probabilistic matching dominates the industry as people do not readily share the types of personal information necessary to make a strong deterministic match online – like emails, phone numbers and other PII.
As a result, most matching partners use a hybrid of deterministic and probabilistic matching methods. A common hybrid matching technique is the "waterfall" method, where the starting record is passed down a cascading list going from the strongest and most accurate match method down to the least accurate. The waterfall matching process attempts to make the strongest match possible (like email or phone number) as high up the cascade as possible. If no match is found using these methods, the record is passed down to weaker matching methods, ultimately ending at a default household-level match.
3. Household Level Matching: When mapping data, householding is often used to maintain high match rates and produce a scaled output (or the highest number of matches). This means that the match is not happening at the individual level, and as a direct result your insights are no longer reported at the individual level.
Household-level matching negatively impacts individual-level insights, as a significant number of records matched at a household level will not reflect the composition of the starting anonymous data, as in our earlier example of the organic-loving wife being included in the household of the affluent male buyer list. Marketers can wind up with a targeting universe where 40-50% or more of the records matched do not correctly represent the individuals they initially onboarded.
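To make the three points above concrete, here is a minimal Python sketch of a waterfall match that falls back to a household-level match, followed by a comparison of reach and individual-level precision. Everything in it is an assumption for illustration: the index names, the record fields, and the `true_online_id` field (which a marketer would not have in practice) are hypothetical, and real matching partners use their own proprietary methods.

```python
from collections import Counter

# Hypothetical lookup tables held by a matching partner.
EMAIL_INDEX = {"e1": "o1"}
PHONE_INDEX = {"ph2": "o2"}
HOUSEHOLD_INDEX = {"a3": ["o3", "o4"], "a4": ["o5", "o6", "o7"]}


def waterfall_match(record):
    """Try the strongest key first, then fall through to weaker ones."""
    email = record.get("email_hash")
    if email in EMAIL_INDEX:
        return "email", [EMAIL_INDEX[email]]
    phone = record.get("phone_hash")
    if phone in PHONE_INDEX:
        return "phone", [PHONE_INDEX[phone]]
    address = record.get("address_key")
    if address in HOUSEHOLD_INDEX:
        # Default fallback: every profile in the household is returned.
        return "household", list(HOUSEHOLD_INDEX[address])
    return "unmatched", []


def summarize(records):
    """Return (reach, precision, counts per match level)."""
    matched_records = returned_ids = correct_ids = 0
    levels = Counter()
    for record in records:
        level, ids = waterfall_match(record)
        levels[level] += 1
        matched_records += 1 if ids else 0
        returned_ids += len(ids)
        # true_online_id exists only in this toy data, to score the result.
        correct_ids += sum(1 for i in ids if i == record["true_online_id"])
    reach = matched_records / len(records)
    precision = correct_ids / returned_ids if returned_ids else 0.0
    return reach, precision, levels


records = [
    {"email_hash": "e1", "address_key": "a1", "true_online_id": "o1"},
    {"phone_hash": "ph2", "address_key": "a2", "true_online_id": "o2"},
    {"address_key": "a3", "true_online_id": "o3"},
    {"address_key": "a4", "true_online_id": "o5"},
    {"address_key": "a9", "true_online_id": "o9"},
]

reach, precision, levels = summarize(records)
print(f"reach={reach:.0%} precision={precision:.0%} levels={dict(levels)}")
```

In this made-up data set, four of the five records are matched, so a reach test would look healthy, yet only four of the seven returned profiles are the people who were actually onboarded. The gap comes entirely from the two records that fell through to the household level.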
Resonate is aware of this industry-wide challenge and only partners with best-of-breed matching providers who are acutely aware of matching process limitations and are continuously working to improve the accuracy of onboarding and modeling offline-to-online data. The insights you are receiving are still valuable to use on a directional level. Just be aware that, because of these challenges in the ID matching space, the insights you are receiving now reflect an expanded audience that is less precise or more generic than the original audience's definition. The expanded audience’s level of precision is still meaningful and relevant for audience building, activation with Resonate attributes, directional insights and segmentation.