The Importance of First-Party Data in RTB Campaigns

May 13, 2020

Over the past five years, real-time bidding (RTB) has radically transformed the digital marketing landscape. Thanks to these algorithm-driven auctions, advertisers can reach the right audience at the right price more easily than ever. Between 2014 and 2018, the share of ad spend attributed to RTB tripled in the US, and RTB spend is predicted to reach $29 billion by the end of 2025. These days, RTB and programmatic are top-of-mind for marketers, and for good reason. But remember: RTB runs on data. Without first-party data from advertising partners, demand-side platforms (DSPs) like Moloco can’t deliver the best possible results.

You might be wondering why first-party data is necessary to run a successful RTB campaign, and that’s fair! There are four main reasons why data sharing is so critical:

Advertiser data increases the accuracy of machine learning models
ML models determine the probability of secondary conversion for each impression
These predictions can be used to optimize bids per impression in real time
Optimized bids mean more secondary conversions at a better price

Our machine learning platform requires both attributed and unattributed data points to perform and evolve. This engine determines the bid price for each incoming request from exchanges, and we can’t effectively help clients achieve their goals without initial data. The data we need might vary by client and campaign. For example, in user acquisition (UA) campaigns, Moloco typically requests data about high-value existing users, and as you can see, it makes a real difference.

The graph above shows a campaign’s CPA (in dark blue) compared to the volume of secondary conversions (light blue) — in this case, in-app purchases. Before the advertiser opted to share attributed and unattributed data, the CPA was high and volatile for several weeks. Meanwhile, purchase volume remained low. In July, the advertiser began sharing data. The results were clear and immediate: The CPA decreased and stabilized, and as the ML models learned, secondary conversions steadily increased.

To truly understand why the accuracy of ML models depends so heavily on first-party data, we’ll need to dig a little deeper. Let’s start by reviewing how DSPs develop them in the first place.

How RTB platforms create machine learning models

Let’s go back to the UA campaign example. If the goal of the campaign is to optimize on a cost-per-install basis, the bid price for each potential impression is based on the estimated install probability. In training such a model, each past ad impression becomes one training sample, and whether that impression resulted in an app install becomes the target label. Training an accurate model requires a sufficient number of training samples, particularly positive samples, along with diverse past impressions from different publishing apps, wide groups of users, and various creative formats.
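To make the sample-and-label setup concrete, here is a minimal Python sketch. It is not Moloco's model: a smoothed install rate per publisher and creative format stands in for a real machine learning model, and the names (Impression, train_install_model, bid_price), the prior values, and the rule of bidding the install probability times a target cost per install are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Impression:
    """One past ad impression, i.e. one training sample (hypothetical schema)."""
    publisher_app: str    # app where the ad was shown
    creative_format: str  # e.g. "banner" or "video"
    installed: bool       # target label: did this impression lead to an install?


def train_install_model(samples, prior_rate=0.01, prior_weight=100.0):
    """Estimate install probability per (publisher, format) with additive smoothing.

    prior_rate and prior_weight are illustrative assumptions; they pull sparse
    segments toward a default rate instead of trusting a handful of samples.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [installs, impressions]
    for sample in samples:
        key = (sample.publisher_app, sample.creative_format)
        counts[key][0] += int(sample.installed)
        counts[key][1] += 1

    def predict(publisher_app, creative_format):
        installs, impressions = counts.get((publisher_app, creative_format), (0, 0))
        return (installs + prior_rate * prior_weight) / (impressions + prior_weight)

    return predict


def bid_price(install_probability, target_cpi):
    """Expected-value bid for one impression (assumed rule, not a documented one)."""
    return install_probability * target_cpi


# Hypothetical history: 900 video impressions in one app, 90 of which led to installs.
history = (
    [Impression("puzzle_app", "video", True)] * 90
    + [Impression("puzzle_app", "video", False)] * 810
)
predict = train_install_model(history)
p_install = predict("puzzle_app", "video")
print(f"estimated install probability: {p_install:.3f}")
print(f"bid per impression at a $2.00 target CPI: ${bid_price(p_install, 2.00):.3f}")
```

Note what happens for a publisher the model has never seen: the estimate falls back to the prior rate, which is one way to see why diverse past impressions matter so much.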

When the campaign goal is more granular than simply driving app installs, first-party data becomes even more critical. For example, to train a model to drive post-install events, such as in-app purchases, the platform needs positive samples, i.e., data on past impressions that resulted in both an install and an in-app purchase. In a case like this, positive training samples occur far less frequently (among 1% of impressions or less), so access to historical user data helps the model train faster. The table below shows the average frequency of secondary conversion events across various formats.
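A quick back-of-the-envelope calculation shows why rare positives are expensive to collect through impressions alone. The Python sketch below is only an illustration: the target of 1,000 positive samples, the $5 CPM, and the function names are hypothetical, and real campaigns will differ.

```python
def expected_impressions(target_positives, positive_rate):
    """Expected number of impressions needed to observe target_positives positives."""
    if not 0 < positive_rate <= 1:
        raise ValueError("positive_rate must be in (0, 1]")
    return target_positives / positive_rate


def expected_spend(target_positives, positive_rate, cpm):
    """Expected ad spend to collect those positives at a given cost per 1,000 impressions."""
    return expected_impressions(target_positives, positive_rate) * cpm / 1000


# Hypothetical inputs: 1,000 positive samples wanted, impressions bought at a $5 CPM.
for rate in (0.05, 0.01, 0.001):
    impressions = expected_impressions(1000, rate)
    spend = expected_spend(1000, rate, cpm=5.0)
    print(f"positive rate {rate:.1%}: ~{impressions:,.0f} impressions, ~${spend:,.0f} spend")
```

The rarer the positive event, the more impressions (and budget) it takes to gather the same number of positive samples, which is the gap that existing first-party data helps close.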

The cost of limited data transparency

Without the appropriate data from the start, advertisers end up spending much more time and money before the machine learning engine becomes sufficiently accurate. This isn’t ideal for a lot of reasons, not least of which is wasted ad spend.

When existing high-value user data is supplied to Moloco’s engine, the model learns not only from past ad impressions (which require budget consumption) but also from the profiles of existing high-value users, which can be obtained without any ad spend. Learning from existing high-value users is similar in concept to “lookalike” audience targeting, available on platforms like Facebook and LinkedIn. When training lookalike audience models for a single platform, there is a finite number of ad slots available, so there is no need to explore how effective different publishers and ad slots are. In the open exchange, by contrast, models must account for differences across publishers and user segments.

For context, here is how complex the programmatic ecosystem can be in comparison to closed systems like Facebook or LinkedIn:

Moloco bids across a massive number of publishers. In fact, we bid on impressions across over 2.5 million apps every month. For every app, predicted conversion probabilities will be different.
Publisher heterogeneity is significant. This is true even when predicting probabilities for the same user and the same campaign. The median coefficient of variation of these predicted conversion probabilities is 1.22. The coefficient of variation is the standard deviation divided by the mean: a value of 0 would mean there is no variation at all, whereas 1.22 implies significant heterogeneity.
User heterogeneity is equally significant. The median coefficient of variation of predicted conversion probabilities across different users for the same publishing app and campaign is 1.90.
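For readers who want to see how a median coefficient of variation like the ones above could be computed, here is a small Python sketch using only the standard library. The grouping keys and the predicted probabilities are made up for illustration, and the use of the population standard deviation is an assumption, since the article does not say which variant is used.

```python
from statistics import mean, median, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population std dev, by assumption)."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / avg


def median_cv(groups):
    """Median coefficient of variation across groups of predicted probabilities."""
    return median(coefficient_of_variation(v) for v in groups.values())


# Hypothetical predictions: for each (user, campaign) pair, the predicted
# conversion probability on several different publishing apps.
predictions_by_user_campaign = {
    ("user_1", "campaign_a"): [0.002, 0.010, 0.001, 0.030],
    ("user_2", "campaign_a"): [0.005, 0.005, 0.005, 0.005],  # no variation at all
    ("user_3", "campaign_a"): [0.001, 0.020, 0.004, 0.003],
}

for key, probabilities in predictions_by_user_campaign.items():
    print(key, round(coefficient_of_variation(probabilities), 2))
print("median CV across groups:", round(median_cv(predictions_by_user_campaign), 2))
```

Grouping by user and campaign measures variation across publishers; grouping by publishing app and campaign instead would measure variation across users.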

To mitigate these factors, our platform seamlessly integrates first-party data. We train our models on data from our partners’ existing high-value users and on what we learn from live campaign results in real time. Using our algorithm and your first-party data, Moloco can help you hit challenging ROAS targets in UA campaigns, typically within 3–4 weeks. In best-case scenarios, we can hit targets in as little as two weeks. Compare that with the time it takes to reach targets without such data, which can be as long as three months.

Some RTB providers launch campaigns without first-party data (usually with the caveat that performance will struggle at first), but Moloco is committed to delivering the best possible results on your behalf. That’s why we choose not to engage in RTB campaigns without first-party data. As you can see, it’s absolutely critical for building accurate ML models — and that accuracy yields better bidding prices and stronger ROAS.

To find out more about Moloco's RTB technology, check out our product info page. If you’d rather utilize the power of our machine-learning algorithms while keeping your data in-house, Moloco Cloud is at your disposal. If you’re ready to unleash the full power of your data, get in touch today!