Incrementality Tests 101: Intent-to-treat, PSA and Ghost Bids

September 23, 2019

« When a measure becomes a target, it ceases to be a good measure. »

Goodhart's Law

Economist Charles Goodhart nailed it in his adage. In a popular illustration of this law, two workers are asked to make nails. One worker's performance is measured by the number of nails made, while the other's is judged by the weight of the nails made. The result? A thousand nails from the first worker, and only a few heavy ones from the second. Each worker's output followed the metric they were measured on rather than the actual goal.

In mobile marketing, we observe similar issues, depending on each advertiser's goals. Shifting industry goals around performance KPIs bring new attribution models and, more often than not, new problems along with them (fraud, cannibalization, last-click stealing, and so on). As Goodhart's law suggests, the wrong metrics can create the wrong incentives, which ultimately leads advertisers and other industry professionals to the question: "How much of the total revenue is truly caused by my advertising spend?"

Understanding the relationship between ad spend and revenue is the only way to mitigate the risk of ineffective spending. While attribution models are easy to implement and are the most commonly used approach, they don't prove causation. Assuming otherwise is the cum hoc ergo propter hoc fallacy ("with this, therefore because of this"); in other words, correlation does not imply causation.

Applying Medical Methodologies in Advertising

Enter Randomized Controlled Trials (RCTs). The RCT is a widely used concept across many scientific disciplines. In clinical trials, an RCT shows whether a new drug, device, or treatment truly has an effect on the subject (positive, negative, or none).

"Randomized controlled trials (RCTs) are widely taken as the gold standard for establishing causal conclusions. Ideally conducted they ensure that the treatment 'causes' the outcome—in the experiment."

- Nancy Cartwright, "What are randomised controlled trials good for?"

In retargeting, the ad is given in place of medication

An RCT is done by randomly splitting the subjects into two groups. One group, the test group, is given the new treatment, while the control group remains untreated. Any difference in outcomes between the test group and the control group is attributed to the treatment. Because RCT methodologies establish causality, using these concepts for incrementality testing gives an unbiased view of the efficacy of mobile remarketing.

In retargeting, the target audience is also split into two groups: the test group receives advertising campaigns, and the control group does not. In incrementality measurement, there are a few methodologies that differ mainly in what happens to the control group and in how the results are evaluated.
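As an illustration of the split, here is a minimal sketch of one common way to assign users to the two groups at random while keeping each user's assignment stable over time. It is an assumption for illustration, not a description of any particular vendor's system: the function name `assign_group`, the experiment label, and the 20% control share are all hypothetical.

```python
import hashlib

def assign_group(user_id: str, experiment: str, control_share: float = 0.2) -> str:
    """Deterministically assign a user to 'test' or 'control'."""
    # Hashing the experiment name together with the user ID gives a
    # pseudo-random but repeatable bucket for each user.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "control" if bucket < control_share else "test"

# Hypothetical audience of 10,000 user IDs.
users = [f"user-{i}" for i in range(10_000)]
groups = [assign_group(u, "retargeting-uplift-demo") for u in users]
control_fraction = groups.count("control") / len(groups)
print(f"control share: {control_fraction:.3f}")
```

Because the assignment depends only on the user ID and the experiment name, the same user always lands in the same group, and the split does not depend on anything the user does.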

Different Methodologies Applied in Advertising

Intent-to-Treat (ITT)

No cost and easy to implement but with noisy data

Let's begin with the intent-to-treat methodology. Also known as a 'holdout test', the ITT method never shows any ads to users in the control group. Uplift is then measured in one of two ways:

1. Complete comparison: This method compares the behavior of all users in the test group, both exposed and unexposed to ads, with the behavior of all users in the control group.

This method is the correct way of applying Intent-to-treat as it answers the question: "How much more are the users who are targeted by my ads spending compared to a control group?"

While correct, this method creates the problem of noisy data (more below on noise).

2. Partial comparison (naive method): This method compares the behavior of only the exposed users in the test group with the behavior of the control group.

This second approach to ITT is an attempt to reduce the noise from unexposed test group users, but it is a common mistake that leads to selection bias: whether a user is exposed depends on the availability of supply, on auctions won or lost, and on impression rendering, so exposed users are not a random sample of the test group.
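The difference between the two comparisons can be seen in a toy example. The sketch below uses made-up data in which the ad has no effect at all, and in which only very active users are ever exposed; the field names `revenue` and `exposed` and all numbers are hypothetical.

```python
from statistics import mean

def revenue_per_user(users):
    """Average revenue per user in a list of user records."""
    return mean(u["revenue"] for u in users)

def relative_uplift(test_value, control_value):
    """Uplift of the test value relative to the control value."""
    return (test_value - control_value) / control_value

def make_group(exposable):
    # Toy data: the ad has NO effect. Each group holds 2 very active
    # users (revenue 10) and 8 inactive users (revenue 1). Only active
    # users are seen on ad exchanges, so only they can be exposed.
    active = [{"revenue": 10.0, "exposed": exposable} for _ in range(2)]
    inactive = [{"revenue": 1.0, "exposed": False} for _ in range(8)]
    return active + inactive

test_group = make_group(exposable=True)
control_group = make_group(exposable=False)  # holdout: never exposed

# Complete comparison: all test users vs. all control users.
complete = relative_uplift(revenue_per_user(test_group),
                           revenue_per_user(control_group))

# Naive comparison: only exposed test users vs. all control users.
exposed_only = [u for u in test_group if u["exposed"]]
naive = relative_uplift(revenue_per_user(exposed_only),
                        revenue_per_user(control_group))

print(f"complete comparison uplift: {complete:.0%}")
print(f"naive comparison uplift:    {naive:.0%}")
```

With this data the complete comparison reports 0% uplift, which is correct, while the naive comparison reports roughly 257%, purely because the exposed users were already the most active ones.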

The intent-to-treat methodology for incrementality measurement is widely used because it is easy to implement on the client side (e.g. by a BI or data science team) and does not need to be integrated into the ad delivery system of the advertising partner.

However, there are situations where only a small fraction of the users in the treatment group will be exposed to the ads. This can happen due to the low availability of the targeted audience on the supply channel or low win rate on programmatic auctions. Consequently, including a potentially large proportion of unexposed users in the treatment group makes the measurement noisy.

Noise in data

Noise, a.k.a. meaningless information, stems from the unexposed population of the test group. It often leads to inconclusive uplift tests that show neither statistical significance nor measurable uplift, because fluctuations in the unexposed users' behavior overshadow the uplift stemming from a small exposed population.
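To give a rough sense of how much a low exposure rate hurts, the sketch below uses the textbook sample-size approximation for comparing two group means with a two-sided z-test. It is a simplification under stated assumptions: unexposed users are assumed to show no effect at all, the spread of revenue per user (`std_dev`) is assumed equal in both groups, and the function name and all input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def users_per_group(effect_on_exposed, exposure_rate, std_dev,
                    alpha=0.05, power=0.8):
    """Approximate users needed per group for a two-sided two-sample z-test."""
    # Under intent-to-treat, the average effect across the whole test
    # group is diluted by the share of users who never see an ad.
    diluted_effect = effect_on_exposed * exposure_rate
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * std_dev / diluted_effect) ** 2)

for rate in (1.0, 0.5, 0.1):
    n = users_per_group(effect_on_exposed=0.5, exposure_rate=rate, std_dev=5.0)
    print(f"exposure rate {rate:.0%}: ~{n:,} users per group")
```

Under these assumptions the required group size grows with the inverse square of the exposure rate, so a 10% exposure rate needs roughly 100 times as many users as full exposure to detect the same effect.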

The problem with 'noise'

Consider the following analogy:

If we drop the same rock into a bucket of water and into a swimming pool, where would it be easier to see the change in the water level? Easy answer: the bucket. The aim is to see the difference, and in uplift measurement the swimming pool represents a large number of unexposed users, which makes it hard to detect the impact of the rock.

Public Service Announcement (PSA) or Placebo Ads

The zero noise but costly and unsustainable alternative

In the 'PSA/Placebo' methodology, the control group receives a real ad. The idea is to serve the control group public service announcement (PSA) ads: ads that help raise social awareness, such as Red Cross banners or "don't drink and drive" ads.

By serving real ads, we learn which users within the control group would have been exposed, which allows us to exclude unexposed users from the measurement in both groups. This removes the noise caused by unexposed users.
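A minimal sketch of this exposed-only comparison follows. The record layout is hypothetical: `saw_ad` stands for a campaign impression in the test group and for a PSA impression in the control group, and the revenue figures are made up.

```python
from statistics import mean

def exposed_only_uplift(test_users, control_users):
    """Relative uplift in revenue per user, restricted to exposed users."""
    test_exposed = [u["revenue"] for u in test_users if u["saw_ad"]]
    control_exposed = [u["revenue"] for u in control_users if u["saw_ad"]]
    if not test_exposed or not control_exposed:
        raise ValueError("both groups need at least one exposed user")
    test_avg, control_avg = mean(test_exposed), mean(control_exposed)
    return (test_avg - control_avg) / control_avg

# Hypothetical records: in the control group, "saw_ad" means a PSA impression.
test_users = [
    {"revenue": 12.0, "saw_ad": True},
    {"revenue": 8.0, "saw_ad": True},
    {"revenue": 1.0, "saw_ad": False},  # unexposed: excluded
]
control_users = [
    {"revenue": 9.0, "saw_ad": True},
    {"revenue": 7.0, "saw_ad": True},
    {"revenue": 1.0, "saw_ad": False},  # unexposed: excluded
]

uplift = exposed_only_uplift(test_users, control_users)
print(f"uplift among exposed users: {uplift:.0%}")
```

The comparison is only as good as the assumption that the two exposed sub-groups are alike, which is exactly what can break when the delivery system optimizes the campaign ad and the PSA differently.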

While PSA ads are easy to use in self-service platforms and are a great workaround for the noise problems of intent-to-treat testing, this method is costly and unsustainable. Marketers have to set aside a portion of their budget to pay for the control group impressions, which reduces their profit margin. Since incrementality testing and measurement are most effective when run continuously (not as a one-time test), allocating budget for PSA ads isn't an appealing long-term strategy.

PSA ads can also be inaccurate if implemented incorrectly. For example, when running two different campaigns in a smart ad delivery system, the system is likely to optimize the delivery of the two campaigns differently because the ads are inherently different:

"[The system] will show ads to the types of users that are most likely to click. And the users who choose to click on an ad for sporting goods or apparel are likely to be quite different from those who click on an ad for a charity - leading to a comparison between 'apples and oranges.' Hence, such PSA testing can lead to wrong results ranging from overly optimistic to falsely negative." - Think with Google

It would also be a mere assumption that control group behavior is fully comparable to test group behavior. PSA testing ignores the chance that the same user might react more strongly to a "don't drink and drive" ad than to, say, an "unlock a new character in the next level, play again" type of ad. The two groups end up defined differently, which makes a like-for-like comparison impossible.

Ghost Ads

The best of both ITT and PSA, but more suitable for user acquisition

As discussed in several publications, the Ghost Ads concept offers the most advantages for incrementality measurement: low noise and the lowest selection bias, with the further advantage of being free of cost for the advertiser. The control group is not shown any ads paid for by the advertiser, so no additional costs are incurred (as opposed to the PSA approach).

Ghost Ads improve on the PSA concept by removing the costs for the advertiser. The control group users are shown an ad run by another advertiser on the platform, which removes the cost for clicks and impressions. Control group users are then marked with a "ghost impression", which tells us which control group users would have been exposed.

While Ghost Ads are precise and accurate, the method doesn't work well for retargeting. Ghost ads require a second interested party for the user in question. This is easily done within the user acquisition space because a user is potentially relevant/interesting for multiple advertisers. In retargeting, it is often the case that there is only one party interested in the user at hand since retargeting campaigns usually target very specific and narrow user segments.

Ghost Bids

A more precise and cost-free incrementality testing method for app retargeting

Similar to Ghost Ads, the goal is to remove as much noise, and as many unexposed users, as possible. We used the concept behind Ghost Ads as the foundation for our continuous uplift tracking product. Our implementation differs from the original in several points to better adapt it for retargeting, which is why we call our approach 'Ghost Bids'.

All users that are not found on RTB from both groups are removed, thus significantly reducing noise.

To be able to measure incrementality, we track revenue and conversions for all 'reachable users': those who fall into the target segment, are seen on RTB ad exchanges, and for whom we can place a bid. A bid is placed as usual for the test group, whereas the control group is tracked with 'ghost bids' (a bid could have been placed, hence the name). Users who fall into the target segment but are not seen on ad exchanges are not part of the test.

Noise is therefore significantly reduced compared to the intent-to-treat method, since users who are not available on RTB supply no longer distort the measurement (their behavior is irrelevant for our campaigns).

This means we have two groups of users in the test group: exposed (users who saw at least one impression) and unexposed (we didn't win any auctions for the user, or none of the impressions rendered).

Similarly, in the control group, there will be users who 'would have been exposed' and 'would not have been exposed'. Although the unexposed share of users creates additional noise in the test, there is no easy way to predict which users in the control group would have been exposed without adding any potential bias to the test.
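The reachable-user filter can be sketched as follows. This is an illustrative simplification rather than the production implementation: the record fields are hypothetical, with `bids` counting real bids for test users and logged ghost bids for control users.

```python
def reachable_uplift(users):
    """Relative uplift in revenue per reachable user, test vs. control."""
    totals = {"test": [0.0, 0], "control": [0.0, 0]}  # [revenue, user count]
    for user in users:
        # A user is 'reachable' if a bid (test) or ghost bid (control)
        # was logged for them; everyone else is dropped from the test.
        if user["bids"] == 0:
            continue
        totals[user["group"]][0] += user["revenue"]
        totals[user["group"]][1] += 1
    if totals["test"][1] == 0 or totals["control"][1] == 0:
        raise ValueError("both groups need at least one reachable user")
    test_avg = totals["test"][0] / totals["test"][1]
    control_avg = totals["control"][0] / totals["control"][1]
    return (test_avg - control_avg) / control_avg

# Hypothetical data: one user per group was never seen on RTB supply.
users = [
    {"group": "test", "bids": 3, "revenue": 6.0},
    {"group": "test", "bids": 1, "revenue": 4.0},
    {"group": "test", "bids": 0, "revenue": 50.0},    # not reachable
    {"group": "control", "bids": 2, "revenue": 5.0},  # ghost bids only
    {"group": "control", "bids": 4, "revenue": 3.0},  # ghost bids only
    {"group": "control", "bids": 0, "revenue": 0.0},  # not reachable
]

uplift = reachable_uplift(users)
print(f"uplift among reachable users: {uplift:.0%}")
```

Note that the filter is applied identically to both groups and uses only bid opportunities, not won auctions or rendered impressions, so it does not reintroduce the selection bias of the naive ITT comparison.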

Wrapping Up: Incrementality Measurement

Understanding the concept of incrementality testing is the first step toward a more scientifically grounded figure for your ROI. Knowing the different methodologies, along with their pros and cons, is the next step in making informed decisions.
Having the right incrementality measurement tools and methods provides the right incentives for both advertisers and vendors. Incrementality as a strategy will make mobile retargeting a more transparent, more measurable, and less fraudulent channel in the long run.

For further reading on incrementality measurement, we've compiled a few scientific papers in another blog post.

Retargeting lexicon

View-Through Attribution

Refer to: Attribution Methodology

Uplift Test

A randomized control trial test conducted by Remerge to measure the incremental impact of one or more campaigns.

further reading

— Test Group

— Control Group

Queries Per Second (QPS)

The number of ad placements per second that a DSP is able to process in order to determine how to bid on them.

Publisher

Within the sphere of app marketing, a publisher is an App Developer that gets paid for placing ads within their app. For example, an advertiser wants to reach their users via App Y, so they pay App Y to display their ads.

further reading

— App monetization

Public Service Announcement Ad (PSA Ads)

An incrementality testing methodology where devices in the control group are shown PSA ads, like donation drives or road safety reminders. By serving real ads, information on the devices within the control group that would have been exposed can be obtained. Unexposed devices are excluded from the measurement to reduce noise.

further reading

— Incrementality Tests 101

Probabilistic Attribution

Refer to: Attribution Methodology

Organic Behavior

A user’s behavior not directly attributable to specific marketing efforts.

Multi-Touch Attribution

Refer to: Attribution Methodology

Mobile Measurement Partner (MMP)

Within the sphere of app marketing, an MMP is a service provider that specializes in measuring activities that happen within, and lead to, the app. An app publisher may incorporate an MMP into their app to track activity and events, e.g. time spent on a certain screen, sources of incoming traffic, and app opening frequencies.

Lifetime Value (LTV)

The amount of revenue generated by the user for the App Developer during the entire duration of the relationship with the user, beginning with the app install.

Last-Click Attribution

Refer to: Attribution Methodology

Key Performance Indicator (KPI)

The key metrics used to assess the effectiveness of an effort in achieving its objective. In programmatic advertising, the common types of performance indicators depend on the goals and nature of each campaign. These can include ROAS, cost per action, and retention rate.

further reading

— The KPI Shift: Why You've Got Wrong App Retargeting Metrics

Intent-to-Treat (ITT)

An incrementality testing methodology where no ads from the campaign are shown to devices within the control group. Also known as a ‘holdout test’. Cost-free and easy to implement, but with a relatively high level of noise.

This method compares the behavior of all users in both groups. In the test group, this includes both exposed and unexposed users.

further reading

— Incrementality Tests 101

Incrementality

A method of measuring the impact of a specific activity, on top of organic and other activity.

Incremental Revenue (iRevenue)

The estimated revenue caused directly by the campaign.

Formula: Revenue from test group – revenue from control group = iRevenue

Incremental Return On Ad Spend (iROAS)

A KPI used to calculate how cost-efficient a campaign is. It evaluates the relationship between incremental revenue and the amount of money spent on the campaign. The figure is typically expressed as a percentage.

Formula:
Percentage: [iRevenue ÷ ad spend] × 100 = iROAS%
Ratio: iRevenue ÷ ad spend = iROAS
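
A short worked example of the iRevenue and iROAS formulas, as a sketch with made-up numbers. It assumes the test and control groups are the same size (or that control revenue has already been scaled to the test group size); the function names are hypothetical.

```python
def incremental_revenue(test_revenue: float, control_revenue: float) -> float:
    """iRevenue = revenue from test group - revenue from control group."""
    return test_revenue - control_revenue

def iroas(i_revenue: float, ad_spend: float) -> float:
    """iROAS as a ratio: iRevenue divided by ad spend."""
    if ad_spend <= 0:
        raise ValueError("ad spend must be positive")
    return i_revenue / ad_spend

# Made-up figures for illustration only.
i_rev = incremental_revenue(test_revenue=12_000.0, control_revenue=9_000.0)
ratio = iroas(i_rev, ad_spend=2_000.0)
print(f"iRevenue: {i_rev:.2f}, iROAS: {ratio:.2f} ({ratio:.0%})")
```

With these figures, iRevenue is 3,000 and iROAS is 1.5 as a ratio, or 150% as a percentage.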

further reading

— ROAS

— The Pros and Cons of Attribution and Incrementality

Incremental Cost Per Action (iCPA)

A KPI used to evaluate the cost of incremental conversions.

Formula: Ad spend ÷ [test group actions – control group actions] = iCPA

Incremental Conversions

The estimated amount of conversions caused directly by the campaign.

Formula:
Test group conversions – control group conversions (scaled) = Incremental conversions
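
A worked sketch of incremental conversions and iCPA with made-up numbers. The scaling rule used here, multiplying control conversions by the ratio of test group size to control group size, is an assumption about what "scaled" means; the function names are hypothetical.

```python
def scaled_control(control_value: float, test_size: int, control_size: int) -> float:
    """Scale a control group total up (or down) to the test group size."""
    return control_value * test_size / control_size

def incremental_conversions(test_conv: float, control_conv: float,
                            test_size: int, control_size: int) -> float:
    """Test conversions minus control conversions scaled to test size."""
    return test_conv - scaled_control(control_conv, test_size, control_size)

def icpa(ad_spend: float, incremental_actions: float) -> float:
    """iCPA = ad spend divided by incremental actions."""
    if incremental_actions <= 0:
        raise ValueError("iCPA is undefined without positive incremental actions")
    return ad_spend / incremental_actions

# Made-up figures: 80,000 test users, 20,000 control users.
inc = incremental_conversions(test_conv=500, control_conv=100,
                              test_size=80_000, control_size=20_000)
cost = icpa(ad_spend=2_000.0, incremental_actions=inc)
print(f"incremental conversions: {inc:.0f}, iCPA: {cost:.2f}")
```

Here the 100 control conversions scale to 400 at test group size, leaving 100 incremental conversions and an iCPA of 20.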

In-app Event

Actions made by a user within the app, such as log-in, registration, completion of onboarding, or purchases. These events can be tracked with the help of an MMP.

Impression

The deployment of the ad to the ad placement. An impression might not necessarily mean that the ad has been viewed.

Identifier for Advertisers (IDFA)

A unique random device identifier Apple generates and assigns to every iOS device. Advertisers can use this to track user activity across apps, show them personalized ads, and attribute ad interactions.

Ghost Ads

A testing methodology that shows devices in the control group an ad run by another advertiser on the platform, therefore removing any additional cost for clicks and impressions. Control group users are then marked with a ‘ghost impression’, which indicates which control group users would have been exposed.

further reading

— Attribution

— Incrementality Tests 101

General Data Protection Regulation (GDPR)

A regulation under the EU (European Union) law on data protection and privacy within the EU and the EEA (European Economic Area), that grants users control over how their data is stored and used by organizations. To comply with GDPR, programmatic sellers must clearly communicate to users how their data will be stored and used. When a user gives consent to an organization to process their data, it enables targeted advertising.

Exposure Rate

The percentage of devices within a test group that received at least one ad impression, versus the total number of devices within the test group targeted within a campaign during an uplift test. For example, if 900 out of 1,000 users are shown an ad, the exposure rate is 90%.

See also: Uplift Test

Deterministic Attribution

Refer to: Attribution Methodology

Deep link

A link that sends users directly to a specific in-app location, instead of the app marketplace. Deep links bypass the steps needed to go through to reach a conversion point, bringing the user directly to where they can perform the intended action e.g. completing a purchase, buying coins, placing an order.

further reading

— Case Study: Why Deep Links Matter for Your Revenue Goals

Test Group

Within the sphere of app marketing, this refers to the group of devices that may be shown ads from a specific campaign in the test. The actions on these devices are then compared to the actions on the devices in the control group.

Compare with: Control Group

further reading

— Control Group

Control Group

Within the sphere of app marketing, this refers to the group of devices within the target audience that are not shown ads from a specific campaign in the test. The actions on these devices are then compared to the actions on the devices in the test group.

Compare with: Test Group

further reading

— Test Group

Contextual targeting

A type of targeting that works with contextual signals only, such as location data (country, city, postal code), language setting, mobile operating system, device model, as well as publisher information.

California Consumer Privacy Act (CCPA)

A bill that enhances privacy rights and consumer protection for residents of California, United States. The CCPA took effect on January 1, 2020.

The CCPA provides these rights to consumers:

- Know what personal data is being collected about them.
- Know whether their personal data is sold or disclosed, and to whom.
- Say no to the sale of personal data.
- Access their personal data.
- Request a business to delete any personal information that was collected from that consumer.
- Equal service and price, even if they exercise their privacy rights.

Attribution Window

A specific time frame that is taken into consideration when determining the source of a user’s action.

Attribution Provider (AP)

A role played by an MMP to credit the in-app activity of users to the correct media sources.

Attribution Methodology

Refers to the process of identifying which conversions belong to which preceding click or impression. Common attribution methodologies include:

Click-Through Attribution - Determines the source of a conversion based on the user’s click activity.
View-Through Attribution - Determines the source of a conversion based on the ad impression delivered to the user.
Deterministic Attribution - A model that establishes the origin of a user’s conversion from a specific click or impression, based on unique device IDs.
Probabilistic Attribution - A model that establishes the likelihood of a user’s conversion originating from a specific click or impression, based on the data logged on both occasions, such as device language, timezone, IP address, and OS version.
Last-Touch Attribution - A model that establishes a match between the action taken by a user (e.g. app open, purchase) and its corresponding ad click or impression. When a user converts from an ad, the DSP that delivered the respective ad is given full credit for that conversion event.
Multi-Touch Attribution - Also known as multi-channel attribution. A model that determines the value of every touchpoint on the way to a conversion. Rather than giving full credit to one ad, multi-touch attribution divides the credit among all advertising channels that the user interacted with on the way to the conversion.

Attribution

A method of identifying the touchpoints a user has encountered within a specified period before making a conversion.

further reading

— Attribution

— The Pros and Cons of Attribution and Incrementality

App Tracking Transparency (ATT)

The privacy framework from Apple that, among other things, manages the process of obtaining user consent before accessing their Identifier for Advertisers (IDFA).

App Monetization

The strategy a publisher employs to earn money from their app. This can be done through in-app advertisements, paid membership, and charging for premium features or an ad-free experience, among others. For example, some gaming apps are free to download and play, but users may need to pay in order to progress to the next level quickly.

Android Advertising Identifier (AAID)

Also known as Google Advertising Identifier. A unique device identifier that Android generates and assigns to every device. Advertisers can use this to track user activity across apps, show them personalized ads, and attribute ad interactions.

Advertisers

The advertiser is a person or legal entity focusing on generating sales and leads through serving ads that convey the right message to the right audience at the right time.

In mobile advertising, the advertiser is on the client-side and is the one interested in promoting an app.

Causal Impact Analysis

A measurement framework developed by Google that works without device IDs. It measures the incremental uplift of one or more conversion events, removing the influence of other campaigns and organic conversions. Used to assess the effect of ID-less campaigns.

Similar to measuring the effect TV ads have, the principle is based on running campaigns on identifiable sub-markets (test group), while leaving other sub-markets unexposed (control group).

Ghost Bids

An incrementality testing methodology based on Ghost Ads, adapted for retargeting campaigns. The difference is that it removes all devices that are not seen on ad exchanges, or that would not be bid on, from both test and control groups, to reduce noise. A bid is placed as usual for the test group, while the control group is tracked with ‘ghost bids’ (bids that could have been placed, but weren’t in the end).

Return on Advertising Spend (ROAS)

A KPI that measures the relationship between the revenue generated by specific advertising efforts and the money spent on them.

Formula

Percentage: [Revenue ÷ ad spend] × 100 = ROAS%

Ratio: Revenue ÷ ad spend = ROAS

See also: Incremental Return On Ad Spend

Supply-Side Platform (SSP)

A company that works with publishers to sell ad inventory across ad networks.

Demand-Side Platform (DSP)

A company that works with advertisers to purchase ad inventory across ad networks. Their platforms are built to identify a desired ad space and place bids on it.

Compare with: Supply-Side Platform

Open RTB

A digital marketplace where ad inventory from multiple publishers is available for advertisers to bid on in real time.

See also: Real-Time Bidding

Self-Attributing Network

An ad network, such as Meta, Snap, or Twitter, that attributes its traffic internally, without the involvement of third-party MMPs.

Variable Bidding

The dynamic adjustment of bid prices based on a user's in-app behavioral patterns, contextual information, time of day, and ad placement performance.

Dynamic Product Ad (DPA)

Also known as a dynamic ad. It is dynamically assembled based on the user’s behavior and information sourced from a feed. This type of ad delivers a tailored experience for individual users.

Real-Time Audience Segmentation

The division of an audience into distinct segments based on real-time events, thus enabling targeted advertising and alignment with a user's behavioral patterns and preferences.

User Acquisition (UA)

A mobile marketing effort used to attract new users to an app. Paid UA may refer to ads shown in mobile ad networks or social media channels, while non-paid UA involves app store optimization and promotion on the advertiser’s own channels.

Programmatic Advertising

The automated process of buying and selling advertising space through digital platforms.
