In digital advertising, a displayed ad is not necessarily seen by the user. An ad might load below the fold, sit in a background tab, or be scrolled past before it renders. This distinction between displayed and viewed has real consequences for how we model click-through rates (CTR). This post derives a simple correction factor that links the observed CTR to viewability:
$$P(\text{click} \mid \text{channel}) = Q \cdot \bigl[\, V + r \cdot (1 - V) \,\bigr]$$
where $Q$ is the intrinsic ad quality, $V$ is its viewability, and $r$ is the false-click factor.
Setup
We consider three random variables and one latent event:
- click — the user clicks the ad
- display — the ad is served and rendered on the page
- channel — the placement or context (site, position, device, etc.)
- view — the ad is actually seen by the user (latent)
The quantity we often observe in logs is $P(\text{click} \mid \text{channel})$, but what we really care about is the true click rate conditioned on the ad being viewed: $P(\text{click} \mid \text{channel}, \text{view})$.
Derivation
Start from the joint probability and marginalize over the latent view event:
$$P(\text{click}, \text{channel}) = P(\text{click},\text{channel}, \text{view}) + P(\text{click},\text{channel}, \neg\text{view})$$
Applying the product rule to each term:
$$P(\text{click}, \text{channel}) = P(\text{click} \mid \text{channel}, \text{view})\, P(\text{channel}, \text{view}) + P(\text{click} \mid \text{channel}, \neg\text{view})\, P(\text{channel}, \neg\text{view})$$
Dividing both sides by $P(\text{channel})$:
$$P(\text{click} \mid \text{channel}) = P(\text{click} \mid \text{channel}, \text{view})\, P(\text{view} \mid \text{channel}) + P(\text{click} \mid \text{channel}, \neg\text{view})\, P(\neg\text{view} \mid \text{channel})$$
Viewability and the false-click factor
We introduce two quantities to simplify the expression:
Viewability score. The probability that a displayed ad is actually viewed in a given channel:
$$V = P(\text{view} \mid \text{channel})$$
False-click factor. A scalar $r \in [0, 1]$ that captures the ratio of click probability when the ad is not viewed versus when it is:
$$P(\text{click} \mid \text{channel}, \neg\text{view}) = r \cdot P(\text{click} \mid \text{channel}, \text{view})$$
False clicks can come from several sources, such as accidental clicks, bot traffic, or misattributed events. When $r = 0$, users never click ads they haven't seen (the ideal case). When $r > 0$, some residual click rate remains for ads that were not viewed. In practice we can expect $r$ to be negligible overall.
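Substituting both definitions, together with $P(\neg\text{view} \mid \text{channel}) = 1 - V$, into the marginalized expression for $P(\text{click} \mid \text{channel})$ and writing
$$Q = P(\text{click} \mid \text{channel}, \text{view})$$
for the true CTR of a viewed ad, we obtain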
$$\boxed{P(\text{click} \mid \text{channel}) = Q \cdot \bigl[\, V + r \cdot (1 - V) \,\bigr]}$$
The term in brackets is a viewability correction factor. It tells us how much of the true CTR is preserved in the observed CTR, as a function of viewability $V$ and false-click factor $r$.
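As a minimal sketch of how the relation might be used in practice, the snippet below computes the correction factor, the observed CTR implied by a given $Q$, and the inverse (an estimate of $Q$ from an observed CTR). The function names and the numbers are illustrative assumptions, not taken from real data, and the inversion assumes $V$ and $r$ are known for the channel.

```python
def correction_factor(v: float, r: float) -> float:
    """Viewability correction factor V + r * (1 - V)."""
    if not (0.0 <= v <= 1.0 and 0.0 <= r <= 1.0):
        raise ValueError("v and r must both lie in [0, 1]")
    return v + r * (1.0 - v)


def observed_ctr(q: float, v: float, r: float) -> float:
    """Observed CTR, P(click | channel), implied by the true CTR q."""
    return q * correction_factor(v, r)


def true_ctr(observed: float, v: float, r: float) -> float:
    """Invert the relation to estimate Q from an observed CTR."""
    factor = correction_factor(v, r)
    if factor == 0.0:
        # With V = 0 and r = 0 no clicks are ever observed, so Q cannot be recovered.
        raise ValueError("Q is not identifiable when V = 0 and r = 0")
    return observed / factor


# Illustrative values only (hypothetical channel).
q, v, r = 0.02, 0.6, 0.05
ctr = observed_ctr(q, v, r)
print(f"correction factor: {correction_factor(v, r):.3f}")
print(f"observed CTR:      {ctr:.4f}")
print(f"recovered Q:       {true_ctr(ctr, v, r):.4f}")
```

Dividing the observed CTR by the correction factor is only as reliable as the estimates of $V$ and $r$ that go into it.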
Interpretation
- When $V = 1$ (every displayed ad is viewed), the correction factor equals $1$ regardless of $r$, and the observed CTR matches the true CTR.
- When $V = 0$ (no ad is viewed), the correction factor equals $r$ — the observed CTR is purely from false clicks.
- For intermediate values, the correction factor interpolates linearly between $r$ and $1$.
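One way to sanity-check these cases is a small Monte Carlo simulation. The sketch below assumes each impression is viewed independently with probability $V$, and is clicked with probability $Q$ if viewed and $r \cdot Q$ otherwise. The function name `simulate_ctr`, the sample size, and the parameter values are all illustrative choices.

```python
import random


def simulate_ctr(q, v, r, n=200_000, seed=0):
    """Empirical CTR over n simulated impressions."""
    rng = random.Random(seed)
    clicks = 0
    for _ in range(n):
        viewed = rng.random() < v            # latent view event
        p_click = q if viewed else r * q     # false clicks scale Q by r
        if rng.random() < p_click:
            clicks += 1
    return clicks / n


q = 0.1  # hypothetical true CTR, chosen large to keep sampling noise low
for v, r in [(1.0, 0.2), (0.0, 0.2), (0.6, 0.2)]:
    predicted = q * (v + r * (1.0 - v))
    simulated = simulate_ctr(q, v, r)
    print(f"V={v:.1f} r={r:.1f} predicted={predicted:.4f} simulated={simulated:.4f}")
```

The simulated values should agree with the predicted ones up to sampling noise, which shrinks as `n` grows.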
Visualisation
The plot below shows the correction factor $V + r\,(1 - V)$ as a function of viewability $V$, for several values of the false-click factor $r$.
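To reproduce these curves numerically, the following sketch tabulates the correction factor on a grid of $V$ values. The particular values of $r$ are arbitrary illustrative choices and may differ from those used in the plot.

```python
def correction_factor(v, r):
    """Viewability correction factor V + r * (1 - V)."""
    return v + r * (1.0 - v)


r_values = [0.0, 0.1, 0.25, 0.5]          # arbitrary illustrative choices
v_values = [i / 10 for i in range(11)]    # V = 0.0, 0.1, ..., 1.0

# One column per r, one row per V.
print("V     " + "".join(f"r={r:<6}" for r in r_values))
for v in v_values:
    row = "".join(f"{correction_factor(v, r):<8.3f}" for r in r_values)
    print(f"{v:<6.1f}{row}")
```

Each column is a straight line in $V$ that starts at $r$ for $V = 0$ and ends at $1$ for $V = 1$.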