In digital advertising, a displayed ad is not necessarily seen by the user. An ad might load below the fold, sit in a background tab, or be scrolled past before it renders. This distinction between displayed and viewed has real consequences for how we model click-through rates (CTR). This post derives a simple correction factor that links the observed CTR to viewability: $$P(\text{click} \mid \text{channel}) = Q \cdot \bigl[\, V + r \cdot (1 - V) \,\bigr]$$ where $Q$ is the intrinsic ad quality (the click rate when the ad is actually viewed), $V$ is the viewability of the channel, and $r$ is the false-click factor.

Setup

We consider three random variables and one latent event:

The quantity we often observe in logs is $P(\text{click} \mid \text{channel})$, but what we really care about is the true click rate conditioned on the ad being viewed: $P(\text{click} \mid \text{channel}, \text{view})$.

Derivation

Start from the joint probability and marginalize over the latent view event:

$$P(\text{click}, \text{channel}) = P(\text{click},\text{channel}, \text{view}) + P(\text{click},\text{channel}, \neg\text{view})$$

Applying the product rule, $P(A, B) = P(A \mid B)\, P(B)$, to each term:

$$P(\text{click}, \text{channel}) = P(\text{click} \mid \text{channel}, \text{view})\, P(\text{channel}, \text{view}) + P(\text{click} \mid \text{channel}, \neg\text{view})\, P(\text{channel}, \neg\text{view})$$

Dividing both sides by $P(\text{channel})$:

$$P(\text{click} \mid \text{channel}) = P(\text{click} \mid \text{channel}, \text{view})\, P(\text{view} \mid \text{channel}) + P(\text{click} \mid \text{channel}, \neg\text{view})\, P(\neg\text{view} \mid \text{channel})$$
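The decomposition above can be checked numerically. The following is a minimal sketch, assuming a single channel and made-up probabilities (`p_view`, `p_click_view`, and `p_click_noview` are illustrative names and values, not measurements). It simulates displayed ads, then compares the observed CTR with the sum of the two view-conditioned terms.

```python
import random


def simulate(n, p_view, p_click_view, p_click_noview, seed=0):
    """Simulate n displayed ads in one channel and count outcomes.

    p_view          ~ P(view | channel)
    p_click_view    ~ P(click | channel, view)
    p_click_noview  ~ P(click | channel, not view)
    All values passed in below are illustrative assumptions.
    """
    rng = random.Random(seed)
    counts = {"view": 0, "click_view": 0, "click_noview": 0}
    for _ in range(n):
        viewed = rng.random() < p_view
        p_click = p_click_view if viewed else p_click_noview
        clicked = rng.random() < p_click
        if viewed:
            counts["view"] += 1
            counts["click_view"] += clicked
        else:
            counts["click_noview"] += clicked
    return counts


n = 200_000
c = simulate(n, p_view=0.6, p_click_view=0.05, p_click_noview=0.005)
n_view = c["view"]
n_noview = n - n_view

# Left-hand side: the CTR as it would appear in logs.
observed = (c["click_view"] + c["click_noview"]) / n

# Right-hand side: view-conditioned click rates weighted by P(view | channel).
decomposed = (
    (c["click_view"] / n_view) * (n_view / n)
    + (c["click_noview"] / n_noview) * (n_noview / n)
)

print(f"observed CTR:   {observed:.5f}")
print(f"decomposed CTR: {decomposed:.5f}")
```

The two numbers agree up to floating-point error because the identity holds for empirical frequencies as well as for probabilities. Note that the right-hand side can only be computed here because the simulation records the view event, which real logs usually do not.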

Viewability and the false-click factor

We introduce two quantities to simplify the expression:

Viewability score. The probability that a displayed ad is actually viewed in a given channel:

$$V = P(\text{view} \mid \text{channel})$$

False-click factor. A scalar $r \in [0, 1]$ that captures the ratio of click probability when the ad is not viewed versus when it is:

$$P(\text{click} \mid \text{channel}, \neg\text{view}) = r \cdot P(\text{click} \mid \text{channel}, \text{view})$$

When $r = 0$, users never click ads they haven't seen (the ideal case). When $r > 0$, there is some residual click rate from accidental clicks, bot traffic, or misattributed events. In practice, we can expect $r$ to be close to zero.

Substituting these two definitions into the final equation of the derivation, using $P(\neg\text{view} \mid \text{channel}) = 1 - V$, and writing

$$Q = P(\text{click} \mid \text{channel}, \text{view})$$

gives

$$\boxed{P(\text{click} \mid \text{channel}) = Q \cdot \bigl[\, V + r \cdot (1 - V) \,\bigr]}$$

The term in brackets is a viewability correction factor. It tells us what fraction of the true CTR $Q$ is preserved in the observed CTR, as a function of viewability $V$ and the false-click factor $r$.
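The boxed equation can be used in both directions: forward, to predict the observed CTR from $Q$, $V$, and $r$, and backward, to estimate $Q$ from an observed CTR. The sketch below assumes $V$ and $r$ are known or estimated elsewhere. The function names and the numeric inputs are hypothetical, chosen only for illustration.

```python
def correction_factor(v: float, r: float) -> float:
    """Return V + r * (1 - V), the bracketed term in the boxed equation."""
    if not (0.0 <= v <= 1.0 and 0.0 <= r <= 1.0):
        raise ValueError("v and r must both lie in [0, 1]")
    return v + r * (1.0 - v)


def observed_ctr(q: float, v: float, r: float) -> float:
    """Forward direction: P(click | channel) from Q, V and r."""
    return q * correction_factor(v, r)


def viewed_ctr(observed: float, v: float, r: float) -> float:
    """Inverse direction: recover Q from an observed CTR."""
    factor = correction_factor(v, r)
    if factor == 0.0:
        # With V = 0 and r = 0 no clicks can occur, so Q cannot be recovered.
        raise ValueError("Q is not identifiable when V = 0 and r = 0")
    return observed / factor


# Illustrative numbers only, not measurements.
q, v, r = 0.05, 0.6, 0.1
ctr = observed_ctr(q, v, r)
print(f"correction factor: {correction_factor(v, r):.3f}")
print(f"observed CTR:      {ctr:.4f}")
print(f"recovered Q:       {viewed_ctr(ctr, v, r):.4f}")
```

With these inputs the correction factor is $0.6 + 0.1 \cdot 0.4 = 0.64$, so a true CTR of 5% shows up in the logs as 3.2%.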

Interpretation

Visualisation

The plot below shows the correction factor $V + r\,(1 - V)$ as a function of viewability $V$, for several values of the false-click factor $r$.

Figure: Viewability correction factor $V + r(1 - V)$ vs. viewability $V$ for $r \in \{0, 0.05, 0.1, 0.2, 0.5\}$.
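The curves in the figure can be regenerated from the formula alone. The following sketch tabulates the correction factor on a coarse grid of $V$ for the same values of $r$. The grid spacing is an arbitrary choice, and the resulting `curves` dictionary could be passed to any plotting library.

```python
def correction_factor(v, r):
    """Return V + r * (1 - V) for viewability v and false-click factor r."""
    return v + r * (1.0 - v)


r_values = [0.0, 0.05, 0.1, 0.2, 0.5]   # same r values as in the figure
v_grid = [i / 10 for i in range(11)]    # V = 0.0, 0.1, ..., 1.0

# One curve per r: the correction factor evaluated at every V on the grid.
curves = {r: [correction_factor(v, r) for v in v_grid] for r in r_values}

# Print a small table: one row per V, one column per r.
print("V     " + "  ".join(f"r={r:<4}" for r in r_values))
for i, v in enumerate(v_grid):
    row = "  ".join(f"{curves[r][i]:.3f} " for r in r_values)
    print(f"{v:.1f}   {row}")
```

Each curve is a straight line from $r$ at $V = 0$ to $1$ at $V = 1$, so a larger $r$ flattens the dependence of the observed CTR on viewability.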