Auction Ads Delivery as a Constrained Optimization Problem

Honghao Yu
Jun 28, 2020 · 7 min read


In its simplest form, auction ads delivery can be regarded as a classic constrained optimization problem: there is a well-defined objective function (or, under some circumstances, several) and a set of constraints, and our task is to maximize (or minimize) the objective while keeping every constraint satisfied.

We shall first look into the objective function. The ultimate goal of advertising, be it online or offline, is to efficiently connect ads with people who are genuinely interested in the advertised products or services and are willing to take the desired actions, such as downloads or purchases. This can be summarized as user reach with high efficiency, low cost and constant predictability: high efficiency covers both accuracy (ads are shown to those who are highly likely to convert) and timeliness (advertisers obtain the desired results within their time limit); low cost means that the average cost per desired action is reasonable compared to other channels or platforms; predictability means that the real effects of a campaign match our expectations.

Based on the billing types chosen in real-life campaigns, we have defined metrics such as CTR (Click-Through Rate) and CVR (Conversion Rate) to measure the efficiency of ads delivery. Each can be expressed in two forms: predictions (obtained via prediction models prior to actual delivery) and posteriori values (the real results). What we want in the end is to raise the posteriori CTR/CVR as much as possible, since high values indicate efficient matching between ads and users: given a certain pool of users and ads, everyone who was likely to convert did convert, which means the advertisers did not pay for the wrong audience but indeed spent their entire budget on desired actions.
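To make the two forms concrete, here is a minimal sketch in Python that computes posteriori CTR/CVR from delivery outcomes and compares them with the model's predictions; all counts, field names and prediction values are invented for illustration, not taken from any real logging schema:

```python
# Minimal sketch: posteriori CTR/CVR from hypothetical delivery outcomes.
impressions = 10_000   # ad shows delivered
clicks = 320           # clicks observed after delivery
conversions = 48       # desired actions (e.g. installs, purchases)

posteriori_ctr = clicks / impressions   # 0.032
posteriori_cvr = conversions / clicks   # 0.15

# Predictions produced by the models before delivery (averages, assumed values).
predicted_ctr, predicted_cvr = 0.030, 0.17

# The gap between prediction and posteriori value is what we later minimize.
print(f"CTR diff: {abs(predicted_ctr - posteriori_ctr):.4f}")
print(f"CVR diff: {abs(predicted_cvr - posteriori_cvr):.4f}")
```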

These posteriori percentages, however, are merely a summary of the overall matching quality. For an individual user there are only two possibilities: either take the desired action or not. It is no longer a problem of percentage prediction, but rather a problem of classification. We need to estimate the probability of conversion from the behavioral patterns of the group the user belongs to and the characteristics of the ads, and then label the user as likely to convert or not. If we over-estimate the probability, we show the ad to someone who will not convert; if we under-estimate it, we miss someone who could have converted. The former will surely lower the posteriori values, whereas the latter avoids doing so only if the missed show is replaced by one that ultimately converts a user (which would require every single show to lead to a conversion, an unrealistic assumption). The problem to be solved therefore evolves into increasing the accuracy of binary classification, or making fewer mistakes when judging whether a given ad can convert a given user.
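A toy sketch of this framing, assuming a 0.5 decision threshold (the same one used in the ranking discussion below) and invented probabilities:

```python
# Toy binary-classification view of a delivery decision.
# The 0.5 threshold and the pCVR values are illustrative assumptions.
def decide_show(p_conversion: float, threshold: float = 0.5) -> bool:
    """Show the ad only if the predicted conversion probability clears the threshold."""
    return p_conversion >= threshold

# Over-estimation: true propensity is low, model says 0.7 -> wasted show.
# Under-estimation: true propensity is high, model says 0.3 -> missed conversion.
for p_pred, truly_converts in [(0.7, False), (0.3, True)]:
    print(f"pCVR={p_pred}: shown={decide_show(p_pred)}, would convert={truly_converts}")
```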

Nonetheless, the problem does not stop here. Though the classification view provides a basis for our reasoning, the challenge we face in real-world delivery is to precisely match millions of users and a great many ads within the same time slot, with the goal of increasing profits as much as possible. If the traffic on our media were sufficient for every ad to get the exposure it needs without competition, we would only need to make correct classifications; but when traffic is scarce, ads must compete against each other to gain enough exposure, and the problem becomes much more complicated.

Fortunately, the challenge also comes with large room for margins. As in any other market, we introduce a price mechanism and let ads compete for limited traffic through auctions, so that we can exploit the advertisers' willingness to pay. To further demonstrate the idea, we need the term ECPM: the product of autobid (jointly determined by the cpa_bid, the budget and the system's bidding mechanism) and the predicted CTR/CVR. For a given user, we rank the ads by their expected revenue (given by ECPM) and choose the top ones to show. Such a mechanism fosters precise matching between ads and users and encourages advertisers to bid high and/or increase their budgets, which boosts our revenues. Autobid aside, we can no longer be satisfied with accurate classification alone, since accurate CTR/CVR predictions are also needed to obtain a trustworthy ECPM ranking in each auction. For example, predicted CTRs of 0.6 and 0.9 lead to the same classification result (any value above the 0.5 threshold classifies the user as going to click), but probably quite different positions in the ECPM ranking. In this way, inaccurate predictions can have a huge impact on actual delivery. As a result, our goal becomes predicting CTR/CVR as accurately as possible, or in other words, minimizing the difference between predictions and posteriori values, Diff(pCTR/pCVR, CTR/CVR).
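A minimal sketch of the ranking step described above; the ad fields, autobid values and single-slot auction are illustrative assumptions, not the platform's actual bidding logic:

```python
# Minimal ECPM-style auction sketch; numbers and field names are illustrative.
from dataclasses import dataclass

@dataclass
class AdCandidate:
    ad_id: str
    autobid: float   # effective bid derived from cpa_bid, budget, bidding mechanism
    p_action: float  # predicted CTR or CVR for this user

    @property
    def ecpm(self) -> float:
        # Expected revenue of one show: willingness to pay times action probability.
        return self.autobid * self.p_action

def run_auction(candidates: list[AdCandidate], slots: int = 1) -> list[AdCandidate]:
    """Rank candidates by ECPM and return the top `slots` ads to show."""
    return sorted(candidates, key=lambda ad: ad.ecpm, reverse=True)[:slots]

ads = [
    AdCandidate("a", autobid=2.0, p_action=0.6),  # ecpm = 1.20
    AdCandidate("b", autobid=1.0, p_action=0.9),  # ecpm = 0.90
]
# Both clear a 0.5 classification threshold, yet ad "a" wins the ranking.
print([ad.ad_id for ad in run_auction(ads)])
```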

Next come the constraints. We need to maintain a balance among users, the platform and clients in order to monetize ads continuously, so the main constraints come from user experience, client experience and the platform's maintenance costs. Let's start from the platform side. Training deep models on terabytes of data consumes a tremendous quantity of cluster resources, and sometimes a slight increase in prediction accuracy demands substantial extra hardware and electricity. The problem under such circumstances is how to choose a relatively light model that achieves the global optimum given a certain amount of cluster resources.

Looking into the user side, we can easily see that user experience deteriorates mostly through excessive frequency or low quality of ad shows. We need good control over the frequency at which users see ads (the number of posts between two ads), which mainly determines the effective traffic supply. Frequency concerns not only the exposure of different ads but also repeated exposure of the same ad, since the conversion effect of continuously showing the same ad dies down, as the universal law of diminishing marginal utility indicates. We can therefore reasonably assume an upper bound on the total number of shows across all ads, as well as one on the total shows of each individual ad. Putting a quantifiable limit on frequency is straightforward; quality is not. Even a human being would say there are too many aspects to consider when rating the quality of a video ad, let alone machines and algorithms: the system would have to analyze and quantify the image quality, the content, the soundtrack, and so on. The current practice of many platforms is to assign a hidden cost or quality score and include it in the ECPM ranking. After quantifying the quality of each ad, we need to guarantee that the average score, or the lowest acceptable score, stays above some threshold depending on our strategy.
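In code, the two frequency bounds could look like this minimal sketch; the cap values, counter layout and function names are all assumptions for illustration:

```python
# Illustrative frequency-cap check; caps and counters are assumed values.
from collections import Counter

GLOBAL_SHOW_CAP_PER_USER = 20   # bound on total ad shows per user per day
PER_AD_SHOW_CAP = 3             # bound on shows of the same ad per user per day

shows_today: Counter = Counter()         # (user_id, ad_id) -> shows
total_shows_today: Counter = Counter()   # user_id -> total shows

def may_show(user_id: str, ad_id: str) -> bool:
    """True only if both the per-user and the per-ad frequency caps allow one more show."""
    return (total_shows_today[user_id] < GLOBAL_SHOW_CAP_PER_USER
            and shows_today[(user_id, ad_id)] < PER_AD_SHOW_CAP)

def record_show(user_id: str, ad_id: str) -> None:
    shows_today[(user_id, ad_id)] += 1
    total_shows_today[user_id] += 1
```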

The final part is about the client side. As mentioned above, advertisers expect user reach with high efficiency, low cost and constant predictability. The advertisers and the platform are closely aligned on high efficiency and predictability, because these mean a larger pie from which both sides benefit. Lower cost, however, is mainly the advertisers' pursuit and in fact runs against the platform's interests to some extent (the platform pursues a higher final cost, which equals Cost Per Action times the number of desired actions, so the higher the CPA the better). Yet too high a CPA would deter clients from advertising on our platform, which harms our revenues; hence we seek to increase the budget usage ratio by gaining more desired actions while keeping the CPA within a range the clients can tolerate. A formula helps clarify this relationship: revenues of the platform = the real costs of the advertisers = advertisers' budget * budget usage ratio, where the advertisers' budget is negatively related to CPA, the budget usage ratio is positively related to the accuracy of model predictions, and the CPA is negatively related to prediction accuracy and positively related to autobid. Consequently, the focus of the platform is to increase prediction accuracy (the model side) as much as possible while keeping the autobid (the bidding side) relatively stable yet competitive. On the other hand, although both the platform and the advertisers desire predictability, the platform's emphasis lies on whether the budget usage ratio matches expectations, whereas the advertisers care not only about budget usage but also about the rate at which the budget is spent: for many advertisers, the predictability of how the money is spent over the lifespan of a whole campaign is equally crucial.
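A quick numerical sketch of that formula, with all figures invented for illustration:

```python
# Invented numbers illustrating revenue = budget * budget usage ratio.
budget = 10_000.0   # advertiser's daily budget
cpa = 25.0          # realized cost per action
actions = 320       # desired actions delivered

real_cost = cpa * actions                 # 8,000: what the advertiser actually pays
usage_ratio = real_cost / budget          # 0.8
platform_revenue = budget * usage_ratio   # 8,000: identical to the real cost

# Better predictions -> more actions at the same CPA -> higher usage ratio;
# a higher autobid raises CPA, which may shrink tomorrow's budget.
print(platform_revenue, usage_ratio)
```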

The constrained optimization problem described above can be formulated as below:

    minimize    Diff(predictions, results)
    subject to:
        UsedResources ≤ GivenResources
        TotalShows ≤ a1 * NumActiveUsers
        Shows(user) ≤ n1, for every user
        AvgQualityScore ≥ b
        CPA_aid ≤ ExpectedCPA_advid, for every ad group
        Var(BudgetUsageRatio) ≤ sigma

Where Diff(predictions, results) is the difference between predictions and posteriori values; GivenResources is the cluster resources currently available; a1 is the frequency limit; NumActiveUsers is the volume of traffic at a given time; n1 is the frequency limit at the individual user level; b is the lower threshold of the average quality of the ads; CPA_aid is the CPA at the ad group level; ExpectedCPA_advid is the upper threshold of CPA that a given advertiser can tolerate; and sigma is the upper limit of the variance of the budget usage ratio over time.
