Are there alternatives to Bayesian optimization?

Conversion rate optimization based on Bayesian statistics came up in 2013, when Chris Stucchio explained in an article how probability distributions can be used to optimize click-through rates (CTR).

We start from an unknown parameter θ that can take values in the interval [0, 1]. This parameter θ represents the true value, that is, the true conversion rate. The goal is to derive a posterior distribution for θ.

The function f(θ) is a probability density, so the probability that θ lies between a and b has the form:

$$P(a \le \theta \le b) = \int_a^b f(\theta)\, d\theta$$

Graphically, the interval [a, b] bounds an area under the graph of the function f(θ).

This makes several things possible:

  • Estimate the number of achievable conversions for a given interval
  • a and b can be chosen so that the area below the graph is 0.95. We can then make statements with a stated level of certainty (e.g. with 95% probability the true value of the conversion rate lies between a and b).
  • Estimate distributions for several alternative courses of action and thus improve the CTR (best-expected-CTR heuristic)
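The second point can be sketched numerically. The snippet below is only an illustration: it assumes that f(θ) is a Beta density, and the parameter values and the helper name `credible_interval` are hypothetical. It draws samples with the standard library's `random.betavariate` and reads off the bounds a and b that cut 2.5% of the area from each tail.

```python
import random

def credible_interval(alpha, beta, mass=0.95, n_samples=100_000, seed=0):
    """Equal-tailed interval [a, b] that holds `mass` of a Beta(alpha, beta) density.

    Estimated by Monte Carlo: sort the samples and take the empirical quantiles.
    """
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(n_samples))
    tail = (1.0 - mass) / 2.0
    a = samples[int(tail * n_samples)]
    b = samples[int((1.0 - tail) * n_samples) - 1]
    return a, b

# Hypothetical density parameters, chosen only for illustration.
a, b = credible_interval(alpha=21.0, beta=981.0)
print(f"95% of the area lies between {a:.4f} and {b:.4f}")
```

Sampling is used here because it needs nothing beyond the standard library. An exact quantile function would give the same bounds without Monte Carlo noise.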

Function design using Bayes' theorem

We now want to determine the true value of θ. For this we need Bayes' theorem:

$$P(\theta \mid E) = \frac{P(E \mid \theta)\, P(\theta)}{P(E)}$$

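Stucchio chooses the binomial distribution to estimate the probability P(E | θ) of the observed event. Since θ is unknown, we can write P(E | θ) as follows, where N is the number of visitors and K the number of conversions:

$$P(E \mid \theta) = \binom{N}{K}\, \theta^{K} (1 - \theta)^{N - K}$$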

How do we now compute P(θ)? This is a matter of choosing an initial estimate, and we are free to choose which type of probability distribution to use. Intuitively, the conversion rate is not uniformly distributed: as a rule, a conversion rate of over 50% in e-commerce is unrealistic. Let us assume that the conversion rate is between 0% and 10%, with higher rates becoming rapidly less likely. The density should therefore peak sharply at a low value and then fall off steeply. For the calculation we therefore use a beta distribution with alpha = 1.1 and beta = 30. Chris chose this form in his original article because it reflects the properties described above well.
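To get a feel for this prior, the following minimal sketch evaluates the Beta(1.1, 30) density with the standard library only. The helper name `beta_pdf` is an assumption made for this example. The mean and mode use the standard closed-form expressions for a beta distribution.

```python
import math

def beta_pdf(theta, alpha, beta):
    """Density of a Beta(alpha, beta) distribution at theta, for 0 < theta < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    log_density = (log_norm
                   + (alpha - 1.0) * math.log(theta)
                   + (beta - 1.0) * math.log(1.0 - theta))
    return math.exp(log_density)

ALPHA, BETA = 1.1, 30.0  # prior parameters from the text

prior_mean = ALPHA / (ALPHA + BETA)                # alpha / (alpha + beta)
prior_mode = (ALPHA - 1.0) / (ALPHA + BETA - 2.0)  # valid for alpha, beta > 1

print(f"mean = {prior_mean:.4f}, mode = {prior_mode:.4f}")
for theta in (prior_mode, 0.05, 0.10, 0.50):
    print(f"f({theta:.4f}) = {beta_pdf(theta, ALPHA, BETA):.6g}")
```

The density is highest at a rate well below 1% and is already negligible at 50%, which matches the shape described above.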

If we now insert the binomial formula for P(E | θ) and our assumed prior distribution for P(θ) into Bayes' theorem, then after dropping all constant factors we obtain the shape of the posterior distribution. The posterior distribution is also a beta distribution, but the alpha and beta values have changed, to exactly these values:

$$\alpha' = \alpha + K, \qquad \beta' = \beta + N - K$$

On the basis of this formula we can now calculate the probability that the conversion rate lies in a given interval (N = number of visitors, K = number of conversions).




What is the probability of the real conversion rate being at least 1%?
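A question like this can be answered numerically. The sketch below assumes the standard beta-binomial update, in which the prior Beta(1.1, 30) becomes Beta(1.1 + K, 30 + N − K) after observing K conversions among N visitors. The traffic figures and the function names are hypothetical. The area under the posterior density is approximated with the midpoint rule.

```python
import math

PRIOR_ALPHA, PRIOR_BETA = 1.1, 30.0  # prior parameters from the text

def posterior_params(n_visitors, n_conversions):
    """Beta-binomial update: returns (alpha + K, beta + N - K)."""
    return PRIOR_ALPHA + n_conversions, PRIOR_BETA + n_visitors - n_conversions

def beta_log_pdf(theta, alpha, beta):
    """Log-density of Beta(alpha, beta) at theta, for 0 < theta < 1."""
    log_norm = math.lgamma(alpha + beta) - math.lgamma(alpha) - math.lgamma(beta)
    return (log_norm
            + (alpha - 1.0) * math.log(theta)
            + (beta - 1.0) * math.log(1.0 - theta))

def interval_probability(lo, hi, alpha, beta, steps=20_000):
    """Approximate P(lo <= theta <= hi) under Beta(alpha, beta) (midpoint rule)."""
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * width  # midpoints never touch 0 or 1
        total += math.exp(beta_log_pdf(theta, alpha, beta))
    return total * width

# Hypothetical data: 1000 visitors, 20 conversions.
alpha_post, beta_post = posterior_params(n_visitors=1000, n_conversions=20)
p_at_least_1pct = interval_probability(0.01, 1.0, alpha_post, beta_post)
print(f"P(conversion rate >= 1%) is about {p_at_least_1pct:.4f}")
```

The same `interval_probability` call works for any interval [a, b]. A library routine for the regularized incomplete beta function would be more precise than this simple integration.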

With the help of such probability values, simulations and estimates can be made about the possible number of conversions. As the number of samples increases, the variance of the distributions decreases and the estimates become more precise. If you compare the estimated CTRs of several alternatives, you create a continuous optimization cycle. Under the best-expected-CTR heuristic, the option with the highest expected CTR is the one that is always selected.
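As a rough sketch of this selection rule, the following code takes the posterior mean of each alternative as its expected CTR and picks the largest. It assumes the Beta(1.1, 30) prior from the text and the usual beta-binomial update. The variant names, the counts, and the function names are hypothetical.

```python
PRIOR_ALPHA, PRIOR_BETA = 1.1, 30.0  # prior parameters from the text

def expected_ctr(n_visitors, n_conversions):
    """Posterior mean of Beta(alpha + K, beta + N - K), i.e. (alpha + K) / (alpha + beta + N)."""
    return (PRIOR_ALPHA + n_conversions) / (PRIOR_ALPHA + PRIOR_BETA + n_visitors)

def pick_best(variants):
    """Return the name of the variant with the highest expected CTR."""
    return max(variants, key=lambda name: expected_ctr(*variants[name]))

# Hypothetical observations: name -> (visitors, conversions).
variants = {
    "A": (1000, 20),
    "B": (1000, 27),
    "C": (500, 9),
}

for name, (visitors, conversions) in variants.items():
    print(f"{name}: expected CTR = {expected_ctr(visitors, conversions):.4f}")
print("selected:", pick_best(variants))
```

Because the prior mean is around 3.5%, a variant with very little data keeps an expected CTR close to that value, so it can still be selected until more observations arrive.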