Bayesian Search

Min Jean Cho

Suppose you have lost a key in your house and want to find it as soon as possible. Let's first demarcate the area to be searched into compartments \(1,2,3,...,K\). Two simple search strategies are grid search (e.g., \(1 \rightarrow 2 \rightarrow 3 \rightarrow ... \rightarrow K\)) and random search (e.g., \(7 \rightarrow 3 \rightarrow 5 \rightarrow ... \rightarrow K\)). But you probably won't use these strategies. Rather, you would prefer to use clues to guide your search. If you know that you stayed in your room for a long time but did not enter the living room before you lost the key, you will skip the living room and search your room thoroughly. This method is called Bayesian search in that it uses 'prior information.'

Note that grid search and random search are essentially the same method, because any particular sequence of compartments is just one of the many possible random sequences. In real situations, the search problem becomes more complex because finding the object in the compartment where it is actually located is not guaranteed, resulting in false negative search results. For example, if the key is in a very disordered room, the likelihood of finding it there could be very low. Thus, in real situations, we need to re-visit compartments we have already searched until we find the object. Using prior probabilities and likelihood values (the true positive, false negative, true negative, and false positive rates), the Bayesian search method calculates the posterior probability of finding the lost object in each compartment and tells us the most probable compartment \(j\) when we have searched for but not found the object in compartment \(i\).

The Bayesian search method has been used for many purposes. A well-known example is the search for the US submarine USS Scorpion, which was lost and found in 1968. Beyond searching for lost objects, the Bayesian search method can be applied in various areas where pinpointing a target through trial and error is required, such as hyperparameter optimization for deep learning models. I will explain the basic algorithm of the Bayesian search method using the submarine example. First, construct a grid over the whole area to be searched and assign an index to each compartment.

Prior

The existence of the submarine is represented as a Bernoulli random variable \(Y\) (1 for existence).

\[Y \sim \text{Bernoulli}(\pi)\]

\[P(Y=y)=\pi^{y}(1-\pi)^{1-y}, \text{ where } y \in \{0,1\}\]

Thus, the probability that the submarine is in compartment \(i\) is as follows:

\[\pi_i = P(Y_i = 1)\]

We will use \(\pi_i\) as our prior on compartment \(i\). We can set the value of \(\pi_i\) by collecting prior information such as the submarine's expected route or the experience of officers and crew members of other submarines.
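As a concrete sketch (the grid size and the prior weights below are hypothetical, not taken from the actual Scorpion search), the priors can be stored as a normalized array:

```python
import numpy as np

K = 9  # hypothetical number of compartments in the search grid

# Hypothetical prior weights, e.g., larger along the submarine's expected route.
weights = np.array([1, 2, 4, 2, 8, 4, 1, 2, 1], dtype=float)
prior = weights / weights.sum()  # normalize so that sum_i pi_i = 1
print(prior, prior.sum())        # pi_i for each compartment; the sum is 1.0
```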

Likelihood

The event that we find the submarine is also a Bernoulli random variable, denoted by \(Z\) (1 for detection). As mentioned earlier, even if we search compartment \(i\) where the submarine is located, we may fail to detect it because of technical difficulties such as the detector's limits, the complexity of the underwater topography, or weather conditions. Let's first define the probability of finding the submarine in compartment \(i\) given that the submarine is located there.

\[p_i = P(Z_i=1|Y_i=1)\]

Note that \(p\) is a likelihood (the true positive rate) because \(Z\) represents the detection result (the data we observe) and \(Y\) represents the unknown variable whose true value we want to find. \(p\) can be estimated by taking into account the technical difficulties mentioned above. Once \(p\) (the true positive rate) is determined for each compartment, we can easily determine the other likelihood values (the false negative, true negative, and false positive rates). The probability that we miss the submarine in compartment \(i\) where it is located is simply as follows:

\[P(Z_i=0|Y_i=1) = 1-p_i\]

Also, if the submarine is not in compartment \(i\), we never find the submarine in compartment \(i\). That is,

\[P(Z_i=1|Y_i=0)=0\]

\[P(Z_i=0|Y_i=0)=1\]
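Continuing the sketch above (the detection probabilities are again hypothetical), the four likelihood values reduce to a single array of true positive rates \(p_i\):

```python
import numpy as np

# Hypothetical detection probabilities p_i = P(Z_i=1 | Y_i=1); lower values
# for compartments with rough underwater topography or bad weather.
p = np.array([0.9, 0.8, 0.5, 0.8, 0.3, 0.5, 0.9, 0.8, 0.9])

# The other three rates are fully determined by p:
#   P(Z_i=0 | Y_i=1) = 1 - p[i]   (false negative rate)
#   P(Z_i=1 | Y_i=0) = 0          (false positive rate)
#   P(Z_i=0 | Y_i=0) = 1          (true negative rate)
```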

Posterior

We have so far defined the priors and likelihoods. Now, let's think about the posterior. Suppose that we have chosen the first compartment to search using some procedure. If we find the submarine in the first compartment we visit, the search operation is complete right away. If not, we need to determine the next compartment to search. This means that we need to calculate the posterior probability that the submarine is located in compartment \(j\) given that we have not detected the submarine in compartment \(i\).

\[P(Y_j=1|Z_i=0)\]

Note that \(j\) can be \(i\), which means we need to search compartment \(i\) again. Let’s calculate the posterior for the case where \(j=i\).

\[\begin{aligned} P(Y_i=1|Z_i=0) &= \frac{P(Z_i=0|Y_i=1)P(Y_i=1)}{P(Z_i=0)} \\ &= \frac{P(Z_i=0|Y_i=1)P(Y_i=1)}{P(Z_i=0|Y_i=1)P(Y_i=1) + P(Z_i=0|Y_i=0)P(Y_i=0)} \\ &= \frac{(1-p_i)\pi_i}{(1-p_i)\pi_i + 1 \cdot (1-\pi_i)} \\ &= \frac{\pi_i(1-p_i)}{\pi_i - p_i\pi_i + 1 - \pi_i} \\ &= \frac{\pi_i(1-p_i)}{1-p_i\pi_i} \end{aligned}\]

Recall that Bayesian inference is an update of the prior. Thus, we can rewrite the posterior as follows:

\[\pi_i^{\text{new}} = \frac{ \pi_i^{\text{old}}(1-p_i) }{ 1- p_i \pi_i^{\text{old}}}\]
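For example, if \(\pi_i = 0.4\) and \(p_i = 0.8\), a failed search in compartment \(i\) sharply lowers its probability:

\[\pi_i^{\text{new}} = \frac{0.4 \times (1-0.8)}{1 - 0.8 \times 0.4} = \frac{0.08}{0.68} \approx 0.118\]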

Now think of the case where \(j \ne i\).

\[\begin{aligned} P(Y_{j \ne i}=1|Z_i=0) &= \frac{P(Z_i=0|Y_{j \ne i}=1)P(Y_{j \ne i} = 1)}{P(Z_i=0)} \\ &= \frac{P(Z_i=0|Y_{j \ne i}=1)P(Y_{j \ne i} = 1)}{P(Z_i=0|Y_{j \ne i}=1)P(Y_{j \ne i} = 1) + P(Z_i=0|Y_{j \ne i}=0)P(Y_{j \ne i} = 0)} \end{aligned}\]

It is obvious that we will never find the submarine in compartment \(i\) if the submarine is located in another compartment \(j \ne i\), that is,

\[P(Z_i=0|Y_{j \ne i}=1) =1\]

\[P(Z_i=1|Y_{j \ne i}=1) =0\]

Although we don’t know the value of \(P(Z_i=0|Y_{j\ne i}=0)\), the marginal likelihood \(P(Z_i=0)\) can be easily calculated as follows.

\[\begin{aligned} P(Z_i=0) &= P(Z_i=0|Y_j=0)P(Y_j=0) + P(Z_i=0|Y_j=1)P(Y_j=1) \\ &= P(Z_i=0|Y_j=0)P(Y_j=0) + P(Y_j=1) \text{ } \because P(Z_i=0|Y_{j \ne i}=1)=1 \\ &= \left[1-P(Z_i=1|Y_j=0)\right]P(Y_j=0)+P(Y_j=1)\\ &= P(Y_j=0) - P(Z_i=1|Y_j=0)P(Y_j=0) + P(Y_j=1) \\ &= 1 - P(Y_j=1) - P(Z_i=1|Y_j=0)P(Y_j=0) + P(Y_j=1) \\ &= 1 - P(Z_i=1|Y_j=0)P(Y_j=0) \\ &= 1 - P(Z_i=1,Y_j=0) \\ &= 1 - P(Y_j=0|Z_i=1)P(Z_i=1) \\ &= 1 - P(Z_i=1) \text{ } \because P(Y_{j \ne i}=0|Z_i=1)=1 \\ &= 1 - \left[ P(Z_i=1,Y_i=0) + P(Z_i=1,Y_i=1) \right] \\ &= 1 - \left[ P(Z_i=1|Y_i=0)P(Y_i=0) + P(Z_i=1|Y_i=1)P(Y_i=1) \right] \\ &= 1 - P(Z_i=1|Y_i=1)P(Y_i=1) \text{ } \because P(Z_i=1|Y_i=0)=0 \\ &= 1 - p_i\pi_i \end{aligned}\]
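A shorter route to the same result is to note that a detection in compartment \(i\) can only happen together with \(Y_i=1\), since there are no false positives:

\[P(Z_i=0) = 1 - P(Z_i=1) = 1 - P(Z_i=1|Y_i=1)P(Y_i=1) = 1 - p_i\pi_i\]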

Thus,

\[\begin{aligned} P(Y_{j \ne i}=1|Z_i = 0) &= \frac{P(Z_i=0|Y_{j\ne i}=1)P(Y_{j \ne i}=1)}{P(Z_i=0)} \\ &= \frac{1\cdot \pi_{j\ne i}}{1- p_i\pi_i} \end{aligned}\]

Then, in a Bayesian update form,

\[\pi_{j\ne i}^{\text{new}} = \frac{\pi_{j\ne i}^{\text{old}}}{1-p_i\pi_{i}^{\text{old}}}\]
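Since both update rules share the denominator \(1 - p_i\pi_i\) and \(\sum_j \pi_j^{\text{old}} = 1\), the updated probabilities still sum to one. A minimal sketch of the update step (the arrays here are hypothetical, as above):

```python
import numpy as np

def bayesian_update(pi, p, i):
    """Posterior update after a failed search of compartment i.

    pi : current probabilities pi_j (they sum to 1)
    p  : detection probabilities p_j = P(Z_j=1 | Y_j=1)
    i  : index of the compartment that was searched without success
    """
    denom = 1.0 - p[i] * pi[i]                # marginal likelihood P(Z_i=0)
    new_pi = pi / denom                       # rule for j != i
    new_pi[i] = pi[i] * (1.0 - p[i]) / denom  # rule for j == i
    return new_pi

pi = np.array([0.4, 0.35, 0.25])
p = np.array([0.8, 0.6, 0.7])
pi = bayesian_update(pi, p, i=0)
print(pi, pi.sum())  # the searched compartment shrinks, the others grow; sum is 1.0
```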

In practice, Bayesian search is performed over iterations \(t = 0,1,2,...,T\). At \(t=0\), set the priors \(\pi_i^{(0)}\) such that \(\sum_{i=1}^{K} \pi_i ^{(0)} = 1\), calculate the likelihoods \(p_i = P(Z_i=1|Y_i=1)\) based on the technical difficulties (in more sophisticated algorithms, the likelihoods can vary across iterations), and search the initially chosen compartment \(i\). Then, if that search fails, at \(t=1\) calculate the posteriors and determine the compartment \(v\) to be searched at the first iteration.

\[\pi_i^{(1)} = \frac{\pi_i^{(0)} (1-p_i) }{ 1 - p_i \pi_i^{(0)} }, \quad \pi_{j \ne i}^{(1)} = \frac{\pi_{j \ne i}^{(0)}}{ 1 - p_i \pi_i^{(0)} }\]

\[v^{(1)} = \text{arg max}_i \pi_i^{(1)}\]

If the submarine is found at \(t=1\), the search is done. If not, calculate the posteriors \(\pi^{(t+1)}\) again using the previous posteriors \(\pi^{(t)}\) as priors, where \(i\) denotes the compartment searched at iteration \(t\), and determine the compartment \(v\) to be searched next.

\[\pi_i^{(t+1)} = \frac{\pi_i^{(t)} (1-p_i) }{ 1 - p_i \pi_i^{(t)} }, \quad \pi_{j \ne i}^{(t+1)} = \frac{\pi_{j \ne i}^{(t)}}{ 1 - p_i \pi_i^{(t)} }\]

\[v^{(t+1)} = \text{arg max}_i \pi_i^{(t+1)}\]
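A minimal end-to-end sketch of the iteration (the grid size, true location, and probabilities are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

K = 9
true_loc = 4                      # hypothetical true location of the submarine
pi = np.full(K, 1.0 / K)          # uniform priors pi_i^(0)
p = np.array([0.9, 0.8, 0.5, 0.8, 0.3, 0.5, 0.9, 0.8, 0.9])  # hypothetical p_i

for t in range(1, 101):
    v = int(np.argmax(pi))        # v^(t): search the most probable compartment
    # A search of the true compartment succeeds with probability p[v].
    if v == true_loc and rng.random() < p[v]:
        print(f"found in compartment {v} at iteration {t}")
        break
    # Failed search: Bayesian update of every compartment's probability.
    denom = 1.0 - p[v] * pi[v]    # marginal likelihood P(Z_v = 0)
    pi = pi / denom               # rule for j != v
    pi[v] *= 1.0 - p[v]           # extra factor for the searched compartment
```

Running this sketch, low-\(p\) compartments are revisited several times before the posterior concentrates on them, matching the intuition that a very disordered room must be searched repeatedly.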