Bayesian Optimization: A Suite of Techniques for Hyperparameter Tuning

In this article, we explore Bayesian Optimization, a suite of techniques for tuning hyperparameters in machine learning algorithms. We also discuss its relation to Active Learning and survey the acquisition functions that drive it.

The Gold Mining Problem

Let’s consider a classic motivating problem: gold mining. We want to find the location with the maximum gold content in an unknown stretch of land, but drilling is expensive, so we cannot drill everywhere. The goal is to find the location of maximum gold in as few drillings as possible.

Active Learning

Active Learning aims to minimize labeling costs while maximizing modeling accuracy. In the gold mining problem, the labels are drilling results, and the task is to estimate the true gold distribution across the whole land. We start from a prior assumption that gold content varies smoothly, then update a surrogate model each time we drill, typically choosing the next drill site where the model is most uncertain.
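
To make this concrete, here is a minimal sketch of such an active-learning loop. It assumes a one-dimensional search space discretized into a grid, scikit-learn's GaussianProcessRegressor as the surrogate, and a hypothetical ground_truth_gold function standing in for the expensive drilling operation; none of these specifics come from the article.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def ground_truth_gold(x):
    # Hypothetical stand-in for the expensive "drilling" operation.
    return np.sin(3 * x) + 0.5 * np.cos(5 * x)

X_grid = np.linspace(0, 2, 200).reshape(-1, 1)   # candidate drill sites
X_seen = np.array([[0.3], [1.5]])                # initial drillings
y_seen = ground_truth_gold(X_seen).ravel()

# Smoothness prior via an RBF kernel; alpha adds jitter for stability.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-6)
for _ in range(10):
    gp.fit(X_seen, y_seen)
    _, std = gp.predict(X_grid, return_std=True)
    x_next = X_grid[np.argmax(std)]              # drill where most uncertain
    X_seen = np.vstack([X_seen, [x_next]])
    y_seen = np.append(y_seen, ground_truth_gold(x_next))
```

Each iteration drills wherever the surrogate's predictive standard deviation is largest, which shrinks uncertainty over the whole map rather than homing in on the maximum.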

Bayesian Optimization

In Bayesian Optimization, the goal changes: we want only the location of maximum gold content, so we no longer need to estimate the true gold distribution accurately everywhere. Instead, an acquisition function determines which point to evaluate next. Acquisition functions are heuristics that trade off exploration (evaluating where the surrogate is uncertain) against exploitation (evaluating where the surrogate predicts high values).
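
The loop above needs only a small change to become Bayesian Optimization: pick the next drill site by maximizing a pluggable acquisition function rather than raw uncertainty. This sketch reuses X_grid, ground_truth_gold, and the surrogate setup from the snippet above, so those same assumptions apply.

```python
def bayes_opt(acquisition, n_iter=10):
    # Generic BO loop: refit the surrogate, score every candidate site
    # with the acquisition function, and drill at the highest scorer.
    X_seen = np.array([[0.3], [1.5]])
    y_seen = ground_truth_gold(X_seen).ravel()
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.3), alpha=1e-6)
    for _ in range(n_iter):
        gp.fit(X_seen, y_seen)
        scores = acquisition(gp, X_grid, y_seen.max())
        x_next = X_grid[np.argmax(scores)]
        X_seen = np.vstack([X_seen, [x_next]])
        y_seen = np.append(y_seen, ground_truth_gold(x_next))
    return X_seen[np.argmax(y_seen)], y_seen.max()
```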

Acquisition Functions

Several acquisition functions are in common use; implementation sketches for most of them appear after the list.

  1. Expected Improvement (EI): selects the point with the largest expected improvement over the best value observed so far, which naturally balances exploration and exploitation.
  2. Probability of Improvement (PI): selects the point most likely to improve on the best value observed so far.
  3. Thompson Sampling: samples a random function from the surrogate posterior and evaluates next at that sample’s maximizer.
  4. Upper Confidence Bound (UCB): scores each point by the surrogate mean plus a multiple of its standard deviation, favoring points that are promising, uncertain, or both.
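
Here are sketches of EI, PI, and UCB using their textbook formulas, written against the same scikit-learn surrogate as above. The exploration parameters xi and beta are conventional defaults, not values from the article, and all three share one signature so they plug into bayes_opt.

```python
from scipy.stats import norm

def expected_improvement(gp, X, y_best, xi=0.01):
    # EI: expected amount by which each point improves on y_best.
    mu, std = gp.predict(X, return_std=True)
    z = (mu - y_best - xi) / np.maximum(std, 1e-9)
    return (mu - y_best - xi) * norm.cdf(z) + std * norm.pdf(z)

def probability_of_improvement(gp, X, y_best, xi=0.01):
    # PI: probability that each point improves on y_best by at least xi.
    mu, std = gp.predict(X, return_std=True)
    return norm.cdf((mu - y_best - xi) / np.maximum(std, 1e-9))

def upper_confidence_bound(gp, X, y_best=None, beta=2.0):
    # UCB: mean plus beta standard deviations; y_best is unused but
    # kept so all acquisitions share one signature.
    mu, std = gp.predict(X, return_std=True)
    return mu + beta * std
```

Any of these can be passed straight to the loop above, e.g. best_x, best_gold = bayes_opt(expected_improvement).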

Thompson Sampling

Thompson Sampling draws a single random function from the surrogate posterior and drills at the point where that sample is largest. This balances exploration and exploitation automatically: posterior samples agree near well-explored, high-value points (encouraging exploitation) but vary widely in uncertain regions (encouraging exploration).
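
A sketch under the same assumptions, using GaussianProcessRegressor.sample_y to draw one posterior sample per iteration (random_state=None so each call draws a fresh sample):

```python
def thompson_sampling(gp, X, y_best=None):
    # Draw one random function from the GP posterior over the grid and
    # use its values as the scores, so argmax optimizes that sample.
    return gp.sample_y(X, n_samples=1, random_state=None).ravel()

best_x, best_gold = bayes_opt(thompson_sampling)
```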

Other Acquisition Functions

We can also combine existing acquisition functions to offset their individual limitations. For example, Probability of Improvement and Expected Improvement can be mixed into a weighted linear combination, with the weight controlling how the two criteria trade off; a sketch follows.
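
A sketch of such a combination, reusing the PI and EI functions defined earlier; the weight alpha is a hypothetical parameter, not a value from the article.

```python
def pi_ei_mix(gp, X, y_best, alpha=0.5):
    # Weighted mix of PI and EI; alpha sets the trade-off between them.
    return (alpha * probability_of_improvement(gp, X, y_best)
            + (1 - alpha) * expected_improvement(gp, X, y_best))
```

Note that PI is a probability in [0, 1] while EI carries the units of the objective, so in practice the two terms may need rescaling before the weight behaves as intended.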

Conclusion

In this article, we explored Bayesian Optimization, a suite of techniques for tuning hyperparameters in machine learning algorithms, and its relation to Active Learning. We surveyed several acquisition functions, including Expected Improvement, Probability of Improvement, Thompson Sampling, and Upper Confidence Bound, each of which balances exploration and exploitation in its own way.

