AI Glossary: What Is Thompson Sampling (TS)? Definition & Meaning

Thompson Sampling is a statistical technique used in machine learning and decision-making under uncertainty. It is particularly useful in situations where an individual or algorithm must choose between multiple options, each with unknown rewards. The core idea behind Thompson Sampling is to model the uncertainty about each option's rewards and to make decisions based on those models.

The technique is based on the principle of Bayesian inference. It represents what is currently known about each option's reward with a probability distribution, often a Beta distribution in the case of binary outcomes. At each decision point, Thompson Sampling draws one sample from each option's distribution as an estimate of that option's expected reward. The option with the highest sampled value is then chosen.
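The selection step for binary outcomes can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name choose_arm, the success and failure counters, and the uniform Beta(1, 1) starting distribution are all assumptions made for the example.

```python
import random


def choose_arm(successes, failures, rng=random):
    """Return the index of the option to try next.

    successes[i] and failures[i] are the counts observed so far for option i.
    Assumption: each option starts from a uniform Beta(1, 1) distribution.
    """
    samples = [
        rng.betavariate(1 + s, 1 + f)  # one plausible success rate per option
        for s, f in zip(successes, failures)
    ]
    # Choose the option whose sampled success rate is highest.
    return max(range(len(samples)), key=samples.__getitem__)


if __name__ == "__main__":
    # Option 1 has a strong track record, so it is chosen most of the time,
    # but the less-tested options can still win an occasional draw.
    print(choose_arm([2, 40, 1], [8, 10, 1]))
```

Because the choice comes from random draws rather than from the average alone, an option with little data has a wide distribution and can occasionally produce the highest sample, which is what lets it be tried again.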

This method effectively balances two strategies: exploration (trying out less certain options to gather more information) and exploitation (selecting the option that currently seems the best based on available information). By continuously updating the probability distributions as new data is collected, Thompson Sampling can adaptively improve its decision-making over time.
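The update loop can be illustrated with a small simulation. The sketch below assumes binary rewards, a uniform Beta(1, 1) starting distribution for every option, and made-up payout rates; the function name run_thompson and its parameters are hypothetical and chosen only for this example.

```python
import random


def run_thompson(true_rates, rounds, seed=0):
    """Simulate Thompson Sampling on options with binary rewards.

    true_rates are hidden success probabilities (illustrative values only).
    Returns the success and failure counts collected for each option.
    """
    rng = random.Random(seed)
    n = len(true_rates)
    successes = [0] * n
    failures = [0] * n
    for _ in range(rounds):
        # Draw one sample per option from Beta(1 + successes, 1 + failures).
        samples = [
            rng.betavariate(1 + successes[i], 1 + failures[i]) for i in range(n)
        ]
        arm = max(range(n), key=samples.__getitem__)
        # Observe a simulated reward and update that option's counts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


if __name__ == "__main__":
    s, f = run_thompson([0.2, 0.5, 0.8], rounds=2000)
    pulls = [a + b for a, b in zip(s, f)]
    print("times each option was chosen:", pulls)
```

Early rounds spread choices across all options (exploration). As the counts grow, the distributions narrow and the option with the highest true rate is chosen in most rounds (exploitation).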

Thompson Sampling is widely used in various applications, including online advertising, clinical trials, and recommender systems. Its efficiency and effectiveness have made it a popular choice for solving multi-armed bandit problems, a scenario where a gambler must choose from multiple slot machines with unknown payout rates.

Overall, Thompson Sampling is a powerful tool for optimizing decisions in uncertain environments. It enables better long-term results by carefully balancing the need to explore new options against the benefit of exploiting known rewards.