Conference Paper

On the Suboptimality of Thompson Sampling in High Dimensions

Thompson Sampling (TS) for combinatorial semi-bandits is suboptimal in the sense that, for some well-chosen examples, its regret scales exponentially in the ambient dimension.

Dec 7, 2021