• Stars
    star
    354
  • Rank 119,297 (Top 3 %)
  • Language
    Python
  • Created over 6 years ago
  • Updated over 6 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Play with the solutions to the multi-armed-bandit problem.

multi-armed-bandit

This repo is set up for a blog post I wrote on "The Multi-Armed Bandit Problem and Its Solutions".


The result of a small experiment on solving a Bernoulli bandit with K = 10 slot machines, each with a randomly initialized reward probability.

Alt text

  • (Left) The plot of time step vs the cumulative regrets.
  • (Middle) The plot of true reward probability vs estimated probability.
  • (Right) The fraction of each action is picked during the 5000-step run.