Repository Details
Python code for a basic RL solution to the non-stationary k-armed bandit problem, in which the action-value function changes over time. Based on the book "Reinforcement Learning: An Introduction" by Richard S. Sutton and Andrew G. Barto.
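The repository's own code is not reproduced here, so the following is only a minimal sketch of one common way to handle a non-stationary bandit: epsilon-greedy action selection with a constant step-size update, so that recent rewards weigh more than old ones. The function names (`update_estimate`, `run_bandit`) and all parameter values (`k`, `steps`, `epsilon`, `alpha`, `walk_std`) are illustrative assumptions, not taken from the repository.

```python
import random


def update_estimate(q, reward, alpha):
    """Constant step-size update: move the estimate a fraction alpha toward the reward."""
    return q + alpha * (reward - q)


def run_bandit(k=10, steps=2000, epsilon=0.1, alpha=0.1, walk_std=0.01, seed=0):
    """Run epsilon-greedy on a k-armed bandit whose true values drift as random walks.

    All default values are assumptions chosen for illustration.
    Returns (estimated values, true values, average reward per step).
    """
    rng = random.Random(seed)
    q_true = [0.0] * k  # true action values, all starting equal
    q_est = [0.0] * k   # the agent's current estimates
    total_reward = 0.0

    for _ in range(steps):
        # Explore with probability epsilon, otherwise pick the greedy action.
        if rng.random() < epsilon:
            action = rng.randrange(k)
        else:
            action = max(range(k), key=lambda a: q_est[a])

        # Reward is the true value of the chosen arm plus unit-variance noise.
        reward = rng.gauss(q_true[action], 1.0)
        q_est[action] = update_estimate(q_est[action], reward, alpha)
        total_reward += reward

        # Non-stationarity: every true value takes a small random-walk step.
        for a in range(k):
            q_true[a] += rng.gauss(0.0, walk_std)

    return q_est, q_true, total_reward / steps


if __name__ == "__main__":
    estimates, true_values, avg_reward = run_bandit()
    print("average reward per step:", avg_reward)
```

A constant `alpha` is used instead of a sample average (step size 1/n) because a sample average weights all past rewards equally and so tracks drifting values poorly, whereas a constant step size discounts older rewards exponentially.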