• Stars: 1
• Language: Jupyter Notebook
• Created over 5 years ago
• Updated over 3 years ago

Repository Details

An implementation of temporal-difference methods (Sarsa, Q-learning, and Expected Sarsa) that estimate the action-value function and derive an optimal policy for the Cliff Walking task from OpenAI Gym.
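
As a rough illustration of how the three methods differ, the sketch below is not the repository's notebook code. It assumes the textbook 4x12 Cliff Walking layout and replaces the Gym environment with a hypothetical `step` function; the names `td_target`, `epsilon_greedy_probs`, and `train`, and all hyperparameter values, are likewise assumptions made for this example. The three methods share one update rule and differ only in the bootstrap term of the target.

```python
import random

# Assumed layout: 4x12 grid, start bottom-left, goal bottom-right, cliff between them.
N_ROWS, N_COLS = 4, 12
START, GOAL = (3, 0), (3, 11)
ACTIONS = [(-1, 0), (0, 1), (1, 0), (0, -1)]  # up, right, down, left


def step(state, action):
    """Hypothetical stand-in for the environment: (next_state, reward, done)."""
    row = min(max(state[0] + ACTIONS[action][0], 0), N_ROWS - 1)
    col = min(max(state[1] + ACTIONS[action][1], 0), N_COLS - 1)
    if row == 3 and 0 < col < 11:  # fell off the cliff: back to start, large penalty
        return START, -100, False
    return (row, col), -1, (row, col) == GOAL


def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of an epsilon-greedy policy (ties go to the first max)."""
    probs = [epsilon / len(q_row)] * len(q_row)
    probs[q_row.index(max(q_row))] += 1.0 - epsilon
    return probs


def td_target(method, q_next, reward, next_action, epsilon, gamma):
    """TD target; only the bootstrap term differs between the three methods."""
    if method == "sarsa":             # value of the action actually taken next
        bootstrap = q_next[next_action]
    elif method == "q_learning":      # value of the greedy next action
        bootstrap = max(q_next)
    elif method == "expected_sarsa":  # expectation under the epsilon-greedy policy
        probs = epsilon_greedy_probs(q_next, epsilon)
        bootstrap = sum(p * q for p, q in zip(probs, q_next))
    else:
        raise ValueError(f"unknown method: {method}")
    return reward + gamma * bootstrap


def train(method, episodes=500, alpha=0.5, gamma=1.0, epsilon=0.1, seed=0):
    """Run tabular TD control and return the learned action-value table."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * 4 for r in range(N_ROWS) for c in range(N_COLS)}

    def choose(state):
        weights = epsilon_greedy_probs(Q[state], epsilon)
        return rng.choices(range(4), weights=weights)[0]

    for _ in range(episodes):
        state, done = START, False
        action = choose(state)
        while not done:
            next_state, reward, done = step(state, action)
            next_action = choose(next_state)
            # Q[GOAL] is never updated, so the terminal bootstrap stays at zero.
            target = td_target(method, Q[next_state], reward,
                               next_action, epsilon, gamma)
            Q[state][action] += alpha * (target - Q[state][action])
            state, action = next_state, next_action
    return Q
```

Calling `train("sarsa")`, `train("q_learning")`, or `train("expected_sarsa")` returns a dictionary that maps each grid cell to its four action values; a greedy policy can then be read off by taking the highest-valued action in each cell. For brevity, a single loop that selects the next action before the update is shared by all three methods.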