• Stars
    star
    4
  • Rank 3,304,323 (Top 66 %)
  • Language
    MATLAB
  • Created almost 8 years ago
  • Updated about 7 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

These implementatios shows Convergence and performance of policy and value iteration algorithms, how the convergence of these algorithms to the optimal value function depends on the number of iterations used. Furthermore, I have implemented on-policy SARSA and off-policy Q-learning algorithms and showed how the performance of these algorithms depends on the exploration-exploitation tradeoff, and on learning rates. My experiments were evaluted on benchmark reinforcement learning tasks such as a smallworld, gridworld and a cliffworld MDP to analyze the performance of our algorithms.