Repository Details
In the Pick-up and Drop-off (PD) World, our goal is to find a route for the agent that delivers all the blocks to the drop-off cells in the fewest steps. To solve this reinforcement learning problem, we use Q-learning, which combines a statistical (sampling-based) approach with ideas from dynamic programming, to estimate the utility of taking each action in each state of the world. We run the given system setup in 6 experiments that vary the learning rate and the policy.
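As a rough illustration of the idea, here is a minimal sketch of a tabular Q-learning update together with an epsilon-greedy action choice. It is not taken from the repository: the action names, the state labels, the reward of -1, and the `alpha`, `gamma`, and `epsilon` values are all assumptions made for the example, and the policies used in the actual experiments may differ.

```python
import random
from collections import defaultdict

# Hypothetical operator set; the description above does not list the real one.
ACTIONS = ["north", "south", "east", "west", "pickup", "dropoff"]


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.3, gamma=0.5):
    """Apply one Q-learning step and return the new Q(state, action).

    The estimate is moved toward reward + gamma * max_a' Q(next_state, a').
    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    # Best estimated utility reachable from the next state (0.0 if no actions).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, else a best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    best = max(q[(state, a)] for a in actions)
    # Break ties between equally good actions at random.
    return rng.choice([a for a in actions if q[(state, a)] == best])


# Q-table that returns 0.0 for state-action pairs not seen yet.
q = defaultdict(float)

# One assumed transition: moving east from "s0" to "s1" costs one step.
new_value = q_update(q, "s0", "east", -1, "s1", ACTIONS, alpha=0.5, gamma=0.5)
```

In a full run, a state would typically encode the agent's position and whether it is carrying a block, and changing `alpha` or the action-selection function between runs is one way to set up experiments that differ in learning rate and policy.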