
Fast Evolution Strategy for Walking Marvin

This is a design doc for my implementation.

Install Guide

Create a virtual environment: python3 -m venv marvin_env
Activate it: source marvin_env/bin/activate
Install the SWIG library: brew install swig
Install the Python dependencies: pip install numpy==1.17.2 gym==0.14.0 Box2D==2.3.2 box2d-py==2.3.8
Copy the gym directory provided in this repo to marvin_env/lib/python3.7/site-packages, replacing the existing one: cp -r gym marvin_env/lib/python3.7/site-packages
Create an environment: import gym; env = gym.make("Marvin-v0")
Other environments should work fine too: env = gym.make("BipedalWalker-v2")

To run the distributed version, you also need Ray: pip install ray psutil

If you encounter an error, contact me. This setup is likely to break in the future as its dependencies change.

Server

The Server synchronizes progress across multiple Clients and distributes work to each of them. It does so by creating a list of Client actors and initializing each one with the model architecture, the random seed used for model initialization, a seed for perturbation generation, and the environment identifier.
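The initialization data described above can be sketched as follows. This is a minimal illustration, not the repository's code: ClientConfig, make_client_configs, master_seed and the layer sizes are hypothetical, and plain objects stand in for the Ray actors so the snippet runs with the standard library only.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ClientConfig:
    """Everything one Client receives at start-up (hypothetical field names)."""
    layer_sizes: Tuple[int, ...]  # model architecture
    model_seed: int               # shared, so every Client builds identical initial weights
    noise_seed: int               # unique, drives this Client's perturbations
    env_id: str                   # environment identifier, e.g. "Marvin-v0"


def make_client_configs(n_clients: int, layer_sizes: Sequence[int],
                        model_seed: int, env_id: str,
                        master_seed: int = 0) -> List[ClientConfig]:
    """Build one config per Client. The Server keeps the list, so it knows every noise seed."""
    rng = random.Random(master_seed)
    noise_seeds = rng.sample(range(2**31), n_clients)  # distinct seeds, reproducible
    return [ClientConfig(tuple(layer_sizes), model_seed, seed, env_id)
            for seed in noise_seeds]


# Illustrative values only: 4 Clients sharing one architecture and one model seed.
configs = make_client_configs(4, (24, 16, 4), model_seed=42, env_id="Marvin-v0")
```

Sharing model_seed while keeping noise_seed distinct means all Clients start from the same weights but explore different perturbations.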

Client

Each Client is initialized with its own random seed, which is known to the Server. When the evaluate method is called, the Client samples a weight perturbation from its seeded generator, evaluates the model with the perturbed weights, and sends only the reward back to the Server.

A Client can run evaluate multiple times, each time adding a perturbation to the same set of weights.
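A toy version of this Client behaviour might look like the sketch below. It is an assumption-laden illustration: the class layout, sigma, and episode_reward (a stand-in for a real environment rollout) are hypothetical, and the standard-library random module replaces whatever generator the actual implementation uses.

```python
import random


def episode_reward(weights):
    """Placeholder for an environment rollout: maximal when every weight equals 1."""
    return -sum((w - 1.0) ** 2 for w in weights)


class Client:
    """Toy stand-in for a Client (hypothetical names, no Ray, no gym)."""

    def __init__(self, weights, noise_seed, sigma=0.1):
        self.weights = list(weights)
        self.sigma = sigma                    # perturbation scale (illustrative value)
        self.rng = random.Random(noise_seed)  # private noise stream, reproducible from the seed

    def evaluate(self):
        """Perturb the current weights, score them, and return only the reward."""
        eps = [self.rng.gauss(0.0, 1.0) for _ in self.weights]
        candidate = [w + self.sigma * e for w, e in zip(self.weights, eps)]
        return episode_reward(candidate)      # the perturbation itself is never sent

    def update(self, new_weights):
        """Replace the base weights with the ones broadcast by the Server."""
        self.weights = list(new_weights)


client = Client(weights=[0.0, 0.0, 0.0], noise_seed=7)
rewards = [client.evaluate() for _ in range(3)]  # three perturbations of the same weights
```

Because the noise stream is fully determined by noise_seed, a single float per evaluation is enough for anyone who knows the seed to recover which perturbation earned it.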

Once the Server has finished distributing evaluations across the Clients, it collects the rewards and reproduces the perturbations that were generated on the client nodes, which is possible because it knows each Client's seed. It then updates the weights according to the Evolution Strategy and broadcasts the new weights to all Clients by calling their update method.
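The collect, reproduce and update cycle can be sketched as below. This is one common form of the Evolution Strategy update (standardised rewards weighted by their noise vectors), assumed here for illustration and not confirmed to be the exact rule used in the repository. SIGMA, ALPHA, the seeds and the toy reward function are all hypothetical, and everything runs in a single process instead of on Ray actors.

```python
import random

SIGMA, ALPHA = 0.1, 0.02  # noise scale and learning rate (illustrative values)


def reward(weights):
    """Placeholder for an episode return: maximal when every weight equals 1."""
    return -sum((w - 1.0) ** 2 for w in weights)


def draw(rng, dim):
    """Draw one noise vector from a seeded stream."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def client_evaluate(weights, rng):
    """What a Client does: perturb, evaluate, report the reward only."""
    eps = draw(rng, len(weights))
    return reward([w + SIGMA * e for w, e in zip(weights, eps)])


def server_step(weights, rewards, mirror_rngs):
    """Rebuild each Client's noise from a mirrored seed stream, then apply the update."""
    n, dim = len(rewards), len(weights)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5 or 1.0  # avoid dividing by zero
    grad = [0.0] * dim
    for r, rng in zip(rewards, mirror_rngs):
        eps = draw(rng, dim)          # same draw the Client made, never sent over the wire
        advantage = (r - mean) / std  # standardised reward
        for j in range(dim):
            grad[j] += advantage * eps[j]
    return [w + ALPHA / (n * SIGMA) * g for w, g in zip(weights, grad)]


seeds = [11, 22, 33, 44, 55, 66, 77, 88]
client_rngs = [random.Random(s) for s in seeds]  # would live on the Clients
mirror_rngs = [random.Random(s) for s in seeds]  # kept by the Server
weights = [0.0, 0.0, 0.0]
start = reward(weights)
for _ in range(200):
    rewards = [client_evaluate(weights, rng) for rng in client_rngs]
    weights = server_step(weights, rewards, mirror_rngs)  # new weights are then broadcast
```

The design point is bandwidth: each round moves one float per evaluation from Client to Server and one weight vector back, while the noise vectors are regenerated locally on both sides.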
