
Training code for GA3C-CADRL algorithm (collision avoidance with deep RL)

This is the training code for:

Journal Version: M. Everett, Y. Chen, and J. P. How, "Collision Avoidance in Pedestrian-Rich Environments with Deep Reinforcement Learning", in review, Link to Paper

Conference Version: M. Everett, Y. Chen, and J. P. How, "Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning", IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018. Link to Paper, Link to Video

The gym environment code is included as a submodule.
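
If the submodule directory ends up empty (e.g., the repo was cloned without --recursive), the standard Git commands below will fetch it. This is generic Git usage, not a project-specific script:

# Fetch and initialize any missing submodules (only needed if the recursive clone was skipped)
git submodule update --init --recursive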

Install

Grab the code from GitHub, initialize the submodules, and install the dependencies and source code:

# Clone either through SSH or HTTPS (MIT-ACL users should use GitLab origin)
git clone --recursive git@github.com:mit-acl/rl_collision_avoidance.git

cd rl_collision_avoidance
./install.sh

There are some moderately large (tens of MB) checkpoint files containing network weights, stored in this repo as Git LFS files. They should be downloaded automatically by the install script.
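
If the checkpoints arrive as small text pointer files instead of actual weights (typically because Git LFS was not installed at clone time), a manual LFS pull usually fixes it. This is generic git-lfs usage rather than a project-specific step:

# Install the Git LFS hooks and download the real checkpoint contents
git lfs install
git lfs pull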

Train RL (starting with a network initialized through supervised learning on CADRL decisions)

To start a GA3C training run (the rolling reward should be roughly -0.05 to 0.05 at the start):

./train.sh TrainPhase1

To load that checkpoint and continue phase 2 of training, update the LOAD_FROM_WANDB_RUN_ID path in Config.py and do:

./train.sh TrainPhase2

By default, the RL checkpoints are stored in RL_tmp, and files may get overwritten if you train multiple runs. Instead, I like using wandb as a way of recording experiments and saving network parameters. To enable this, set the self.USE_WANDB flag to True in Config.py; checkpoints will then be stored in RL/wandb/run-<datetime>-<id>.
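
With wandb enabled, one simple way to find the value for LOAD_FROM_WANDB_RUN_ID is to list the phase-1 run directories. This is only a sketch based on the directory pattern above; confirm the exact expected format against Config.py itself:

# Each run is saved as RL/wandb/run-<datetime>-<id>; point
# LOAD_FROM_WANDB_RUN_ID in Config.py at the run you want to resume from
ls RL/wandb/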

To run experiments on AWS

Start a handful (e.g., 5) of AWS instances -- I used c5.2xlarge because they have 8 vCPUs and 16 GB RAM (roughly comparable to my desktop). Note: this is just an example and won't work out of the box for you (it has hard-coded paths).

Add the IP addresses into ga3c_cadrl_aws.sh.

./ga3c_cadrl_aws.sh panes
# C-a :setw synchronize-panes -- will let you enter the same command in each instance

Then you can follow the install & train instructions as normal. When training starts, it will prompt you for a wandb login (you can paste in the authorization code from app.wandb.ai/authorize).
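
To avoid pasting the authorization code into every pane, you can authenticate ahead of time. wandb login and WANDB_API_KEY are standard wandb mechanisms, but treat this as a suggested workflow rather than part of the repo's scripts:

# Log in interactively once per instance...
wandb login
# ...or export the API key (from app.wandb.ai/authorize) so training runs unattended
export WANDB_API_KEY=<your-api-key>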

Observed Issues

On OSX, if you see the following when running the ./train.sh script:

objc[39391]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[39391]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.

just set this environment variable: export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES.
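
To avoid setting it in every new terminal, you can persist the variable in your shell profile (the file name depends on your shell; ~/.zshrc is assumed here for recent macOS versions):

# Persist the fork-safety workaround for future shells
echo 'export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES' >> ~/.zshrc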

If you find this code useful, please consider citing:

@inproceedings{Everett18_IROS,
  address = {Madrid, Spain},
  author = {Everett, Michael and Chen, Yu Fan and How, Jonathan P.},
  booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  date-modified = {2018-10-03 06:18:08 -0400},
  month = sep,
  title = {Motion Planning Among Dynamic, Decision-Making Agents with Deep Reinforcement Learning},
  year = {2018},
  url = {https://arxiv.org/pdf/1805.01956.pdf},
  bdsk-url-1 = {https://arxiv.org/pdf/1805.01956.pdf}
}

More Repositories

1. faster - 3D Trajectory Planner in Unknown Environments (C++, 949 stars)
2. cadrl_ros - ROS package for dynamic obstacle avoidance for ground robots trained with deep RL (Python, 572 stars)
3. mader - Trajectory Planner in Multi-Agent and Dynamic Environments (C++, 479 stars)
4. gym-collision-avoidance (OpenEdge ABL, 246 stars)
5. clipper - graph-theoretic framework for robust pairwise data association (C++, 219 stars)
6. panther - Perception-Aware Trajectory Planner in Dynamic Environments (C++, 187 stars)
7. dpgo - Distributed Pose Graph Optimization (C++, 181 stars)
8. mppi_numba - A GPU implementation of Model Predictive Path Integral (MPPI) control that uses a probabilistic traversability model for planning risk-aware trajectories (Jupyter Notebook, 179 stars)
9. rmader - Decentralized Multiagent Trajectory Planner Robust to Communication Delay (C++, 72 stars)
10. minvo - Simplexes with Minimum Volume Enclosing Polynomial Curves (MATLAB, 71 stars)
11. aclswarm - MIT ACL distributed formation flying using multirotors (C++, 67 stars)
12. nfl_veripy - Formal Verification of Neural Feedback Loops (NFLs) (Python, 63 stars)
13. dpgo_ros - ROS wrapper for distributed pose graph optimization (C++, 59 stars)
14. deep_panther (C++, 51 stars)
15. clear - CLEAR algorithm for multi-view data association (MATLAB, 35 stars)
16. planning - List of planning algorithms developed at MIT-ACL (34 stars)
17. puma - PUMA: Fully Decentralized Uncertainty-aware Multiagent Trajectory Planner with Real-time Image Segmentation-based Frame Alignment (C++, 27 stars)
18. fastsam_ros - ROS wrapper for FastSAM, with docker (Python, 17 stars)
19. separator - Linear separability (via planes) of two sets of 3D points (C++, 12 stars)
20. dc2g - Planning Beyond the Sensing Horizon Using a Learned Context (Python, 10 stars)
21. gym-minigrid (Python, 10 stars)
22. SOS-Match (JavaScript, 10 stars)
23. yolov7_ros - ROS wrapper for YOLOv7, with docker (Python, 9 stars)
24. dc2g_public - Deep Cost-to-Go Planning Algorithm (IROS '19) (9 stars)
25. iscp_path_planner - Iterative sequential convex programming path planner, from Steven and Mark's ICRA 2015 paper (Python, 4 stars)
26. panther_extra (Python, 1 star)
27. murp-datasets (Jupyter Notebook, 1 star)
28. motlee - Multiple Object Tracking with Localization Error Elimination (Python, 1 star)
29. mit-acl.github.io (SCSS, 1 star)