


kuka-reach-drl

Train a KUKA robot to reach a point with deep RL in PyBullet.

  • NOTE: The main branch is trained with spinup, which has some issues when a GPU and multi-core CPUs are used at the same time, so the main branch will be deprecated in the future. The rllib branch is trained with ray/rllib and will be the primary branch going forward.
  • The main branch will not be updated for a while; the rllib branch is the newest.
Figures (MLP): the training process, the evaluation process, and the training plot.
Figures (CNN): the training process, the evaluation process, and the training plot.

Installation guide (currently supports only Linux and macOS)

I strongly recommend using Conda to install the environment, because you may encounter an mpi4py error with pip.

The spinningup RL library is required. First, install Miniconda or Anaconda. Second, install some development dependencies.

sudo apt-get update && sudo apt-get install libopenmpi-dev
sudo apt install libgl1-mesa-glx

Third, create a conda virtual environment.

conda create -n spinningup python=3.6   #python 3.6 is recommended
#activate the env
conda activate spinningup

Then install spinningup, which contains almost all of the dependencies.

# clone my version, I made some changes.
git clone https://github.com/borninfreedom/spinningup.git
cd spinningup
pip install -e .

Last, install torch and torchvision.

If you have a GPU, run the following. Conda installs matching versions of cudatoolkit and cudnn inside the virtual environment, so it does not matter which versions are installed on your machine.

# CUDA 10.1
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

If you only have a CPU, run this:

# CPU Only
conda install pytorch==1.4.0 torchvision==0.5.0 cpuonly -c pytorch
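
After either install command, a quick check along the following lines can confirm that the packages are importable and whether CUDA is visible from the environment. This is a minimal sketch, not part of the repository; the helper name check_torch_install is hypothetical.

```python
import importlib.util


def check_torch_install():
    """Report whether torch and torchvision are importable and whether CUDA is usable.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    report = {"torch": None, "torchvision": None, "cuda_available": False}
    if importlib.util.find_spec("torch") is not None:
        import torch
        report["torch"] = torch.__version__
        report["cuda_available"] = bool(torch.cuda.is_available())
    if importlib.util.find_spec("torchvision") is not None:
        import torchvision
        report["torchvision"] = torchvision.__version__
    return report


if __name__ == "__main__":
    print(check_torch_install())
```

With the GPU install, cuda_available should be True on a machine with a working NVIDIA driver; with the CPU-only install it is expected to be False.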

Alternative installation method

Alternatively, you can create the virtual environment directly with

conda create --name spinningup --file requirements.txt

but I cannot guarantee that this method will succeed.

Run instruction

If you want to train the KUKA with the coordinate env, whose policy input is the coordinates of the target position and whose actor-critic framework is based on an MLP, run

python train_with_mlp.py --is_render  --is_good_view  --cpu 5 --epochs 100

If you don't want to view the scene and only want to train, run

python train_with_mlp.py  --cpu 5 --epochs 100

If you want to train the KUKA with image input and a CNN model, run

python train_with_cnn.py --is_render  --is_good_view  --cpu 5 --epochs 500

If you don't want to view the scene and only want to train, run

python train_with_cnn.py  --cpu 5 --epochs 500

If you want to train the KUKA with image input and an LSTM model, run

python train_with_lstm.py --is_render  --is_good_view  --cpu 5 --epochs 500

If you don't want to view the scene and only want to train, run

python train_with_lstm.py --cpu 5 --epochs 500
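
The commands above share the same four flags. A minimal sketch of how such flags could be declared with argparse follows; the defaults and the help texts are assumptions made for illustration, and the actual training scripts may define them differently.

```python
import argparse


def build_parser():
    """Build a parser for the four flags used in the commands above.

    Defaults and help strings are assumptions, not taken from the repository.
    """
    parser = argparse.ArgumentParser(description="Hypothetical training CLI sketch")
    parser.add_argument("--is_render", action="store_true",
                        help="show the simulation while training (assumed meaning)")
    parser.add_argument("--is_good_view", action="store_true",
                        help="use a nicer camera view when rendering (assumed meaning)")
    parser.add_argument("--cpu", type=int, default=1,
                        help="number of parallel processes (assumed default)")
    parser.add_argument("--epochs", type=int, default=100,
                        help="number of training epochs (assumed default)")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--cpu", "5", "--epochs", "100"])
print(args.is_render, args.cpu, args.epochs)  # prints: False 5 100
```

Because --is_render and --is_good_view are boolean switches, omitting them is what turns the viewer off in the headless commands.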

Files guide

train.py is the main training file. You can run it directly or from a terminal with python train.py --cpu 6. Pay attention to the parameters.

eval.py evaluates a trained model. The model is saved in the logs directory as model.pt. In eval.py, PyBullet rendering is enabled by default. To evaluate my trained model, change ac=torch.load("logs/ppo-kuka-reach/ppo-kuka-reach_s0/pyt_save/model.pt") to ac=torch.load("saved_model/model.pt") in eval.py.
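
Instead of editing the path by hand each time, the choice between the two checkpoints could be made with a small helper. This is only a sketch: pick_model_path is a hypothetical function that does not exist in eval.py, and the torch.load call is left as a comment so the example runs without torch.

```python
from pathlib import Path

# Both paths are quoted from the description of eval.py above.
TRAINED_MODEL = Path("logs/ppo-kuka-reach/ppo-kuka-reach_s0/pyt_save/model.pt")
PROVIDED_MODEL = Path("saved_model/model.pt")


def pick_model_path(use_provided_model: bool) -> Path:
    """Return the checkpoint path to evaluate (hypothetical helper)."""
    return PROVIDED_MODEL if use_provided_model else TRAINED_MODEL


model_path = pick_model_path(use_provided_model=True)
if model_path.exists():
    # In eval.py, this is where the checkpoint would be loaded, e.g.:
    # ac = torch.load(str(model_path))
    print(f"Would load {model_path}")
else:
    print(f"{model_path} not found; run from the repository root or train first")
```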

The ppo directory contains the main PPO algorithm code.

The env directory contains the main PyBullet environment.

View the training results by plotting

python -m spinup.run plot ./logs

For more details, see "Plotting Results" in the Spinning Up documentation.

Resources about deep RL reaching and grasping

Articles

Source codes

Machine learning and reinforcement learning knowledge

Robotics knowledge

Python Knowledge

import logging
import os
import time

# Make sure the log directory exists; otherwise basicConfig cannot open the file.
os.makedirs('./logs', exist_ok=True)

LOG_FORMAT = '%(asctime)s - %(threadName)s - %(pathname)s[line:%(lineno)d] - %(levelname)s: %(message)s'

# Send records of level INFO and above to a timestamped log file.
logging.basicConfig(
    level=logging.INFO,
    format=LOG_FORMAT,
    filename='./logs/client1-{}.log'.format(time.strftime("%Y_%m_%d_%H_%M_%S", time.localtime())),
    filemode='w')
logger = logging.getLogger(__name__)

# Also echo the same records to the console.
stream_handler = logging.StreamHandler()
stream_handler.setLevel(logging.INFO)
stream_handler.setFormatter(logging.Formatter(LOG_FORMAT))
logger.addHandler(stream_handler)

# Usage in the code:
# logger.info("message")
# logger.debug("message")  # not shown unless the level is lowered to DEBUG

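Blogs about deep RL written by me

  1. Ubuntu Assistant: one-click automatic software installation and one-click system configuration
  2. Deep Reinforcement Learning Column, 1: Current state of research
  3. Deep Reinforcement Learning Column, 2: Implementing DQN from scratch for CartPole control
  4. Deep Reinforcement Learning Column, 3: Implementing a first-order inverted pendulum
  5. Deep Reinforcement Learning Column, 4: Distributed computing with Ray
  6. Deep Reinforcement Learning Column, 5: Tuning RL algorithm hyperparameters with Ray's Tune component
  7. Deep Reinforcement Learning Column, 6: Reinforcement learning training with RLlib and Ray
  8. Deep Reinforcement Learning Column, 7: Robot arm reaching a point, PPO implementation (Part 1)
  9. Deep Reinforcement Learning Column, 8: Robot arm reaching a point, PPO implementation (Part 2)
  10. Deep Reinforcement Learning Column, 9: Robot arm reaching a point, PPO implementation (Part 3)
  11. Deep Reinforcement Learning Column, 10: Robot arm reaching a point, environment implementation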

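Blogs about PyBullet written by me

  1. PyBullet notes: fitting the transformation between the camera and world coordinate frames with deep learning (Part 1)
  2. PyBullet notes: fitting the transformation between the camera and world coordinate frames with deep learning (Part 2)
  3. Summary of PyBullet motor control
  4. Part 1 - Custom Gym environments
  5. Part 1.1 - Registering a custom Gym environment
  6. Part 1.2 - Implementing a tic-tac-toe Gym environment
  7. Part 1.3 - Getting familiar with PyBullet
  8. Part 1.4 - Creating a Gym environment for PyBullet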

Some resources about how to apply RL to real robots

Source codes

Papers

Blogs

Some comments from Facebook groups


VSCode tricks

About python extensions

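Make a.py in folder A able to import b.py in folder B

  • Add the code below at the top of the .py file
import inspect
import os
import sys

# Directory that contains this file.
current_dir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
# Switch to that directory so the relative path below resolves from here.
os.chdir(current_dir)
# Make the parent directory importable, so sibling folders such as B are visible.
sys.path.append('../')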
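
A variant that avoids changing the working directory is to add the absolute path of the parent folder instead. This is a sketch of that alternative, not the approach used above; add_parent_to_sys_path is a hypothetical helper name.

```python
import sys
from pathlib import Path


def add_parent_to_sys_path(file_path):
    """Append the parent of the folder containing file_path to sys.path.

    Hypothetical helper: for A/a.py this adds the folder that holds both A and B.
    """
    project_root = Path(file_path).resolve().parent.parent
    root_str = str(project_root)
    if root_str not in sys.path:  # avoid duplicate entries on repeated calls
        sys.path.append(root_str)
    return project_root


# In A/a.py this would be called as add_parent_to_sys_path(__file__),
# after which "from B import b" can be resolved.
```

Using an absolute path keeps the import working regardless of the directory the script is launched from, and relative file paths elsewhere in the program are not affected by a chdir.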

Add header template in .py files

  • Select File -> Preferences -> User Snippets -> choose the Python file
  • Add the code below
{
	// Place your snippets for python here. Each snippet is defined under a snippet name and has a prefix, body and 
	// description. The prefix is what is used to trigger the snippet and the body will be expanded and inserted. Possible variables are:
	// $1, $2 for tab stops, $0 for the final cursor position, and ${1:label}, ${2:another} for placeholders. Placeholders with the 
	// same ids are connected.
	// Example:
	// "Print to console": {
	// 	"prefix": "log",
	// 	"body": [
	// 		"console.log('$1');",
	// 		"$2"
	// 	],
	// 	"description": "Log output to console"
	// }


	
	"HEADER":{
		"prefix": "header",
		"body": [
		"#!/usr/bin/env python3",
		"# -*- encoding: utf-8 -*-",
		"'''",
		"@File    :   $TM_FILENAME",
		"@Time    :   $CURRENT_YEAR/$CURRENT_MONTH/$CURRENT_DATE $CURRENT_HOUR:$CURRENT_MINUTE:$CURRENT_SECOND",
		"@Author  :   Yan Wen ",
		"@Version :   1.0",
		"@Contact :   [email protected]",
		"@Desc    :   None",
		
		"'''",
		"",
		"# here put the import lib",
		"$1"
		]
	}
}

People in related projects

Details about RL

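  • CNNs used in reinforcement learning usually have no pooling layers. Pooling gives you translation invariance, meaning the network becomes insensitive to where an object is located in the image. That makes sense for classification tasks such as ImageNet, but in games the position is crucial to the potential reward, and we do not want to lose that information.
  • The motivation for experience replay is: (1) a deep neural network, as a supervised learning model, requires the data to be independent and identically distributed; (2) the data collected through reinforcement learning are correlated, and training on them sequentially makes the network unstable, whereas experience replay breaks the correlation between samples.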
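
Both points can be illustrated in a few lines of plain Python. The sketch below first shows a 1-D max pooling that produces the same output for an object at two different positions, then a minimal uniform-sampling replay buffer. The names max_pool_1d and ReplayBuffer are chosen for this example and are not taken from the repository.

```python
import random
from collections import deque


def max_pool_1d(values, window):
    """Non-overlapping max pooling over a 1-D list."""
    return [max(values[i:i + window]) for i in range(0, len(values), window)]


# A single "object" at position 0 or position 1 pools to the same output,
# so the pooled features no longer say where the object was.
left = [1, 0, 0, 0]
shifted = [0, 1, 0, 0]
print(max_pool_1d(left, 2) == max_pool_1d(shifted, 2))  # prints: True


class ReplayBuffer:
    """Fixed-size store of transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        # When the buffer is full, the oldest transitions are dropped first.
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement mixes old and new transitions,
        # which weakens the correlation between consecutive samples.
        return random.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


buffer = ReplayBuffer(capacity=1000)
for step in range(100):
    buffer.add(state=step, action=0, reward=0.0, next_state=step + 1, done=False)
batch = buffer.sample(8)
# The sampled states are typically not consecutive time steps.
print([transition[0] for transition in batch])
```

Experience replay applies to off-policy methods such as DQN; an on-policy method such as PPO trains on freshly collected rollouts instead.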