• Stars
    star
    223
  • Rank 178,458 (Top 4 %)
  • Language
    Python
  • License
    Other
  • Created about 1 year ago
  • Updated 3 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Official implementation for SIGGRAPH 2023 paper "Learning Physically Simulated Tennis Skills from Broadcast Videos"

Learning Physically Simulated Tennis Skills from Broadcast Videos

Haotian zhang, Ye Yuan, Viktor Makoviychuk, Yunrong Guo, Sanja Fidler, Xue Bin Peng, Kayvon Fatahalian

SIGGRAPH 2023 (best paper honorable mention)

Paper | Project | Video

Note: the current release provides the implementation of the hierarchical controller, including the low-level imitation policy, motion embedding and the high-level planning policy, as well as the environment setup in IsaacGym. Unfortunately, the demo can NOT run because the trained models are currently not available due to license issues.

News

[2023/11/28] Training code for the low-level policy is released.

[2023/11/01] Demo code for the hierarchical controller is released.

Environment setup

1. Download IsaacGym and create python virtual env

You can download IsaacGym Preview Release 4 from the official site. Then download Miniconda3 from here. Create a conda virtual env named rlgpu by running create_conda_env_rlgpu.sh from IsaacGym, either python3.7 or python3.8 works. Note you might need to run the following command or add it to your .bashrc if you encounter the error ImportError: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory when running IsaacGym.

export LD_LIBRARY_PATH=<YOUR CONDA PATH>envs/rlgpu/lib/

2. Install dependencies

Enter the created virtual env and run the install script.

conda activate rlgpu
bash install.sh

To install additional dependencies for the low-level policy, follow the instructions in install_embodied_pose.sh.

3. Install smpl_visualizer for visualizing results

Git clone and then run

bash install.sh

4. Download data/checkpoints

Download data into vid2player3d/data.

Download checkpoints of motion embedding and trained polices into vid2player3d/results (currently unavailable).

Download SMPL by first registering here and then download the models (male and female models) into smpl_visualizer/data/smpl and rename the files as SMPL_MALE.pkl and SMPL_FEMALE.pkl.

For training the low-level policy, also copy the smpl model files into vid2player3d/data/smpl.

Demo

These demos require trained models which are currently unavailable.

Single player

In the single player setting, the player will react to consecutive incoming tennis balls from the other side. The script below runs the simulation and renders the result online. The simulation will be reset after 300 frames. You can change the player by chaning--cfg to djokovic or nadal.

python vid2player/run.py --cfg federer --rl_device cuda:0 --test --num_envs 1 --episode_length 300 --seed 0 --checkpoint latest --enable_shadow

The script below will run the simulations in batch and render the result videos offline and saved into out/video. You can also change --record to --record_scenepic, which will save the result into an interactive html file under out/html. Note that the saved html file is large and may take seconds to load.

python vid2player/run.py --cfg federer --rl_device cuda:0 --test --num_envs 8192 --episode_length 300 --seed 0 --checkpoint latest --select_best --enable_shadow --num_rec_frames 300 --num_eg 5 --record --headless

Dual player

In the dual player setting, the two players will play tennis rally against each other. The script below runs the simulation and renders the result online. The simulation will be reset if the ball is missed or out. You can change the players by changing --cfg to nadal_djokovic. More player settings will be added soon.

python vid2player/run.py --cfg federer_djokovic --rl_device cuda:0 --test --num_envs 2 --episode_length 10000 --seed 0 --checkpoint latest --enable_shadow

The script below will run the simulations in batch and render the result videos offline and saved into out/video.

python vid2player/run.py --cfg federer_djokovic --rl_device cuda:0 --test --num_envs 8192 --episode_length 10000 --seed 0 --checkpoint latest --enable_shadow --headless --num_rec_frames 600 --num_eg 5 --record

Training

Low-level policy

We provide the code for training the low-level policy in embodied_pose. As described in the paper, the low-level policy is trained in two stages using AMASS motions and tennis motions. You can run the following script to execute the two-stage training (assuming the motion data are available).

python embodied_pose/run.py --cfg amass_im --rl_device cuda:0 --headless
python embodied_pose/run.py --cfg djokovic_im --rl_device cuda:0 --headless

convert_amass_isaac.py shows how to convert the AMASS motion dataset into the format that can be used for our training code.

Motion embedding

We provide code for training the motion embedding in vid2player/motion_vae (assuming the motion data is organized in the format described in Video3DPoseDataset).

High-level policy

We also provide code for training the high-level policy in vid2player. As described in the paper, we design a curriculum trained in three stages. You can run the following script to execute the curriculum training (assuming the checkpoints for the low-leve policy and motion embedding are available).

python vid2player/run.py --cfg federer_train_stage_1 --rl_device cuda:0 --headless
python vid2player/run.py --cfg federer_train_stage_2 --rl_device cuda:0 --headless
python vid2player/run.py --cfg federer_train_stage_3 --rl_device cuda:0 --headless

Citation

@article{
  zhang2023vid2player3d,
  author = {Zhang, Haotian and Yuan, Ye and Makoviychuk, Viktor and Guo, Yunrong and Fidler, Sanja and Peng, Xue Bin and Fatahalian, Kayvon},
  title = {Learning Physically Simulated Tennis Skills from Broadcast Videos},
  journal = {ACM Trans. Graph.},
  issue_date = {August 2023},
  numpages = {14},
  doi = {10.1145/3592408},
  publisher = {ACM},
  address = {New York, NY, USA},
  keywords = {physics-based character animation, imitation learning, reinforcement learning},
}

References

This repository is built on top of the following repositories:

Here are additional references for reproducing the video annotation pipeline:

  • Player detection and tracking: Yolo4
  • 2D Pose keypoint detection: ViTPose
  • 3D Pose estimation and mesh recovery: HybrIK
  • 2D foot contact detection
  • Global root trajectory optimization: GLAMR
  • Tennis court line detection

Contact

For any question regarding this project, please contact Haotian Zhang via [email protected].

More Repositories

1

GET3D

Python
4,208
star
2

lift-splat-shoot

Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D (ECCV 2020)
Python
986
star
3

GSCNN

Gated-Shape CNN for Semantic Segmentation (ICCV 2019)
Python
916
star
4

nglod

Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes (CVPR 2021 Oral)
Python
857
star
5

LION

Latent Point Diffusion Models for 3D Shape Generation
Python
754
star
6

NKSR

[CVPR 2023 Highlight] Neural Kernel Surface Reconstruction
Python
751
star
7

ASE

Python
745
star
8

DIB-R

Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer (NeurIPS 2019)
Python
655
star
9

editGAN_release

Python
629
star
10

FlexiCubes

Python
588
star
11

STEAL

STEAL - Learning Semantic Boundaries from Noisy Annotations (CVPR 2019)
Jupyter Notebook
477
star
12

datasetGAN_release

Python
340
star
13

ATISS

Code for "ATISS: Autoregressive Transformers for Indoor Scene Synthesis", NeurIPS 2021
Python
255
star
14

XCube

[CVPR 2024 Highlight] XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies
Python
240
star
15

vqad

225
star
16

GameGAN_code

Learning to Simulate Dynamic Environments with GameGAN (CVPR 2020)
Python
222
star
17

CLD-SGM

Score-Based Generative Modeling with Critically-Damped Langevin Diffusion
Python
194
star
18

semanticGAN_code

Official repo for SemanticGAN https://nv-tlabs.github.io/semanticGAN/
Python
180
star
19

meta-sim

Meta-Sim: Learning to Generate Synthetic Datasets (ICCV 2019)
Python
171
star
20

DefTet

Learning Deformable Tetrahedral Meshes for 3D Reconstruction (NeurIPS 2020)
Cuda
169
star
21

PADL

105
star
22

STRIVE

Code for CVPR 2022 paper "Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior"
Python
104
star
23

DriveGAN_code

Code release for DriveGAN (CVPR 2021)
CSS
93
star
24

3DiffTection

92
star
25

GENIE

GENIE: Higher-Order Denoising Diffusion Solvers
Python
88
star
26

bigdatasetgan_code

project page: https://nv-tlabs.github.io/big-datasetgan/
Python
87
star
27

stmc

Implementation of "Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation" from CVPR Workshop on Human Motion Generation 2024.
Python
77
star
28

DPDM

Differentially Private Diffusion Models
Python
76
star
29

AUV-NET

Python
75
star
30

DIB-R-Single-Image-3D-Reconstruction

Python
73
star
31

trace

Official implementation of TRACE, the TRAjectory Diffusion Model for Controllable PEdestrians, from the CVPR 2023 paper: "Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion".
Python
68
star
32

pacer

Official implementation of PACER, Pedestrian Animation ControllER, of CVPR 2023 paper: "Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion".
Python
57
star
33

planning-centric-metrics

Learning to Evaluate Perception Models Using Planner-Centric Metrics
Python
52
star
34

DiffusionTexturePainting

[SIGGRAPH 2024] Diffusion Texture Painting
Python
51
star
35

editGAN

43
star
36

meta-sim-structure

Meta-Sim2: Unsupervised Learning of Scene Structure for Synthetic Data Generation (ECCV 2020)
31
star
37

GANverse3D

27
star
38

gameGAN

Project page for GameGAN
CSS
26
star
39

VideoLDM

HTML
24
star
40

brushstroke_engine

Code accompanying Neural Brushstroke Engine paper, SIGGRAPH Asia 2022
Jupyter Notebook
23
star
41

3DStyleNet

18
star
42

nv-tlabs.github.io

NVIDIA Toronto AI Lab public website
HTML
16
star
43

fDAL

Python
14
star
44

MvDeCor

Python
13
star
45

semanticGAN

https://nv-tlabs.github.io/semanticGAN/
13
star
46

compact-ngp

13
star
47

fed-sim

Federated Simulation for Medical Imaging (MICCAI2020)
11
star
48

DP-Sinkhorn_code

Python
11
star
49

DMTet

HTML
10
star
50

big-datasetgan

https://nv-tlabs.github.io/big-datasetgan/
HTML
9
star
51

datasetGAN

8
star
52

fegr

HTML
8
star
53

NTG

NTG - Neural Turtle Graphics for Modeling City Road Layouts (ICCV 2019)
8
star
54

inverse-rendering-3d-lighting

Project page for "Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting" (ICCV 2021)
7
star
55

flexicubes_website

5
star
56

tesmo

Official implementation of TeSMo, a method for text-controlled scene-aware motion generation, from the ECCV 2024 paper: "Generating Human Interaction Motions in Scenes with Text Control".
5
star
57

nkf

Project page of Neural Fields as Learnable Kernels for 3D Reconstruction.
HTML
4
star
58

XDGAN

XDGAN: Multi-Modal 3D Shape Generation in 2D Space
HTML
4
star
59

DriveGAN

CSS
3
star
60

physics-pose-estimation-project-page

HTML
3
star
61

outdoor-ar

HTML
3
star
62

hipnet

CSS
3
star
63

simulation-strategies

Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation
2
star
64

equivariant

CSS
2
star
65

estimatingrequirements

Project page for the paper "How Much More Data Do I Need? Estimating Requirements For Downstream Tasks".
HTML
2
star
66

adaptive-shells-website

HTML
2
star
67

LearnOptimizeCollect

Project page for the paper "Optimizing Data Collection In Machine Learning"
HTML
1
star
68

DP-Sinkhorn

Project page for DP-Sinkhorn (Neurips 2021)
HTML
1
star
69

PMGAN

CSS
1
star
70

hugo-backend

hugo backend for the main page
Shell
1
star
71

lip-mlp

HTML
1
star
72

unicon

HTML
1
star
73

DIBRPlus

1
star