Synthetic for Computer Vision
This repo tracks progress in using synthetic images for computer vision research. If you find that important work is missing or that information is out of date, please edit this README.md file directly and make a pull request. Each publication is tagged with a keyword to make it easier to search.
When adding a new item, simply follow the format of existing items. The structure of this document is described in contribute.md.
How to use: Click a publication to jump to its entry in the reference list, where links to the code and project page are provided together with the PDF.
Synthetic image dataset
- SunCG (Princeton)
- Minos
- House3d (Facebook)
- Procedural Human Action Videos (PHAV)
- SURREAL
- Virtual KITTI
- Synthia
- Sintel, A synthetic dataset for optical flow
- SceneFlow
- 4D Light Fields
- ICL-NUIM dataset
- Driving in the Matrix
- Playing for Benchmarks
3D Model Repository
Realistic 3D models are critical for creating realistic and diverse virtual worlds. Here are research efforts for creating 3D model repositories.
Tools
- AIPlayground: UE4 Based Data Ablation tool, see project page
- AirSim (Microsoft)
- CARLA (Intel)
- Unity ML agents
- Render SMPL human bodies on Blender, see CVPR2017
- Render for CNN, based on Blender, see ICCV2015
- UETorch, based on UE4, see ICML2016
- UnrealCV, based on UE4, see ArXiv
- VizDoom, based on Doom, see ArXiv
- OpenAI Universe, see project page
- Blender addon for 4D light field rendering, see project page
- Event-Camera Dataset and Simulator, see project page
- NVIDIA Deep learning Dataset Synthesizer (NDDS)
Resources
ECCV 2016 Workshop Virtual/Augmented Reality for Visual Artificial Intelligence (VARVAI)
ICCV 2017 Workshop Role of Simulation in Computer Vision
CVPR 2017 Workshop THOR Challenge
See also: http://riemenschneider.hayko.at/vision/dataset/index.php?filter=+synthetic
Misc.
- RealismCNN github
- Abnormality Detection in Images (http://paul.rutgers.edu/~babaks/abnormality_detection.html)
Reference
2020
- Mousavi, Mehdi and Khanal, Aashis and Estrada, Rolando. "AI Playground: Unreal Engine-based Data Ablation Tool for Deep Learning" International Symposium on Visual Computing (ISVC), 2020. (pdf) (project)
2017
(Total=12)
- Adversarially Tuned Scene Generation (pdf)
- UE4Sim: A Photo-Realistic Simulator for Computer Vision Applications (pdf) (project)
- Playing for Benchmarks (pdf)
- A Self-supervised Learning System for Object Detection using Physics Simulation and Multi-view Pose Estimation (code) (pdf) (project)
- Procedural Generation of Videos to Train Deep Action Recognition Networks (pdf) (project) (citation:8)
- Learning from Synthetic Humans (code) (pdf) (project) tag: synthetic human
- Configurable, Photorealistic Image Rendering and Ground Truth Synthesis by Sampling Stochastic Grammars Representing Indoor Scenes
- Tobin, Josh, et al. "Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World." arXiv preprint arXiv:1703.06907 (2017). tag: domain (pdf)
- M. Johnson-Roberson, C. Barto, R. Mehta, S. N. Sridhar, Karl Rosaen, and R. Vasudevan, "Driving in the matrix: Can virtual worlds replace human-generated annotations for real world tasks?," in IEEE International Conference on Robotics and Automation, pp. 1–8, 2017. (code) (pdf) (project) (citation:3)
- Zheng Z, Zheng L, Yang Y. "Unlabeled samples generated by GAN improve the person re-identification baseline in vitro" in Proceedings of IEEE International Conference on Computer Vision, 2017. (code) (pdf) (citation:48) tag: generated images by GAN
2016
(Total=17)
- Sadeghi, Fereshteh, and Sergey Levine. "CAD2RL: Real single-image flight without a single real image." arXiv preprint arXiv:1611.04201 (2016). tag: rl
- Johnson, Justin, et al. "CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning." arXiv preprint arXiv:1612.06890 (2016). (pdf)
- McCormac, John, et al. "SceneNet RGB-D: 5M Photorealistic Images of Synthetic Indoor Trajectories with Ground Truth." arXiv preprint arXiv:1612.05079 (2016).
- de Souza, César Roberto, et al. "Procedural Generation of Videos to Train Deep Action Recognition Networks." arXiv preprint arXiv:1612.00881 (2016). (pdf) (project) tag: synthetic human
- Synnaeve, Gabriel, et al. "TorchCraft: a Library for Machine Learning Research on Real-Time Strategy Games." arXiv preprint arXiv:1611.00625 (2016). (pdf) (code)
- Lin, Jenny, et al. "A virtual reality platform for dynamic human-scene interaction." SIGGRAPH ASIA 2016 Virtual Reality meets Physical Reality: Modelling and Simulating Virtual Humans and Environments. ACM, 2016. (pdf) (project)
- Mahendran, A., et al. "ResearchDoom and CocoDoom: Learning Computer Vision with Games." arXiv preprint arXiv:1610.02431 (2016). (pdf) (project)
- The SYNTHIA dataset: A large collection of synthetic images for semantic segmentation of urban scenes. 2016 (pdf) (project) (citation:4)
- Virtual Worlds as Proxy for Multi-Object Tracking Analysis. 2016
(pdf) (project) (citation:5)
- Playing for data: Ground truth from computer games. 2016
(pdf) (citation:1)
- Play and Learn: Using Video Games to Train Computer Vision Models. 2016
(pdf) (citation:1)
- ViZDoom: A Doom-based AI Research Platform for Visual Reinforcement Learning. 2016
(code) (pdf) (project) (citation:4)
- A large dataset of object scans. 2016
(pdf) (project) (citation:6)
- Learning Physical Intuition of Block Towers by Example. 2016
(code) (pdf) (citation:12)
- Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning. 2016
(pdf)
- A Dataset and Evaluation Methodology for Depth Estimation on 4D Light Fields. ACCV 2016
(code) (pdf) (project) (citation)
2015
(Total=3)
- A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation. 2015
(pdf) (citation:9)
- Render for CNN: Viewpoint estimation in images using CNNs trained with rendered 3D model views. 2015
(code) (pdf) (citation:33)
- ShapeNet: An information-rich 3D model repository. 2015
(pdf) (project) (citation:27)
2014
(Total=3)
- Virtual and real world adaptation for pedestrian detection. 2014
(pdf) (citation:46)
- Seeing 3D chairs: exemplar part-based 2D-3D alignment using a large dataset of CAD models. 2014
(code) (pdf) (project) (citation:110)
- Handa, Ankur, Thomas Whelan, John McDonald, and Andrew J. Davison. "A benchmark for RGB-D visual odometry, 3D reconstruction and SLAM." In Robotics and automation (ICRA), 2014 IEEE international conference on, pp. 1524-1531. IEEE, 2014. (project)
2013
(Total=1)
- Detailed 3d representations for object recognition and modeling. 2013
(pdf) (citation:67)
2012
(Total=1)
- A naturalistic open source movie for optical flow evaluation. 2012
(pdf) (project) (citation:227)
2010
(Total=1)
- Learning appearance in virtual scenarios for pedestrian detection. 2010
(pdf) (citation:79)
2007
(Total=1)
- Ovvv: Using virtual worlds to design and evaluate surveillance systems. 2007
(pdf) (citation:58)