
The data skeleton from "3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera" http://3dscenegraph.stanford.edu

3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera

Overview

The 3D Scene Graph provides semantic data for models in the Gibson environment [1], following the structure proposed in "3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera". The semantic information for models in the tiny Gibson split is verified via crowdsourcing and contains all 3D Scene Graph attributes. For these models we provide both the automated and verified outputs. For the remaining models, the semantic information is the output of automated modules and does not include modalities that depend solely on manual input (e.g., object materials and textures). You can learn more about 3D Scene Graph and interact with the semantic data here: http://3dscenegraph.stanford.edu

Download

You can download the 3D Scene Graph data from the link below. The link first takes you to a license agreement, and then to the data. The data per model contains only semantics and is provided in the compressed .npz format. To download the raw data, visit the Gibson Environment's database and agree to their terms of use. A loading function that returns the data in the 3D Scene Graph structure is included in the 'tools/' folder. Semantics per model correspond to the mesh.obj 3D meshes and the pano/rgb panoramas of the Gibson database. To learn more about the type of semantics included in 3D Scene Graph, see Dataset Structure.
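As a rough illustration of what reading one of the compressed .npz files can look like, here is a minimal sketch with NumPy. The loader shipped in 'tools/' remains the reference. The key name `output`, the nested-dictionary layout, and the helper `load_scene_graph_npz` are assumptions made for this example, and the file is a synthetic stand-in written on the fly so the snippet runs without the dataset.

```python
import os
import tempfile

import numpy as np


def load_scene_graph_npz(path, key="output"):
    """Return the pickled dictionary stored under `key` in a .npz archive.

    `key` is an assumed name; check `np.load(path).files` on a real file.
    """
    # allow_pickle is required because the archive holds a Python dict,
    # which NumPy wraps in a 0-d object array.
    with np.load(path, allow_pickle=True) as archive:
        return archive[key].item()


# Synthetic stand-in for a downloaded model file (not real dataset content).
fake = {"building": {"name": "demo_model", "num_rooms": 2}}
tmp_path = os.path.join(tempfile.mkdtemp(), "demo_model.npz")
np.savez_compressed(tmp_path, output=fake)

data = load_scene_graph_npz(tmp_path)
print(data["building"]["name"])
```

Because `allow_pickle=True` executes pickled content, it should only be used on files obtained from a trusted source.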

[ Download 3D Scene Graph ]

Data Note: Our current release includes the tiny and medium Gibson splits. The rest of the models will follow shortly.

License Note: The dataset license is included in the above link. The license in this repository covers only the provided software. Note that it allows only non-commercial research use.

Citations

If you use this dataset please cite:

@InProceedings{armeni_iccv19,
	title ={3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera},
	author = {Iro Armeni and Zhi-Yang He and JunYoung Gwak and Amir R. Zamir and Martin Fischer and Jitendra Malik and Silvio Savarese},
	booktitle = {Proceedings of the IEEE International Conference on Computer Vision},
	year = {2019}
}

and if you use the raw data from Gibson Database please cite:

@inproceedings{xiazamirhe2018gibsonenv,
  title={Gibson env: real-world perception for embodied agents},
  author={Xia, Fei and R. Zamir, Amir and He, Zhi-Yang and Sax, Alexander and Malik, Jitendra and Savarese, Silvio},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2018 IEEE Conference on},
  year={2018},
  organization={IEEE}
}

Dataset Structure

The 3D Scene Graph is composed of 4 layers: building, room, object, and camera. Below is a description of the semantic information per layer.

Building

floor_area       : 2D floor area (in sq.meters)
function         : function of building
gibson_split     : Gibson split (tiny, medium, large)
id               : unique building id
name             : name of gibson model
num_cameras      : number of panoramic cameras in the model
num_floors       : number of floors in the building
num_objects      : number of objects in the building
num_rooms        : number of rooms in the building
reference_point  : building reference point
size             : 3D Size of building (XYZ, in meters)
volume           : 3D volume of building computed from 3D convex hull (in cubic meters)
voxel_size       : size of voxel (in meters)
voxel_centers    : 3D coordinates of voxel centers (Nx3)
voxel_resolution : Number of voxels per axis (k x l x m)
        
room             : 3D Scene Graph layer for rooms
object           : 3D Scene Graph layer for objects
camera           : 3D Scene Graph layer for cameras

Room

floor_area         : 2D floor area (in sq.meters)
floor_number       : index of floor that contains the space
id                 : unique space id per building
location           : 3D coordinates of room center's location
inst_segmentation  : building face indices that correspond to this room (face indices correspond to the raw *mesh.obj* provided in Gibson database)
scene_category     : function of this room
size               : 3D Size of room (XYZ, in meters)
voxel_occupancy    : building's voxel indices that correspond to this room (the voxel grid is defined by the building attributes *voxel_size*, *voxel_centers*, and *voxel_resolution*)
volume             : 3D volume of room computed from 3D convex hull (in cubic meters)
parent_building    : parent building that contains this room
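The relation between voxel_occupancy and the building's voxel grid can be sketched as follows. This assumes that voxel_occupancy holds integer row indices into the (N, 3) voxel_centers array and that voxel_size is either a scalar or a per-axis triple; neither detail is confirmed here, and the helper names and the tiny grid are invented for illustration.

```python
import numpy as np


def occupied_voxel_centers(voxel_centers, voxel_occupancy):
    """Select the 3D centers of the voxels listed in `voxel_occupancy`.

    Assumes `voxel_occupancy` contains integer row indices into the
    (N, 3) `voxel_centers` array.
    """
    centers = np.asarray(voxel_centers, dtype=float)
    return centers[np.asarray(voxel_occupancy, dtype=int)]


def occupied_volume(voxel_occupancy, voxel_size):
    """Approximate volume as (number of voxels) x (volume of one voxel)."""
    n_voxels = np.asarray(voxel_occupancy).size
    # A scalar voxel_size is broadcast to all three axes.
    return n_voxels * float(np.prod(np.broadcast_to(voxel_size, (3,))))


# Invented 2 x 2 x 1 grid with 0.5 m voxels.
voxel_centers = [
    [0.25, 0.25, 0.25],
    [0.75, 0.25, 0.25],
    [0.25, 0.75, 0.25],
    [0.75, 0.75, 0.25],
]
room_voxels = [0, 3]

points = occupied_voxel_centers(voxel_centers, room_voxels)
print(points)
print(occupied_volume(room_voxels, 0.5))
```

Note that a voxel count gives a different estimate from the volume attribute, which is computed from the 3D convex hull.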

Object

action_affordance  : list of possible actions
floor_area         : 2D floor area (in sq.meters)
surface_coverage   : total surface coverage (in sq.meters)
class_*            : object label
id                 : unique object id per building
location           : 3D coordinates of object center's location
material**         : list of main object materials 
size               : 3D Size of object (XYZ, in meters)
inst_segmentation  : building face indices that correspond to this object (face indices correspond to the raw *mesh.obj* provided in Gibson database)
tactile_texture*** : main tactile texture (can be None)
visual_texture***  : main visible texture (can be None)
volume             : 3D volume of object computed from 3D convex hull (in cubic meters)
voxel_occupancy    : building's voxel indices that correspond to this object (the voxel grid is defined by the building attributes *voxel_size*, *voxel_centers*, and *voxel_resolution*)
parent_room        : parent room that contains this object
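The parent_room and parent_building attributes link the layers into a hierarchy. The sketch below walks those links to group objects by room. It uses plain dictionaries keyed by id, with the object label stored under `class_`; this layout and all values are hypothetical and chosen only for illustration, so the structure returned by the provided loader may differ.

```python
from collections import defaultdict

# Hypothetical miniature graph; every value here is invented.
building = {
    "name": "demo_model",
    "room": {
        1: {"id": 1, "scene_category": "kitchen", "parent_building": 0},
        2: {"id": 2, "scene_category": "bedroom", "parent_building": 0},
    },
    "object": {
        10: {"id": 10, "class_": "oven", "parent_room": 1},
        11: {"id": 11, "class_": "sink", "parent_room": 1},
        12: {"id": 12, "class_": "bed", "parent_room": 2},
    },
}


def objects_by_room(building):
    """Map each room id to the sorted labels of the objects it contains."""
    grouped = defaultdict(list)
    for obj in building["object"].values():
        grouped[obj["parent_room"]].append(obj["class_"])
    return {room_id: sorted(labels) for room_id, labels in grouped.items()}


for room_id, labels in objects_by_room(building).items():
    category = building["room"][room_id]["scene_category"]
    print(room_id, category, labels)
```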

Camera

name        : name of camera
id          : unique camera id
FOV         : camera field of view
location    : 3D location of camera in the model
rotation    : rotation of camera (quaternion)
modality    : camera modality (e.g., RGB, grayscale, depth, etc.)
resolution  : camera resolution
parent_room : parent room that contains this camera
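Since the camera rotation is stored as a quaternion, it often has to be converted to a rotation matrix before use. Below is a minimal sketch of the standard unit-quaternion conversion. The component order of the stored quaternion is not stated here, so the `order` argument and its default are assumptions to verify against the data, and `quaternion_to_matrix` is a name made up for this example.

```python
import numpy as np


def quaternion_to_matrix(q, order="wxyz"):
    """Convert a quaternion to a 3x3 rotation matrix.

    `order` names the component layout of `q`; the layout used by the
    dataset is an assumption to check, not a documented fact.
    """
    q = np.asarray(q, dtype=float)
    q = q / np.linalg.norm(q)  # guard against small normalisation drift
    if order == "wxyz":
        w, x, y, z = q
    elif order == "xyzw":
        x, y, z, w = q
    else:
        raise ValueError("order must be 'wxyz' or 'xyzw'")
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - z * w), 2 * (x * z + y * w)],
        [2 * (x * y + z * w), 1 - 2 * (x * x + z * z), 2 * (y * z - x * w)],
        [2 * (x * z - y * w), 2 * (y * z + x * w), 1 - 2 * (x * x + y * y)],
    ])


# A 90 degree rotation about the z axis sends the x axis to the y axis.
half = np.sqrt(0.5)
R = quaternion_to_matrix([half, 0.0, 0.0, half])
print(np.round(R @ np.array([1.0, 0.0, 0.0]), 6))
```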

Tools & Dependencies

We provide a loading function in tools/load.py, which requires Python 3.5 and the packages: trimesh, PIL. You can run this function with the tools/load.sh script - remember to change the system paths to match your configuration where applicable. In the tools folder there is the palette.txt file that contains a list of distinct RGB colors used for visualization purposes, and the dictionaries.csv file that contains a list of the category subsets of each database we use that are present in the dataset (e.g., the object classes from COCO present in the tiny Gibson models, etc.).

Automatic Labeling & 3D Scene Graph Generation

The automatic labeling and 3D Scene Graph generation pipeline is included in the source folder. The code has been tested with Python 3.6.8. All required dependencies can be found in requirements.txt. Install them with the command below, where $3DSceneGraph stands for the path to your local copy of the repository:

pip install -r $3DSceneGraph/requirements.txt

Inside source there are three folders, which correspond to the three main steps of the method:

1. Framing

First, sample rectilinear frames on the equirectangular images (pano2rectilinear). After inferring the instance segmentations for each of these frames with the method of your choice, use pano_aggregation to aggregate the predictions on the equirectangular image. Each folder contains a shell script that you can run to process each step. The file detections_format.txt describes the format of the instance segmentation output file.
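To make the framing step concrete, the following is a minimal sketch of sampling one rectilinear (pinhole) frame from an equirectangular panorama with NumPy. It is not the pano2rectilinear code from this repository. It assumes that the panorama centre corresponds to longitude 0 and latitude 0, that the top row is latitude +90 degrees, and it uses nearest-neighbour lookup; the function names are invented for this example.

```python
import numpy as np


def rectilinear_sampling_grid(pano_hw, frame_hw, fov_deg, yaw_deg=0.0, pitch_deg=0.0):
    """Return float (u, v) panorama coordinates for every frame pixel."""
    pano_h, pano_w = pano_hw
    frame_h, frame_w = frame_hw
    # Focal length in pixels from the horizontal field of view.
    focal = 0.5 * frame_w / np.tan(np.radians(fov_deg) / 2.0)

    # Camera rays: x right, y up, z forward.
    cols = np.arange(frame_w) - (frame_w - 1) / 2.0
    rows = (frame_h - 1) / 2.0 - np.arange(frame_h)
    x, y = np.meshgrid(cols, rows)
    z = np.full_like(x, focal)

    # Tilt up by pitch (about x), then turn right by yaw (about the up axis).
    p, a = np.radians(pitch_deg), np.radians(yaw_deg)
    y, z = y * np.cos(p) + z * np.sin(p), -y * np.sin(p) + z * np.cos(p)
    x, z = x * np.cos(a) + z * np.sin(a), -x * np.sin(a) + z * np.cos(a)

    # Ray direction -> spherical angles -> panorama pixel coordinates.
    lon = np.arctan2(x, z)
    lat = np.arcsin(y / np.sqrt(x * x + y * y + z * z))
    u = (lon / (2.0 * np.pi) + 0.5) * pano_w
    v = (0.5 - lat / np.pi) * pano_h
    return u, v


def sample_nearest(pano, u, v):
    """Nearest-neighbour lookup; columns wrap around, rows are clamped."""
    rows = np.clip(np.floor(v).astype(int), 0, pano.shape[0] - 1)
    cols = np.floor(u).astype(int) % pano.shape[1]
    return pano[rows, cols]


# Toy 4 x 8 "panorama" whose pixel values encode their own position.
pano = np.arange(32).reshape(4, 8)
u, v = rectilinear_sampling_grid(pano.shape, (3, 3), fov_deg=90.0)
frame = sample_nearest(pano, u, v)
print(frame.shape)
```

Sampling several frames with different yaw and pitch values covers the sphere; a real pipeline would typically use bilinear interpolation instead of nearest-neighbour lookup.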

2. Multiview Consistency

This step aggregates all panorama instance segmentations on the 3D mesh (multiview_consistency). Run the included shell script to start the process.
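One simple way to picture this aggregation is a per-face vote across panoramas. The sketch below swaps in a plain majority vote for illustration; it is not the multiview_consistency implementation, and the input format (pairs of face index and label) and the function name are assumptions.

```python
from collections import Counter, defaultdict


def majority_vote_per_face(observations):
    """Fuse per-panorama labels into one label per mesh face.

    `observations` is an iterable of (face_index, label) pairs, one pair
    for each panorama that sees the face. Ties go to the label seen first.
    """
    votes = defaultdict(Counter)
    for face_index, label in observations:
        votes[face_index][label] += 1
    return {face: counter.most_common(1)[0][0] for face, counter in votes.items()}


# Invented observations: face 0 is seen by three panoramas, face 1 by one.
observations = [(0, "chair"), (0, "chair"), (0, "couch"), (1, "table")]
print(majority_vote_per_face(observations))
```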

3. 3D Scene Graph Generation

Once the previous steps are complete, this step computes attributes and relationships, essentially building the 3D Scene Graph. Certain attributes are not computed analytically and are provided as input to this step in the form of .csv files. You can omit these inputs if you have no other way of computing them. These are: object material, object texture, room scene_category, room inst_segmentation, room floor_number, building gibson_split, building function, and building num_floors. Included are examples of the specific file formats for the tiny Gibson split (model_data.csv, object_data.csv).

References

[1] Xia, Fei, et al. "Gibson env: Real-world perception for embodied agents." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
[2] Lin, Tsung-Yi, et al. "Microsoft coco: Common objects in context." European conference on computer vision. Springer, Cham, 2014.
[3] Bell, Sean, et al. "Material recognition in the wild with the materials in context database." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
[4] Cimpoi, Mircea, et al. "Describing textures in the wild." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014.
