3D-Shape-Analysis-Paper-List
A collection of papers, libraries and datasets I have recently read, for anyone interested in:
- 3D Detection & Segmentation
- Shape Representation
- Shape & Scene Completion
- Shape Reconstruction & Generation
- 3D Scene Understanding
- 3D Scene Reconstruction & Generation
- NeRF
- About Human Body
- General Methods
- Others (inc. Networks in Classification, Matching, Registration, Alignment, Depth, Normal, Pose, Keypoints, etc.)
- Survey, Resources and Tools
3D Detection & Segmentation
- [Arxiv] Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding [Project]
- [Arxiv] RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding [Project]
- [CVPR2023] EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision [Project]
- [CVPR2023] PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection [Project]
- [Arxiv] Mask3D for 3D Semantic Instance Segmentation [github]
- [ECCV2022] ObjectBox: From Centers to Boxes for Anchor-Free Object Detection [github]
- [Arxiv] Masked Autoencoders for Self-Supervised Learning on Automotive Point Clouds
- [CVPR2022] HyperDet3D: Learning a Scene-conditioned 3D Object Detector
- [Arxiv] AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection
Before 2022
- [AAAI2022] AFDetV2: Rethinking the Necessity of the Second Stage for Object Detection from Point Clouds
- [AAAI2022] Static-Dynamic Co-Teaching for Class-Incremental 3D Object Detection
- [NeurIPS2021] Revisiting 3D Object Detection From an Egocentric Perspective
- [Arxiv] Embracing Single Stride 3D Object Detector with Sparse Transformer [github]
- [AAAI2022] Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection
- [Arxiv] 3D-VField: Learning to Adversarially Deform Point Clouds for Robust 3D Object Detection
- [Arxiv] Fast Point Transformer
- [3DV2021] Open-set 3D Object Detection
- [Arxiv] FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection [Project]
- [TPAMI2021] Point Cloud Instance Segmentation with Semi-supervised Bounding-Box Mining
- [Arxiv] Online Adaptation for Implicit Object Tracking and Shape Reconstruction in the Wild
- [Arxiv] RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation [github]
- [Arxiv] SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking [github]
- [NeurIPS2021] Multimodal Virtual Point 3D Detection [Project]
- [BMVC2021] 3D Object Tracking with Transformer [github]
- [3DV2021] Learning 3D Semantic Segmentation with only 2D Image Supervision
- [3DV2021] NeuralDiff: Segmenting 3D objects that move in egocentric videos [Project]
- [BMVC2021] FAST3D: Flow-Aware Self-Training for 3D Object Detectors
- [ICCV2021] Guided Point Contrastive Learning for Semi-supervised Point Cloud Semantic Segmentation
- [CORL2021] DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries [github]
- [NeurIPS2021] Object DGCNN: 3D Object Detection using Dynamic Graphs [github]
- [Arxiv] Improved Pillar with Fine-grained Feature for 3D Object Detection
- [Arxiv] 3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation
- [ICCVW2021] MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation
- [Arxiv] GSIP: Green Semantic Segmentation of Large-Scale Indoor Point Clouds
- [Arxiv] Pix2seq: A Language Modeling Framework for Object Detection
- [Arxiv] MVM3Det: A Novel Method for Multi-view Monocular 3D Detection
- [ICCV2021] NEAT: Neural Attention Fields for End-to-End Autonomous Driving [github]
- [ICCV2021] Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection
- [ICCV2021] 4D-Net for Learned Multi-Modal Alignment
- [ICCV2021] Active Learning for Deep Object Detection via Probabilistic Modeling [github]
- [ICCV2021] An End-to-End Transformer Model for 3D Object Detection [Project]
- [ICCV2021] Improving 3D Object Detection with Channel-wise Transformer
- [ICCV2021] Voxel Transformer for 3D Object Detection
- [CVPR2021] To the Point: Efficient 3D Object Detection in the Range Image With Graph Convolution Kernels
- [Arxiv] M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers
- [ICCV2021] Exploring Simple 3D Multi-Object Tracking for Autonomous Driving
- [ICCV2021] LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector
- [ICCV2021] Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks [github]
- [ICCV2021] RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection
- [ICCV2021] Is Pseudo-Lidar needed for Monocular 3D Object detection?
- [IROS2021] PTT: Point-Track-Transformer Module for 3D Single Object Tracking in Point Clouds [github]
- [ICCV2021] Oriented R-CNN for Object Detection [github]
- [ICCV2021] Box-Aware Feature Enhancement for Single Object Tracking on Point Clouds [github]
- [IROS2021] Joint Multi-Object Detection and Tracking with Camera-LiDAR Fusion for Autonomous Driving
- [ACMMM2021] From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder [github]
- [ICCV2021] DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation
- [ICCV2021] Hierarchical Aggregation for 3D Instance Segmentation [github]
- [Arxiv] Investigating Attention Mechanism in 3D Point Cloud Object Detection [pytorch]
- [ICCV2021] VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation [pytorch]
- [ICCV2021] Geometry Uncertainty Projection Network for Monocular 3D Object Detection
- [Arxiv] Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth
- [Arxiv] DV-Det: Efficient 3D Point Cloud Object Detection with Dynamic Voxelization
- [ICCV2021] ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation
- [ICCV2021] Rank & Sort Loss for Object Detection and Instance Segmentation [pytorch]
- [Arxiv] Multi-Modality Task Cascade for 3D Object Detection [github]
- [ACMMM2021] Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting
- [Arxiv] Monocular 3D Object Detection: An Extrinsic Parameter Free Approach
- [Arxiv] Real-time 3D Object Detection using Feature Map Flow [pytorch]
- [Arxiv] To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels
- [CVPR2021] RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection
- [Arxiv] Sparse PointPillars: Exploiting Sparsity in Birds-Eye-View Object Detection
- [Arxiv] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection [Project]
- [CVPR2021] 3D Spatial Recognition without Spatially Labeled 3D [Project]
- [Arxiv] Lite-FPN for Keypoint-based Monocular 3D Object Detection
- [TPAMI] MonoGRNet: A General Framework for Monocular 3D Object Detection
- [Arxiv] Lidar Point Cloud Guided Monocular 3D Object Detection
- [Arxiv] Geometry-aware data augmentation for monocular 3D object detection
- [Arxiv] OCM3D: Object-Centric Monocular 3D Object Detection
- [CVPR2021] Objects are Different: Flexible Monocular 3D Object Detection [github]
- [CVPR2021] HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection
- [Arxiv] Group-Free 3D Object Detection via Transformers [pytorch]
- [CVPR2021] GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection [pytorch]
- [CVPR2021] Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds [pytorch]
- [CVPR2021] Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [github]
- [CVPR2021] Delving into Localization Errors for Monocular 3D Object Detection [github]
- [CVPR2021] 3D-MAN: 3D Multi-frame Attention Network for Object Detection
- [CVPR2021] LiDAR R-CNN: An Efficient and Universal 3D Object Detector [github]
- [CVPR2021] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [pytorch]
- [CVPR2021] M3DSSD: Monocular 3D Single Stage Object Detector
- [CVPR2021] MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation
- [Arxiv] SparsePoint: Fully End-to-End Sparse 3D Object Detector
- [Arxiv] RangeDet: In Defense of Range View for LiDAR-based 3D Object Detection
- [ICRA2021] YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection [github]
- [CVPR2021] ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [github]
- [Arxiv] Offboard 3D Object Detection from Point Cloud Sequences
- [CVPR2021] DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution [github]
- [Arxiv] Pseudo-labeling for Scalable 3D Object Detection
- [Arxiv] DPointNet: A Density-Oriented PointNet for 3D Object Detection in Point Clouds
- [Arxiv] PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection [pytorch]
- [Arxiv] Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss
- [Arxiv] CubifAE-3D: Monocular Camera Space Cubification for Auto-Encoder based 3D Object Detection
- [Arxiv] Self-Attention Based Context-Aware 3D Object Detection [pytorch]
- [Arxiv] Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection
Before 2021
- [Arxiv] It’s All Around You: Range-Guided Cylindrical Network for 3D Object Detection
- [Arxiv] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [Project]
- [Arxiv] Demystifying Pseudo-LiDAR for Monocular 3D Object Detection
- [3DV2020] PanoNet3D: Combining Semantic and Geometric Understanding for LiDAR Point Cloud Detection
- [AAAI2021] PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection
- [Arxiv] SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation
- [Arxiv] 3D Object Detection with Pointformer
- [WACV2021] CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection [pytorch]
- [Arxiv] Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation [pytorch]
- [Arxiv] Learning to Predict the 3D Layout of a Scene
- [Arxiv] Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes [Project]
- [Arxiv] DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution
- [Arxiv] Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving
- [NeurIPS2020] Every View Counts: Cross-View Consistency in 3D Object Detection with Hybrid-Cylindrical-Spherical Voxelization
- [NeurIPS2020] Group Contextual Encoding for 3D Point Clouds [pytorch]
- [Arxiv] 3D Object Recognition By Corresponding and Quantizing Neural 3D Scene Representations [Project]
- [Arxiv] A Density-Aware PointRCNN for 3D Object Detection in Point Clouds
- [Arxiv] Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training
- [ECCV2020] Reinforced Axial Refinement Network for Monocular 3D Object Detection
- [Arxiv] RUHSNet: 3D Object Detection Using Lidar Data in Real Time [pytorch]
- [IROS2020] 3D Multi-Object Tracking: A Baseline and New Evaluation Metrics [Project][Code]
- [ECCV2020] Virtual Multi-view Fusion for 3D Semantic Segmentation
- [ACMMM2020] Weakly Supervised 3D Object Detection from Point Clouds
- [ECCV2020] Weakly Supervised 3D Object Detection from Lidar Point Cloud [pytorch]
- [ECCV2020] Kinematic 3D Object Detection in Monocular Video
- [IROS2020] Object-Aware Centroid Voting for Monocular 3D Object Detection
- [ECCV2020] Pillar-based Object Detection for Autonomous Driving
- [Arxiv] Local Grid Rendering Networks for 3D Object Detection in Point Clouds
- [Arxiv] Learning to Detect 3D Objects from Point Clouds in Real Time
- [Arxiv] SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds
- [CVPR2020] PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
- [CVPR2020] FroDO: From Detections to 3D Objects
- [CVPR2020] Physically Realizable Adversarial Examples for LiDAR Object Detection
- [CVPR2020] Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection
- [CVPR2020] End-to-end 3D Point Cloud Instance Segmentation without Detection
- [CVPR2020] MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
- [CVPR2020] Structure Aware Single-stage 3D Object Detection from Point Cloud
- [CVPR2020] Learning Depth-Guided Convolutions for Monocular 3D Object Detection [pytorch] 🔥
- [CVPR2020] What You See is What You Get: Exploiting Visibility for 3D Object Detection
- [CVPR2020] Density Based Clustering for 3D Object Detection in Point Clouds
- [CVPR2020] PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
- [CVPR2020] MLCVNet: Multi-Level Context VoteNet for 3D Object Detection
- [CVPR2020] PointPainting: Sequential Fusion for 3D Object Detection
- [CVPR2020] Joint 3D Instance Segmentation and Object Detection for Autonomous Driving
- [CVPR2020] Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud [tensorflow]
- [CVPR2020] HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection
- [CVPR2020] A Hierarchical Graph Network for 3D Object Detection on Point Clouds
- [Arxiv] H3DNet: 3D Object Detection Using Hybrid Geometric Primitives [pytorch]
- [CVPR2020] P2B: Point-to-Box Network for 3D Object Tracking in Point Clouds
- [Arxiv] 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection
- [CVPR2020] Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking
- [CVPR2020] Learning to Evaluate Perception Models Using Planner-Centric Metrics
- [CVPR2020] Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation [pytorch]
- [Arxiv] SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds [github]
- [CVPR2020] End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [github]
- [Arxiv] Finding Your (3D) Center: 3D Object Detection Using a Learned Loss
- [CVPR2020] 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation
- [CVPR2020] Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
- [CVPR2020] OccuSeg: Occupancy-aware 3D Instance Segmentation
- [CVPR2020] Learning to Segment 3D Point Clouds in 2D Image Space
- [AAAI2020] ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection
- [Arxiv] MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
- [Arxiv] HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection
- [Arxiv] SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation
- [Arxiv] 3DSSD: Point-based 3D Single Stage Object Detector
- [Arxiv] Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation
- [CVPR2020] ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
- [Arxiv] A Review on Object Pose Recovery: from 3D Bounding Box Detectors to Full 6D Pose Estimators
- [Arxiv] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
- [Arxiv] Objects as Points [github] ⭐ 🔥
- [Arxiv] RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving [github]
- [CVPR2020] DSGN: Deep Stereo Geometry Network for 3D Object Detection [github]
- [Arxiv] Learning and Memorizing Representative Prototypes for 3D Point Cloud Semantic and Instance Segmentation
- [Arxiv] PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
- [Arxiv] Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
- [CVPR2020] SESS: Self-Ensembling Semi-Supervised 3D Object Detection
- [NeurIPS2019] PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points
- [NeurIPS2019] Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds
- [ICCV2019] Deep Hough Voting for 3D Object Detection in Point Clouds
- [AAAI2020] JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds
- [ICCV2019] M3D-RPN: Monocular 3D Region Proposal Network for Object Detection [pytorch]
- [ICCV2019] 3D Instance Segmentation via Multi-Task Metric Learning
- [Arxiv] Single-Stage Monocular 3D Object Detection with Virtual Cameras
- [Arxiv] Depth Completion via Deep Basis Fitting
- [Arxiv] Relation Graph Network for 3D Object Detection in Point Clouds
- [CVPR2019] 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans [pytorch] 🔥
- [ICCV2019] Rescan: Inductive Instance Segmentation for Indoor RGBD Scans [C++]
- [ICCV2019] Transferable Semi-Supervised 3D Object Detection From RGB-D Data
- [ICCV2019] STD: Sparse-to-Dense 3D Object Detector for Point Cloud
- [CVPR2019] PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud [pytorch]
- [Arxiv] Fast Point R-CNN
- [Arxiv] Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection [pytorch] 🔥
- [ECCV2018] 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation [pytorch] 🔥
Shape Representation
- [CVPR2023] Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning [github]
- [Arxiv] Neural Vector Fields: Implicit Representation by Explicit Learning
- [ECCV2022] NeuMesh: Learning Disentangled Neural Mesh-based Implicit Field for Geometry and Texture Editing [Project]
- [Arxiv] Masked Autoencoders in 3D Point Cloud Representation Learning
- [Arxiv] NeuralODF: Learning Omnidirectional Distance Fields for 3D Shape Representation
- [Siggraph2022] Learning Smooth Neural Functions via Lipschitz Regularization [Project]
- [Siggraph2022] Dual Octree Graph Networks for Learning Adaptive Volumetric Shape Representations [Project]
- [Arxiv] A Level Set Theory for Neural Implicit Evolution under Explicit Flows
- [CVPR2022] GIFS: Neural Implicit Function for General Shape Representation [Project]
- [Arxiv] PINs: Progressive Implicit Networks for Multi-Scale Neural Representations
- [Arxiv] Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning
- [Arxiv] Spelunking the Deep: Guaranteed Queries for General Neural Implicit Surfaces
- [Arxiv] MINER: Multiscale Implicit Neural Representations
- [Arxiv] De-rendering 3D Objects in the Wild
- [Arxiv] Implicit Autoencoder for Point Cloud Self-supervised Representation Learning
Before 2022
- [Arxiv] End-to-End Learning of Multi-category 3D Pose and Shape Estimation
- [Arxiv] Point2Cyl: Reverse Engineering 3D Objects from Point Clouds to Extrusion Cylinders
- [Arxiv] Representing 3D Shapes with Probabilistic Directed Distance Fields
- [Arxiv] Text2Mesh: Text-Driven Neural Stylization for Meshes [Project]
- [Arxiv] PointCLIP: Point Cloud Understanding by CLIP [github]
- [Arxiv] Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding
- [Arxiv] Gradient-SDF: A Semi-Implicit Surface Representation for 3D Reconstruction
- [Arxiv] Intuitive Shape Editing in Latent Space
- [NeurIPS2021] Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views [github]
- [Arxiv] Neural Fields as Learnable Kernels for 3D Reconstruction
- [NeurIPS2021] OctField: Hierarchical Implicit Functions for 3D Modeling [github]
- [3DV2021] RefRec: Pseudo-labels Refinement via Shape Reconstruction for Unsupervised 3D Domain Adaptation [github]
- [3DV2021] PolyNet: Polynomial Neural Network for 3D Shape Recognition with PolyShape Representation [Project]
- [Arxiv] BACON: Band-limited Coordinate Networks for Multiscale Scene Representation [Project]
- [Arxiv] UNIST: Unpaired Neural Implicit Shape Translation Network [Project]
- [Arxiv] Representing Shape Collections with Alignment-Aware Linear Models [Project]
- [ICCV2021] Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds
- [Arxiv] DeepCurrents: Learning Implicit Representations of Shapes with Boundaries
- [3DV] AIR-Nets: An Attention-Based Framework for Locally Conditioned Implicit Representations [github]
- [Arxiv] HyperCube: Implicit Field Representations of Voxelized 3D Models
- [Arxiv] ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators
- [ICCV2021] Multiresolution Deep Implicit Functions for 3D Shape Representation
- [ICCV2021] Learning Canonical 3D Object Representation for Fine-Grained Recognition
- [Arxiv] Point Discriminative Learning for Unsupervised Representation Learning on 3D Point Clouds
- [Arxiv] A Deep Signed Directional Distance Function for Object Shape Representation
- [Arxiv] 3D Neural Scene Representations for Visuomotor Control [Project]
- [Arxiv] A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation [Project]
- [Arxiv] ShapeMOD: Macro Operation Discovery for 3D Shape Programs [Project]
- [Arxiv] CoCoNets: Continuous Contrastive 3D Scene Representations [Project]
- [Arxiv] DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes with Biharmonic Coordinates [Project]
Before 2021
- [CVPR2021] clDice - a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation [github]
- [CVPR2021] Point2Skeleton: Learning Skeletal Representations from Point Clouds [pytorch]
- [Arxiv] ParaNet: Deep Regular Representation for 3D Point Clouds
- [Arxiv] Geometric Adversarial Attacks and Defenses on 3D Point Clouds [tensorflow]
- [Arxiv] Learning Category-level Shape Saliency via Deep Implicit Surface Networks
- [Arxiv] pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
- [Arxiv] Deep Implicit Templates for 3D Shape Representation
- [NeurIPS2020] MetaSDF: Meta-learning Signed Distance Functions [Project]
- [Arxiv] RISA-Net: Rotation-Invariant Structure-Aware Network for Fine-Grained 3D Shape Retrieval [tensorflow]
- [Arxiv] Overfit Neural Networks as a Compact Shape Representation
- [Arxiv] DSM-Net: Disentangled Structured Mesh Net for Controllable Generation of Fine Geometry [Project]
- [Arxiv] PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations
- [Arxiv] CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations
- [Arxiv] RocNet: Recursive Octree Network for Efficient 3D Deep Representation
- [ECCV2020] GeLaTO: Generative Latent Textured Objects [Project]
- [ECCV2020] Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry
- [Arxiv] Neural Sparse Voxel Fields
- [CVPR2020] StructEdit: Learning Structural Shape Variations [github]
- [Arxiv] PAI-GCN: Permutable Anisotropic Graph Convolutional Networks for 3D Shape Representation Learning [github]
- [CVPR2020] Learning Generative Models of Shape Handles [Project page]
- [CVPR2020] DualSDF: Semantic Shape Manipulation using a Two-Level Representation [github]
- [CVPR2020] Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image [pytorch]
- [NeurIPS2019] Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations [pytorch]
- [Arxiv] Label-Efficient Learning on Point Clouds using Approximate Convex Decompositions
- [Arxiv] Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
- [Arxiv] Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction
- [Arxiv] SeqXY2SeqZ: Structure Learning for 3D Shapes by Sequentially Predicting 1D Occupancy Segments From 2D Coordinates
- [CVPR2020] D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features
- [Arxiv] Implicit Geometric Regularization for Learning Shapes
- [Arxiv] Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks
- [Arxiv] Adversarial Generation of Continuous Implicit Shape Representations [pytorch]
- [Arxiv] A Novel Tree-structured Point Cloud Dataset For Skeletonization Algorithm Evaluation [dataset]
- [CVPRW2019] SkelNetOn 2019: Dataset and Challenge on Deep Learning for Geometric Shape Understanding [project]
- [Arxiv] Skeleton Extraction from 3D Point Clouds by Decomposing the Object into Parts
- [Arxiv] InSphereNet: a Concise Representation and Classification Method for 3D Object
- [Arxiv] Deep Structured Implicit Functions
- [CVIU] 3D articulated skeleton extraction using a single consumer-grade depth camera
- [ICLR2019] Point Cloud GAN [tensorflow]
- [ICCV2019] Learning Shape Templates with Structured Implicit Functions
- [ICCV2019] 3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions [pytorch]
- [ICCV2019] Implicit Surface Representations as Layers in Neural Networks
- [CVPR2019] DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation [pytorch] 🔥 ⭐
- [SIGGRAPH2019] StructureNet: Hierarchical Graph Networks for 3D Shape Generation [pytorch]
- [SIGGRAPH Asia2019] LOGAN: Unpaired Shape Transform in Latent Overcomplete Space [tensorflow]
- [TOG] Voxel Cores: Efficient, robust, and provably good approximation of 3D medial axes
- [SIGGRAPH2018] P2P-NET: Bidirectional Point Displacement Net for Shape Transform [tensorflow]
- [ICML2018] Learning Representations and Generative Models for 3D Point Clouds [tensorflow] 🔥 ⭐
- [NeurIPS2018] Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning [tensorflow][project page] ⭐ 🔥
- [AAAI2018] Unsupervised Articulated Skeleton Extraction from Point Set Sequences Captured by a Single Depth Camera
- [3DV2018] Parsing Geometry Using Structure-Aware Shape Templates
- [SIGGRAPH2017] GRASS: Generative Recursive Autoencoders for Shape Structures [pytorch] 🔥
- [TOG] Erosion Thickness on Medial Axes of 3D Shapes
- [Vis Comput] Distance field guided L1-median skeleton extraction
- [CGF] Contracting Medial Surfaces Isotropically for Fast Extraction of Centred Curve Skeletons
- [CGF] Improved Use of LOP for Curve Skeleton Extraction
- [SIGGRAPH Asia2015] Deep Points Consolidation [C++ & Qt]
- [SIGGRAPH2015] Burning The Medial Axis
- [SIGGRAPH2009] Curve Skeleton Extraction from Incomplete Point Cloud [matlab] ⭐
- [TOG] SDM-NET: deep generative network for structured deformable mesh
- [TOG] Robust and Accurate Skeletal Rigging from Mesh Sequences 🔥
- [TOG] L1-medial skeleton of point cloud [C++] 🔥
- [EUROGRAPHICS2016] 3D Skeletons: A State-of-the-Art Report 🔥
- [SGP2012] Mean Curvature Skeletons [C++] 🔥
- [SMIC2010] Point Cloud Skeletons via Laplacian-Based Contraction [Matlab] 🔥
Shape & Scene Completion
- [ECCV2022] CompNVS: Novel View Synthesis with Scene Completion
- [ECCV2022] PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation [Project]
- [Arxiv] SRPCN: Structure Retrieval based Point Completion Network
- [ICRA2022] Temporal Point Cloud Completion with Pose Disturbance
- [Arxiv] Towards realistic symmetry-based completion of previously unseen point clouds [github]
Before 2022
- [AAAI2022] Not All Voxels Are Equal: Semantic Scene Completion from the Point-Voxel Perspective
- [AAAI2022] Attention-based Transformation from Latent Features to Point Clouds
- [Arxiv] MonoScene: Monocular 3D Semantic Scene Completion [Project]
- [Arxiv] Semi-supervised Implicit Scene Completion from Sparse LiDAR [github]
- [NeurIPS2021] Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion [github]
- [Arxiv] PU-Transformer: Point Cloud Upsampling Transformer
- [BMVC2021] Self-Supervised Point Cloud Completion via Inpainting
- [IROS2021] Graph-Guided Deformation for Point Cloud Completion
- [IROS2021] Semantic Segmentation-assisted Scene Completion for LiDAR Point Clouds [github]
- [Arxiv] 3D Point Cloud Completion with Geometric-Aware Adversarial Augmentation
- [Arxiv] PC2-PU: Patch Correlation and Position Correction for Effective Point Cloud Upsampling
- [ICCV2021] Voxel-based Network for Shape Completion by Leveraging Edge Generation [github]
- [ICCV2021] PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers [github]
- [ICCV2021] SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer [github]
- [Arxiv] CarveNet: Carving Point-Block for Complex 3D Shape Completion
- [IJCAI2021] IMENet: Joint 3D Semantic Scene Completion and 2D Semantic Segmentation through Iterative Mutual Enhancement
- [CVPR2021] Point Cloud Upsampling via Disentangled Refinement [github]
- [TVCG2021] Consistent Two-Flow Network for Tele-Registration of Point Clouds [Project]
- [Arxiv] 4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface [Project]
- [CVPR2021] Unsupervised 3D Shape Completion through GAN Inversion [Project]
- [Arxiv] ASFM-Net: Asymmetrical Siamese Feature Matching Network for Point Completion
- [CVPR2021] Variational Relational Point Completion Network [Project]
- [CVPR2021] View-Guided Point Cloud Completion
- [CVPR2021] Semantic Scene Completion via Integrating Instances and Scene in-the-Loop [pytorch]
- [CVPR2021] Denoise and Contrast for Category Agnostic Shape Completion
- [CVPR2021] Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding
- [CVPR2021] PMP-Net: Point Cloud Completion by Learning Multi-step Point Moving Paths
- [CVPR2021] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion
- [Arxiv] VPC-Net: Completion of 3D Vehicles from MLS Point Clouds
Before 2021
- [Arxiv] PMP-Net: Point Cloud Completion by Learning Multi-step Point Moving Paths
- [Arxiv] S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds
- [Arxiv] Semantic Scene Completion using Local Deep Implicit Functions on LiDAR Data
- [Arxiv] Learning-based 3D Occupancy Prediction for Autonomous Navigation in Occluded Environments
- [3DV2020] SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion
- [Arxiv] Refinement of Predicted Missing Parts Enhance Point Cloud Completion [pytorch]
- [Arxiv] Unsupervised Partial Point Set Registration via Joint Shape Completion and Registration
- [Arxiv] LMSCNet: Lightweight Multiscale 3D Semantic Completion [Demo]
- [ECCV2020] SoftPoolNet: Shape Descriptor for Point Cloud Completion and Classification
- [ECCV2020] Weakly-supervised 3D Shape Completion in the Wild
- [Arxiv] Point Cloud Completion by Learning Shape Priors
- [Arxiv] KAPLAN: A 3D Point Descriptor for Shape Completion
- [Arxiv] SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
- [Arxiv] GRNet: Gridding Residual Network for Dense Point Cloud Completion
- [Arxiv] Deep Octree-based CNNs with Output-Guided Skip Connections for 3D Shape and Scene Completion
- [CVPR2020] Point Cloud Completion by Skip-attention Network with Hierarchical Folding
- [CVPR2020] Cascaded Refinement Network for Point Cloud Completion [github]
- [CVPR2020] Anisotropic Convolutional Networks for 3D Semantic Scene Completion [github]
- [AAAI2020] Attention-based Multi-modal Fusion Network for Semantic Scene Completion
- [CVPR2020] 3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior [github]
- [ECCV2020] Multimodal Shape Completion via Conditional Generative Adversarial Networks [pytorch]
- [CVPR2020] RevealNet: Seeing Behind Objects in RGB-D Scans
- [CVPR2020] Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
- [CVPR2020] PF-Net: Point Fractal Network for 3D Point Cloud Completion
- [Arxiv] 3D Gated Recurrent Fusion for Semantic Scene Completion
- [ICCVW2019] EdgeConnect: Structure Guided Image Inpainting using Edge Prediction [pytorch] 🔥 ⭐
- [ICRA2020] Depth Based Semantic Scene Completion with Position Importance Aware Loss
- [CVPR2020] SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans
- [Arxiv] PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
- [ICLR2020] Unpaired Point Cloud Completion on Real Scans using Adversarial Training [tensorflow]
- [AAAI2020] Morphing and Sampling Network for Dense Point Cloud Completion [pytorch]
- [ICCVW2019] Render4Completion: Synthesizing Multi-View Depth Maps for 3D Shape Completion
- [ICCV2019] ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image [tensorflow]
- [ICCV2019] Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion [Caffe3D]
- [ICCV2019] Multi-Angle Point Cloud-VAE: Unsupervised Feature Learning for 3D Point Clouds from Multiple Angles by Joint Self-Reconstruction and Half-to-Half Prediction
- [Arxiv] EdgeNet: Semantic Scene Completion from RGB-D images
- [CVPR2019] TopNet: Structural Point Cloud Decoder [pytorch & tensorflow]
- [CVPR2019] Deep Reinforcement Learning of Volume-guided Progressive View Inpainting for 3D Point Scene Completion from a Single Depth Image
- [CVPR2019] Leveraging Shape Completion for 3D Siamese Tracking [pytorch]
- [CVPR2019] RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion [pytorch]
- [3DV2018] PCN: Point Completion Network [tensorflow] 🔥
- [ECCV2018] Efficient Semantic Scene Completion Network with Spatial Group Convolution [pytorch]
- [CVPR2018] ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans [tensorflow] 🔥 ⭐
- [CVPR2018] Learning 3D Shape Completion from Laser Scan Data with Weak Supervision [torch][torch]
- [IJCV2018] Learning 3D Shape Completion under Weak Supervision [torch][torch]
- [ICCV2017] High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference ⭐
- [ICCV2017] Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis [torch] 🔥 ⭐
- [CVPR2017] Semantic Scene Completion from a Single Depth Image [caffe] 🔥 ⭐
- [CVPR2016] Structured Prediction of Unobserved Voxels From a Single Depth Image [resource] ⭐
Shape Reconstruction & Generation
- [Arxiv] PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion [Project]
- [Arxiv] 3D-aware Image Generation using 2D Diffusion Models [Project]
- [Arxiv] HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion [Project]
- [Arxiv] DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model [Project]
- [Arxiv] Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior [Project]
- [Arxiv] RealFusion: 360° Reconstruction of Any Object from a Single Image [Project]
- [Arxiv] 3DGen: Triplane Latent Diffusion for Textured Mesh Generation
- [Arxiv] Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation [Project]
- [CVPR2023] Controllable Mesh Generation Through Sparse Latent Point Diffusion Models [Project]
- [CVPR2023] NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images [Project]
- [ICLR2023] MeshDiffusion: Score-based Generative 3D Mesh Modeling [Project]
- [CVPR2023] PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision [Project]
- [Arxiv] Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions [Project]
- [CVPR2023] SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field [Project]
- [Arxiv] NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
- [Arxiv] Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement [Project]
- [Arxiv] 3D generation on ImageNet [Project]
- [Arxiv] Text-driven Visual Synthesis with Latent Diffusion Prior [Project]
- [Arxiv] VQ3D: Learning a 3D-Aware Generative Model on ImageNet [Project]
- [Arxiv] TEXTure: Text-Guided Texturing of 3D Shapes [Project]
- [Arxiv] LEGO-Net: Learning Regular Rearrangements of Objects in Rooms [Project]
- [Arxiv] DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis [Project]
- [Arxiv] GeoCode: Interpretable Shape Programs [Project]
- [Arxiv] Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models [Project]
- [Arxiv] Point-E: A System for Generating 3D Point Clouds from Complex Prompts [Project]
- [Arxiv] LoopDraw: a Loop-Based Autoregressive Model for Shape Synthesis and Editing
- [Arxiv] SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation [Project]
- [Arxiv] NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors
- [Arxiv] Diffusion-SDF: Text-to-Shape via Voxelized Diffusion [Project]
- [Arxiv] 3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models
- [Arxiv] Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation [Project]
- [Arxiv] SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction [Project]
- [Arxiv] 3D Neural Field Generation using Triplane Diffusion [Project]
- [Arxiv] Neural Volumetric Mesh Generator
- [Arxiv] Tetrahedral Diffusion Models for 3D Shape Generation
- [Arxiv] MagicPony: Learning Articulated 3D Animals in the Wild [Project]
- [Arxiv] RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation [Project]
- [Arxiv] Magic3D: High-Resolution Text-to-3D Content Creation [Project]
- [Arxiv] Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
- [NeurIPS2022] LION: Latent Point Diffusion Models for 3D Shape Generation [Project]
- [NeurIPS2022] GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images [Project]
- [ECCV2022] Cross-Modal 3D Shape Generation and Manipulation [Project]
- [ECCV2022] Deforming Radiance Fields with Cages
- [NeurIPS2021] NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild [Project]
- [CVPR2022] CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation [github]
- [CVPR2022] Multi-View Mesh Reconstruction with Neural Deferred Shading [Project]
- [Arxiv] Neural Surface Reconstruction of Dynamic Scenes with Monocular RGB-D Camera [Project]
- [Arxiv] Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model
- [Arxiv] 3DILG: Irregular Latent Grids for 3D Generative Modeling [Project]
- [CVPR2022] FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction [Project]
- [CVPR2022] Topologically-Aware Deformation Fields for Single-View 3D Reconstruction [Project]
- [Arxiv] Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues [Project]
- [Arxiv] Neural Vector Fields for Surface Representation and Inference
- [CVPR2022] Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction [Project]
- [CVPR2022] BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information [Project]
- [CVPR2022] φ-SfT: Shape-from-Template with a Physics-Based Deformation Model [Project]
- [CVPR2022] OcclusionFusion: Occlusion-aware Motion Estimation for Real-time Dynamic 3D Reconstruction [Project]
- [Arxiv] Neural Dual Contouring
- [Arxiv] POCO: Point Convolution for Surface Reconstruction [Project]
- [ICCV2021] SurfGen: Adversarial 3D Shape Synthesis with Explicit Surface Discriminators [github]
Before 2022
- [Arxiv] DoodleFormer: Creative Sketch Drawing with Transformers
- [NeurIPS2021] Class-agnostic Reconstruction of Dynamic Objects from Videos [Project]
- [Arxiv] The Shape Part Slot Machine: Contact-based Reasoning for Generating 3D Shapes from Parts
- [Arxiv] MeshUDF: Fast and Differentiable Meshing of Unsigned Distance Field Networks [github]
- [Arxiv] TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers [github]
- [Arxiv] JoinABLe: Learning Bottom-up Assembly of Parametric CAD Joints
- [Arxiv] Image Based Reconstruction of Liquids from 2D Surface Detections
- [Arxiv] TaylorImNet for Fast 3D Shape Reconstruction Based on Implicit Surface Function
- [NeurIPS2021] Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [Project]
- [ICML2021] Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces [tensorflow]
- [Arxiv] StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation [Project]
- [3DV2021] High Fidelity 3D Reconstructions with Limited Physical Views [Project]
- [3DV2021] Multi-Category Mesh Reconstruction From Image Collections [github]
- [Arxiv] Style Agnostic 3D Reconstruction via Adversarial Style Transfer [https://github.com/Felix-Petersen/style-agnostic-3d-reconstruction]
- [Arxiv] BANMo: Building Animatable 3D Neural Models from Many Casual Videos [Project]
- [Arxiv] EditVAE: Unsupervised Part-Aware Controllable 3D Point Cloud Shape Generation
- [Arxiv] Differentiable Stereopsis: Meshes from multiple views using differentiable rendering [Project]
- [ICCV2021] Neural Strokes: Stylized Line Drawing of 3D Shapes
- [ACMMM2021] Single Image 3D Object Estimation with Primitive Graph Networks
- [Arxiv] Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
- [Arxiv] ABO: Dataset and Benchmarks for Real-World 3D Object Understanding [Project]
- [ICCV2021] Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction [github]
- [Arxiv] Learnable Triangulation for Deep Learning-based 3D Reconstruction of Objects of Arbitrary Topology from Single RGB Images
- [ICCV2021] Learning Signed Distance Field for Multi-view Surface Reconstruction
- [Arxiv] Image2Lego: Customized LEGO Set Generation from Images
- [ICCV2021] Unsupervised Learning of Fine Structure Generation for 3D Point Clouds by 2D Projection Matching [github]
- [Arxiv] Object Wake-up: 3-D Object Reconstruction, Animation, and in-situ Rendering from a Single Image
- [Arxiv] DOVE: Learning Deformable 3D Objects by Watching Videos [Project]
- [Arxiv] Active 3D Shape Reconstruction from Vision and Touch
- [NeurIPS2020] 3D Shape Reconstruction from Vision and Touch [pytorch]
- [Arxiv] LegoFormer: Transformers for Block-by-Block Multi-view 3D Reconstruction
- [Arxiv] Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects
- [Arxiv] View Generalization for Single Image Textured 3D Models [Project]
- [Arxiv] Shape As Points: A Differentiable Poisson Solver
- [Arxiv] Neural Implicit 3D Shapes from Single Images with Spatial Patterns
- [IJCAI2021] Spline Positional Encoding for Learning 3D Implicit Signed Distance Fields
- [Arxiv] Z2P: Instant Rendering of Point Clouds
- [CVPR2021] Multi-view 3D Reconstruction of a Texture-less Smooth Surface of Unknown Generic Reflectance
- [CVPR2021] Birds of a Feather: Capturing Avian Shape Models from Images [Project]
- [Arxiv] DeepCAD: A Deep Generative Network for Computer-Aided Design Models
- [Arxiv] StrobeNet: Category-Level Multiview Reconstruction of Articulated Objects
- [CVPR2021] Sketch2Model: View-Aware 3D Modeling from Single Free-Hand Sketches
- [Arxiv] Sign-Agnostic CONet: Learning Implicit Surface Reconstructions by Sign-Agnostic Optimization of Convolutional Occupancy Networks
- [IJCAI2021] PointLIE: Locally Invertible Embedding for Point Cloud Sampling and Recovery
- [Arxiv] UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
- [CVPR2021] Shape and Material Capture at Home
- [CVPR2021] StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision [Project]
- [Arxiv] CAPRI-Net: Learning Compact CAD Shapes with Adaptive Primitive Assembly
- [CVPR2021] Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction [Project]
- [CVPR2021] Online Learning of a Probabilistic and Adaptive Scene Representation
- [CVPR2021] Fostering Generalization in Single-view 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors
- [Arxiv] Sketch2Mesh: Reconstructing and Editing 3D Shapes from Sketches
- [CVPR2021] Deep Implicit Moving Least-Squares Functions for 3D Reconstruction [Project]
- [Arxiv] PC2WF: 3D Wireframe Reconstruction from Raw Point Clouds
- [CVPR2021] Diffusion Probabilistic Models for 3D Point Cloud Generation [Project]
- [Arxiv] ShaRF: Shape-conditioned Radiance Fields from a Single View [Project]
- [Arxiv] Shelf-Supervised Mesh Prediction in the Wild
- [Arxiv] HyperPocket: Generative Point Cloud Completion
- [Arxiv] Im2Vec: Synthesizing Vector Graphics without Vector Supervision [resource]
- [Arxiv] Secrets of 3D Implicit Object Shape Reconstruction in the Wild
- [Arxiv] Joint Learning of 3D Shape Retrieval and Deformation
- [Arxiv] Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes
Before 2021
- [Arxiv] Learning Delaunay Surface Elements for Mesh Reconstruction
- [Arxiv] Compositionally Generalizable 3D Structure Prediction
- [Arxiv] Online Adaptation for Consistent Mesh Reconstruction in the Wild
- [Arxiv] Sign-Agnostic Implicit Learning of Surface Self-Similarities for Shape Modeling and Reconstruction from Raw Point Clouds
- [Arxiv] Deep Optimized Priors for 3D Shape Modeling and Reconstruction
- [Arxiv] Do 2D GANs Know 3D Shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs [Project]
- [Arxiv] DUDE: Deep Unsigned Distance Embeddings for Hi-Fidelity Representation of Complex 3D Surfaces
- [3DV2020] Learning to Infer Semantic Parameters for 3D Shape Editing [Project]
- [3DV2020] Cycle-Consistent Generative Rendering for 2D-3D Modality Translation [Project]
- [3DV2020] A Divide et Impera Approach for 3D Shape Reconstruction from Multiple Views
- [Arxiv] A Closed-Form Solution to Local Non-Rigid Structure-from-Motion
- [Arxiv] Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence
- [Arxiv] D-NeRF: Neural Radiance Fields for Dynamic Scenes
- [Arxiv] Modular Primitives for High-Performance Differentiable Rendering
- [CVPR2021] NeuralFusion: Online Depth Fusion in Latent Space
- [Arxiv] Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video [Project]
- [NeurIPS2020] Continuous Object Representation Networks: Novel View Synthesis without Target View Supervision [Project]
- [NeurIPS2020] SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images [Project]
- [NeurIPS2020] Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance [Project]
- [NeurIPS2020] Convolutional Generation of Textured 3D Meshes [Project]
- [Arxiv] Vid2CAD: CAD Model Alignment using Multi-View Constraints from Videos
- [NeurIPS2020] UCLID-Net: Single View Reconstruction in Object Space [Project]
- [NeurIPS2020] CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations [Project]
- [NeurIPS2020] Generative 3D Part Assembly via Dynamic Graph Learning [pytorch]
- [NeurIPS2020] Learning Deformable Tetrahedral Meshes for 3D Reconstruction [Project]
- [NeurIPS2020] SoftFlow: Probabilistic Framework for Normalizing Flow on Manifolds [pytorch]
- [Arxiv] Training Data Generating Networks: Linking 3D Shapes and Few-Shot Classification
- [Arxiv] MeshMVS: Multi-View Stereo Guided Mesh Reconstruction
- [Arxiv] Learning Occupancy Function from Point Clouds for Surface Reconstruction
- [Arxiv] GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering [github]
- [3DV2020] A Progressive Conditional Generative Adversarial Network for Generating Dense and Colored 3D Point Clouds
- [3DV2020] Better Patch Stitching for Parametric Surface Reconstruction
- [NeurIPS2020] Skeleton-bridged Point Completion: From Global Inference to Local Adjustment [Project Page]
- [Arxiv] NeRF++: Analyzing and Improving Neural Radiance Fields [pytorch]
- [Arxiv] Improved Modeling of 3D Shapes with Multi-view Depth Maps
- [SIGGRAPH2020] One Shot 3D Photography [Project]
- [BMVC2020] Large Scale Photometric Bundle Adjustment
- [ECCV2020] Interactive Annotation of 3D Object Geometry using 2D Scribbles [Project]
- [BMVC2020] Visibility-aware Multi-view Stereo Network
- [ECCV2020] Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images
- [ECCV2020] 3D Bird Reconstruction: a Dataset, Model, and Shape Recovery from a Single View [Project][Pytorch]
- [BMVC2020] 3D-GMNet: Single-View 3D Shape Recovery as A Gaussian Mixture
- [SIGGRAPH2020] Self-Sampling for Neural Point Cloud Consolidation
- [ECCV2020] Stochastic Bundle Adjustment for Efficient and Scalable 3D Reconstruction [github]
- [Arxiv] NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections [Project]
- [Arxiv] MeshODE: A Robust and Scalable Framework for Mesh Deformation
- [Arxiv] MRGAN: Multi-Rooted 3D Shape Generation with Unsupervised Part Disentanglement
- [ECCV2020] Meshing Point Clouds with Predicted Intrinsic-Extrinsic Ratio Guidance [pytorch]
- [ECCV2020] Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop
- [ECCV2020] Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
- [ECCV2020] Shape and Viewpoint without Keypoints
- [Arxiv] Object-Centric Multi-View Aggregation
- [ECCV2020] Points2Surf: Learning Implicit Surfaces from Point Clouds
- [NeurIPS2020] Neural Mesh Flow: 3D Manifold Mesh Generation via Diffeomorphic Flows [Project]
- [Arxiv] Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images
- [Arxiv] Neural Non-Rigid Tracking
- [NeurIPS2020] MeshSDF: Differentiable Iso-Surface Extraction
- [Arxiv] 3D Reconstruction of Novel Object Shapes from Single Images
- [NeurIPS2020] ShapeFlow: Learnable Deformations Among 3D Shapes [pytorch]
- [Arxiv] 3D Shape Reconstruction from Free-Hand Sketches
- [Arxiv] Convolutional Occupancy Networks
- [Siggraph2020] Point2Mesh: A Self-Prior for Deformable Meshes
- [Arxiv] PointTriNet: Learned Triangulation of 3D Point Sets
- [Arxiv] A Simple and Scalable Shape Representation for 3D Reconstruction
- [Siggraph2020] Vid2Curve: Simultaneously Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video
- [CVPR2020] From Image Collections to Point Clouds with Self-supervised Shape and Pose Networks [tensorflow]
- [CVPR2020] Through the Looking Glass: Neural 3D Reconstruction of Transparent Shapes [github]
- [Arxiv] PolyGen: An Autoregressive Generative Model of 3D Meshes
- [Arxiv] Combinatorial 3D Shape Generation via Sequential Assembly
- [Arxiv] Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors
- [Arxiv] Neural Object Descriptors for Multi-View Shape Reconstruction
- [CVPR2020] SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings [pytorch]
- [Arxiv] Modeling 3D Shapes by Reinforcement Learning
- [ECCV2020] ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds [pytorch]
- [Arxiv] Self-Supervised 2D Image to 3D Shape Translation with Disentangled Representations
- [Arxiv] Universal Differentiable Renderer for Implicit Neural Representations
- [Arxiv] Learning 3D Part Assembly from a Single Image
- [Arxiv] Curriculum DeepSDF
- [Arxiv] PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions
- [Arxiv] Self-supervised Single-view 3D Reconstruction via Semantic Consistency
- [Arxiv] Meta3D: Single-View 3D Object Reconstruction from Shape Priors in Memory
- [Arxiv] STD-Net: Structure-preserving and Topology-adaptive Deformation Network for 3D Reconstruction from a Single Image [new]
- [Arxiv] Curvature Regularized Surface Reconstruction from Point Cloud
- [Arxiv] Hypernetwork approach to generating point clouds
- [Arxiv] Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data
- [Arxiv] Meshlet Priors for 3D Mesh Reconstruction
- [Arxiv] Front2Back: Single View 3D Shape Reconstruction via Front to Back Prediction
- [Arxiv] SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization
- [CVPR2019] Occupancy Networks: Learning 3D Reconstruction in Function Space [pytorch] 🔥 ⭐
- [NeurIPS2019] DISN: Deep Implicit Surface Network for High-quality Single-view 3D Reconstruction [tensorflow]
- [NeurIPS2019] Learning to Infer Implicit Surfaces without 3D Supervision
- [CVPR2019] A Skeleton-bridged Deep Learning Approach for Generating Meshes of Complex Topologies from Single RGB Images [pytorch & tensorflow]
- [Arxiv] Deep Level Sets: Implicit Surface Representations for 3D Shape Inference
- [CVPR2019] Learning Implicit Fields for Generative Shape Modeling [tensorflow] 🔥
- [ICCV2019] Point-based Multi-view Stereo Network [pytorch] ⭐
- [Arxiv] TSRNet: Scalable 3D Surface Reconstruction Network for Point Clouds using Tangent Convolution
- [Arxiv] DR-KFD: A Differentiable Visual Metric for 3D Shape Reconstruction
- [ICCV2019] GraphX-Convolution for Point Cloud Deformation in 2D-to-3D Conversion
- [ICCV2019] Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation [pytorch]
- [ICCV2019] Few-Shot Generalization for Single-Image 3D Reconstruction via Priors
- [ICCV2019] Deep Mesh Reconstruction from Single RGB Images via Topology Modification Networks
- [AAAI2018] Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction [tensorflow] ⭐ 🔥
- [NeurIPS2017] MarrNet: 3D Shape Reconstruction via 2.5D Sketches [torch] ⭐ 🔥
3D Scene Understanding
- [Arxiv] CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP
- [CVPR2023] Learning 3D Scene Priors with 2D Supervision [Project]
- [CVPR2023] Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors
- [Arxiv] Decoupling Human and Camera Motion from Videos in the Wild [Project]
- [CVPR2022] PhotoScene: Photorealistic Material and Lighting Transfer for Indoor Scenes [github]
- [Arxiv] Semantic Instance Segmentation of 3D Scenes Through Weak Bounding Box Supervision [Project]
- [CVPR2022] Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation [github]
- [CVPR2022] 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
- [CVPR2022] BEHAVE: Dataset and Method for Tracking Human Object Interactions [Project]
Before 2022
- [Arxiv] Transferable End-to-end Room Layout Estimation via Implicit Encoding [Project]
- [Arxiv] ScanQA: 3D Question Answering for Spatial Scene Understanding
- [Arxiv] 3D Question Answering
- [Arxiv] MVLayoutNet: 3D layout reconstruction with multi-view panoramas
- [SGP2021] Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms
- [Arxiv] 4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding
- [Arxiv] Pose2Room: Understanding 3D Scenes from Human Activities [Project]
- [NeurIPS2021] SEAL: Self-supervised Embodied Active Learning using Exploration and 3D Consistency [Project]
- [Arxiv] D3Net: A Speaker-Listener Architecture for Semi-supervised Dense Captioning and Visual Grounding in RGB-D Scans [Project]
- [Arxiv] Recognizing Scenes from Novel Viewpoints
- [Arxiv] Putting 3D Spatially Sparse Networks on a Diet
- [Arxiv] Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing [github]
- [NeurIPS2021] Neural Scene Flow Prior [github]
- [ICCV2021] Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images [Project]
- [Arxiv] RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View
- [EMNLP2021] Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments [Project]
- [Arxiv] KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D [Project]
- [CVPR2021] OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets [github]
- [Arxiv] Pointly-supervised 3D Scene Parsing with Viewpoint Bottleneck [github]
- [TPAMI2021] Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [github]
- [Arxiv] PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds [github]
- [Arxiv] Residual 3D Scene Flow Learning with Context-Aware Feature Extraction
- [ICCV2021] Learning to Generate Scene Graph from Natural Language Supervision [github]
- [ICCV2021] The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation [Project]
- [ICCV2021] Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs
- [ICCV2021] PICCOLO: Point Cloud-Centric Omnidirectional Localization
- [ICCV2021] Unconditional Scene Graph Generation
- [Arxiv] Learning Indoor Layouts from Simple Point-Clouds
- [Arxiv] LanguageRefer: Spatial-Language Model for 3D Visual Grounding
- [Arxiv] WiCluster: Passive Indoor 2D/3D Positioning using WiFi without Precise Labels
- [CVPR2021] Zillow Indoor Dataset: Annotated Floor Plans With 360° Panoramas and 3D Room Layouts [github]
- [ICRA2021] Efficient and Robust LiDAR-Based End-to-End Navigation [Project]
- [ICLR2021] VTNet: Visual Transformer Network for Object Goal Navigation
- [CVPR2021] Self-Point-Flow: Self-Supervised Scene Flow Estimation from Point Clouds with Optimal Transport and Random Walk
- [CVPR2021] HCRF-Flow: Scene Flow from Point Clouds with Continuous High-order CRFs and Position-aware Flow Embedding
- [Arxiv] FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting
- [Arxiv] SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation
- [Arxiv] Collision Replay: What Does Bumping Into Things Tell You About Scene Geometry? [Project]
- [Arxiv] Pri3D: Can 3D Priors Help 2D Representation Learning?
- [Arxiv] LaLaLoc: Latent Layout Localisation in Dynamic, Unvisited Environments
- [CVPRW] OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas [github]
- [Arxiv] Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image [pytorch]
- [Arxiv] SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds with 1000× Fewer Labels [github]
- [CVPR2021] FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds
- [CVPR2021] Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud [github]
- [ICRA] Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments [Project]
- [Arxiv] Contextual Scene Augmentation and Synthesis via GSACNet
- [Arxiv] In-Place Scene Labelling and Understanding with Implicit Scene Representation
- [CVPR2021] Bidirectional Projection Network for Cross Dimension Scene Understanding [github]
- [CVPR2021] Visual Room Rearrangement [Project]
- [Arxiv] MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans
- [Arxiv] Structured Scene Memory for Vision-Language Navigation
- [Arxiv] House-GAN++: Generative Adversarial Layout Refinement Networks
- [Arxiv] Weakly Supervised Learning of Rigid 3D Scene Flow
- [ICLR2021] End-to-End Egospheric Spatial Memory
- [Arxiv] Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas [Project]
- [Arxiv] A modular vision language navigation and manipulation framework for long horizon compositional tasks in indoor environment
- [Arxiv] Deep Reinforcement Learning for Producing Furniture Layout in Indoor Scenes
- [Arxiv] Where2Act: From Pixels to Actions for Articulated 3D Objects [Project]
Before 2021
- [Arxiv] PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things
- [Arxiv] AI2-THOR: An Interactive 3D Environment for Visual AI [Project]
- [Arxiv] Audio-Visual Floorplan Reconstruction
- [Arxiv] PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds
- [Arxiv] RAFT-3D: Scene Flow using Rigid-Motion Embeddings
- [Arxiv] GenScan: A Generative Method for Populating Parametric 3D Scan Datasets
- [Arxiv] LayoutGMN: Neural Graph Matching for Structural Layout Similarity
- [Arxiv] Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences
- [Arxiv] P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding
- [Arxiv] Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net
- [Arxiv] Localising In Complex Scenes Using Balanced Adversarial Adaptation
- [Arxiv] Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis
- [NeurIPS2020] Multi-Plane Program Induction with 3D Box Priors [Project]
- [Arxiv] HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features
- [Arxiv] Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
- [Arxiv] Generative Layout Modeling using Constraint Graphs
- [NeurIPS2020] Rel3D: A Minimally Contrastive Benchmark for Grounding Spatial Relations in 3D [pytorch]
- [NeurIPS2020] Learning Affordance Landscapes for Interaction Exploration in 3D Environments [Project]
- [NeurIPS2020W] Unsupervised Domain Adaptation for Visual Navigation
- [Arxiv] Embodied Visual Navigation with Automatic Curriculum Learning in Real Environments
- [Arxiv] 3D Room Layout Estimation Beyond the Manhattan World Assumption
- [Arxiv] OpenBot: Turning Smartphones into Robots [Project]
- [Arxiv] Audio-Visual Waypoints for Navigation
- [ECCV2020] Occupancy Anticipation for Efficient Exploration and Navigation [Project]
- [Arxiv] Retargetable AR: Context-aware Augmented Reality in Indoor Scenes based on 3D Scene Graph
- [Arxiv] Generating Person-Scene Interactions in 3D Scenes
- [Arxiv] GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes
- [ECCV2020] ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes
- [Arxiv] Structural Plan of Indoor Scenes with Personalized Preferences
- [Arxiv] HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures [Project]
- [CVPR2020] End-to-End Optimization of Scene Layout [Project]
- [Arxiv] Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships
- [CVPR2020] Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions
- [Arxiv] LayoutMP3D: Layout Annotation of Matterport3D
- [CVPR2020] Local Implicit Grid Representations for 3D Scenes
- [Arxiv] Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes
- [CVPR2020] RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds [tensorflow] 🔥
- [CVPR2020] Intelligent Home 3D: Automatic 3D-House Design from Linguistic Descriptions Only
- [ICRA2020] 3DCFS: Fast and Robust Joint 3D Semantic-Instance Segmentation via Coupled Feature Selection
- [Arxiv] Indoor Scene Recognition in 3D
- [Journal] Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense
- [Arxiv] BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
- [Arxiv] 3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans [Project] Related: [Arxiv] [Arxiv]
- [ICCV2019] U4D: Unsupervised 4D Dynamic Scene Understanding
- [ICCV2019] UprightNet: Geometry-Aware Camera Orientation Estimation from Single Images
- [ICCV2019] Habitat: A Platform for Embodied AI Research [habitat-api] [habitat-sim] ⭐
- [ICCV2019] SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences [project page] ⭐
- [ICCV2019] Neural Inverse Rendering of an Indoor Scene From a Single Image
- [ICCV2019] SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation [pytorch]
- [ICCV2019] RIO: 3D Object Instance Re-Localization in Changing Indoor Environments [dataset]
- [ICCV2019] CamNet: Coarse-to-Fine Retrieval for Camera Re-Localization
- [NeurIPS2018] Learning to Exploit Stability for 3D Scene Parsing
3D Scene Reconstruction & Generation
- [Arxiv] Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion [Project]
- [Arxiv] FastSurf: Fast Neural RGB-D Surface Reconstruction using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning
- [CVPR2023] I$^2$-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs [Project]
- [Arxiv] CC3D: Layout-Conditioned Generation of Compositional 3D Scenes [Project]
- [Arxiv] RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction
- [Arxiv] Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation
- [Arxiv] Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models [Project]
- [Arxiv] Compositional 3D Scene Generation using Locally Conditioned Diffusion [Project]
- [Arxiv] Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes [Project]
- [BMVC2022] SPARC: Sparse Render-and-Compare for CAD model alignment in a single RGB image [github]
- [Arxiv] NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM
- [Arxiv] Text-To-4D Dynamic Scene Generation
- [Arxiv] Behind the Scenes: Density Fields for Single View Reconstruction [Project]
- [Arxiv] MIME: Human-Aware 3D Scene Generation [Project]
- [CVPR2022] PlaneMVS: 3D Plane Reconstruction from Multi-View Stereo
- [CVPR2022] Neural 3D Scene Reconstruction with the Manhattan-world Assumption [Project]
- [CVPR2022] 3D Scene Painting via Semantic Image Synthesis
- [Siggraph2022] SNeRF: Stylized Neural Implicit Representations for 3D Scenes [Project]
- [Siggraph2022] Neural 3D Reconstruction in the Wild [Project]
- [Arxiv] GO-Surf: Neural Feature Grid Optimization for Fast, High-Fidelity RGB-D Surface Reconstruction [Project]
- [Arxiv] RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers
- [Arxiv] iSDF: Real-Time Neural Signed Distance Fields for Robot Perception [Project]
- [Arxiv] NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors [Project]
- [CVPR2022] PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos [Project]
- [CVPR2022] Learning 3D Object Shape and Layout without 3D Supervision [Project]
- [Arxiv] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction [Project]
- [Arxiv] BlobGAN: Spatially Disentangled Scene Representations [Project]
- [CVPR2022] NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction
- [Arxiv] ATEK: Augmenting Transformers with Expert Knowledge for Indoor Layout Synthesis
Before 2022
- [Arxiv] IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo [github]
- [Arxiv] What's Behind the Couch? Directed Ray Distance Functions (DRDF) for 3D Scene Reconstruction [Project]
- [Arxiv] Input-level Inductive Biases for 3D Reconstruction
- [Arxiv] ROCA: Robust CAD Model Retrieval and Alignment from a Single Image
- [Arxiv] Multi-View Stereo with Transformer
- [3DV2021] 3DVNet: Multi-View Depth Prediction and Volumetric Refinement
- [Arxiv] VoRTX: Volumetric 3D Reconstruction With Transformers for Voxelwise View Selection and Fusion
- [Arxiv] CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-scale Indoor Scene
- [Arxiv] Joint stereo 3D object detection and implicit surface reconstruction
- [CoRL2021] TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [Project]
- [NeurIPS2021] Voxel-based 3D Detection and Reconstruction of Multiple Objects from a Single Image [Project]
- [NeurIPS2021] Panoptic 3D Scene Reconstruction From a Single RGB Image
- [Arxiv] NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [Project]
- [BMVC2021] PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image [github]
- [ICCV2021] Scene Synthesis via Uncertainty-Driven Attribute Synchronization [github]
- [NeurIPS2021] ATISS: Autoregressive Transformers for Indoor Scene Synthesis [Project]
- [ICCV2021] Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting
- [Arxiv] Black-Box Test-Time Shape REFINEment for Single View 3D Reconstruction
- [Arxiv] Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images
- [ICCV2021] Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Learned Virtual View Visibility [github]
- [ICCV2021] 3DIAS: 3D Shape Reconstruction with Implicit Algebraic Surfaces [Project]
- [ICCV2021] VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction
- [Arxiv] AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network
- [Arxiv] NeuralMVS: Bridging Multi-View Stereo and Novel View Synthesis
- [ICCV2021] Out-of-Core Surface Reconstruction via Global $TGV$ Minimization
- [ICCV2021] Discovering 3D Parts from Image Collections [Project]
- [ICCV2021] PlaneTR: Structure-Guided Transformers for 3D Plane Recovery [pytorch]
- [Arxiv] TransformerFusion: Monocular RGB Scene Reconstruction using Transformers [Project]
- [Arxiv] Indoor Panorama Planar 3D Reconstruction via Divide and Conquer
- [Arxiv] NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction
- [CVPR2021] Mirror3D: Depth Refinement for Mirror Surfaces [Project]
- [CVPR2021] Plan2Scene: Converting Floorplans to 3D Scenes [Project]
- [Arxiv] Translational Symmetry-Aware Facade Parsing for 3D Building Reconstruction
- [Arxiv] Learning to Stylize Novel Views [Project]
- [Arxiv] Stylizing 3D Scene via Implicit Representation and HyperNetwork
- [CVPR2021] SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data [Project]
- [Arxiv] The Boombox: Visual Reconstruction from Acoustic Vibrations [Project]
- [Arxiv] Joint Pose and Shape Estimation of Vehicles from LiDAR Data
- [CVPR2021] NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video [Project]
- [Arxiv] DDR-Net: Learning Multi-Stage Multi-View Stereo With Dynamic Depth Range [pytorch]
- [Arxiv] Planar Surface Reconstruction from Sparse Views [Project]
- [Arxiv] Neural RGB-D Surface Reconstruction
- [Arxiv] RetrievalFuse: Neural 3D Scene Reconstruction with a Database
- [ICCV2021] PlenOctrees for Real-time Rendering of Neural Radiance Fields [C++]
- [Arxiv] iMAP: Implicit Mapping and Positioning in Real-Time
- [CVPR2021] Monte Carlo Scene Search for 3D Scene Understanding
- [CVPR2021] Holistic 3D Scene Understanding from a Single Image with Implicit Representation
- [CVPR2021] RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction [pytorch]
- [Arxiv] IBRNet: Learning Multi-View Image-Based Rendering [Project]
- [Arxiv] STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering [Project]
Before 2021
- [ToG2018] Deep convolutional priors for indoor scene synthesis [github]
- [Arxiv] MO-LTR: Multiple Object Localization, Tracking and Reconstruction from Monocular RGB Videos
- [Arxiv] DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors
- [3DV2020] Scene Flow from Point Clouds with or without Learning
- [Arxiv] Stable View Synthesis
- [Arxiv] Neural Scene Graphs for Dynamic Scenes
- [3DV2020] RidgeSfM: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty [pytorch]
- [Arxiv] FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation
- [Arxiv] MoNet: Motion-based Point Cloud Prediction Network
- [Arxiv] MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
- [Arxiv] Efficient Initial Pose-graph Generation for Global SfM
- [Arxiv] Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes [Project]
- [Arxiv] RGBD-Net: Predicting color and depth images for novel views synthesis
- [Arxiv] SSCNav: Confidence-Aware Semantic Scene Completion for Visual Semantic Navigation [Project]
- [Arxiv] From Points to Multi-Object 3D Reconstruction
- [Arxiv] Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image [Project]
- [Arxiv] SceneFormer: Indoor Scene Generation with Transformers [pytorch]
- [NeurIPS2020] Neural Sparse Voxel Fields [Project]
- [Arxiv] Towards Part-Based Understanding of RGB-D Scans
- [Arxiv] Dynamic Plane Convolutional Occupancy Networks
- [NeurIPS2020] Neural Unsigned Distance Fields for Implicit Function Learning [Project]
- [Arxiv] Holistic static and animated 3D scene generation from diverse text descriptions [pytorch]
- [Arxiv] Semi-Supervised Learning of Multi-Object 3D Scene Representations
- [ECCV2020] CAD-Deform: Deformable Fitting of CAD Models to 3D Scans
- [ECCV2020] Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve
- [ECCV2020] Learnable Cost Volume Using the Cayley Representation
- [ECCV2020] Topology-Change-Aware Volumetric Fusion for Dynamic Scene Reconstruction
- [ECCV2020] Convolutional Occupancy Networks
- [CVPR2020] MARMVS: Matching Ambiguity Reduced Multiple View Stereo for Efficient Large Scale Scene Reconstruction
- [ECCV2020] CoReNet: Coherent 3D scene reconstruction from a single RGB image
- [CVPR2020] DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes
- [ECCV2020] SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
- [Arxiv] Removing Dynamic Objects for Static Scene Reconstruction using Light Fields
- [Arxiv] Atlas: End-to-End 3D Scene Reconstruction from Posed Images
- [Arxiv] Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes
- [Arxiv] Plane Pair Matching for Efficient 3D View Registration
- [CVPR2020] Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image [pytorch]
- [Arxiv] Indoor Layout Estimation by 2D LiDAR and Camera Fusion
- [Arxiv] General 3D Room Layout from a Single View by Render-and-Compare
- [ICCV2019] Learning to Reconstruct 3D Manhattan Wireframes from a Single Image
- [CVPR2019] PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image [pytorch]
🔥 - [ICCV2019] 3D Scene Reconstruction with Multi-layer Depth and Epipolar Transformers
- [ICCV Workshop2019] Silhouette-Assisted 3D Object Instance Reconstruction from a Cluttered Scene
- [ICCV2019] 3D-RelNet: Joint Object and Relation Network for 3D prediction [pytorch]
- [3DV2019] Pano Popups: Indoor 3D Reconstruction with a Plane-Aware Network
- [CVPR2018] Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene [pytorch]
- [IROS2017] Indoor Scan2BIM: Building Information Models of House Interiors
- [CVPR2017] 3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions [github]
NeRF
- [CVPR2023] Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container [Project]
- [Arxiv] CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout
- [Arxiv] LERF: Language Embedded Radiance Fields [Project]
- [CVPR2023] Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision
- [CVPR2023] HelixSurf: A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes with Iterative Intertwined Regularization [github]
- [Arxiv] BakedSDF: Meshing Neural SDFs for Real-Time View Synthesis [Project]
- [Arxiv] NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion [Project]
- [Arxiv] HR-NeuS: Recovering High-Frequency Surface Geometry via Neural Implicit Surfaces
- [Arxiv] 3D-aware Blending with Generative NeRFs [Project]
- [Arxiv] Factor Fields: A Unified Framework for Neural Fields and Beyond
- [Arxiv] Removing Objects From Neural Radiance Fields
- [Arxiv] Interactive Segmentation of Radiance Fields [Project]
- [Arxiv] Robust Dynamic Radiance Fields [Project]
- [Arxiv] NeRF-Art: Text-Driven Neural Radiance Fields Stylization [Project]
- [Arxiv] 4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions [Project]
- [Arxiv] EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points
- [Arxiv] SSDNeRF: Semantic Soft Decomposition of Neural Radiance Fields [Project]
- [Arxiv] NeRFEditor: Differentiable Style Decomposition for Full 3D Scene Editing [Project]
- [Arxiv] Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields [Project]
- [WACV2023] ScanNeRF: a Scalable Benchmark for Neural Radiance Fields [Project]
- [Arxiv] LaTeRF: Label and Text Driven Object Radiance Fields
- [Arxiv] Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields
- [CVPR2022] RigNeRF: Fully Controllable Neural 3D Portraits [Project]
- [Arxiv] Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
- [Arxiv] D2NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from a Monocular Video [Project]
- [Arxiv] Artemis: Articulated Neural Pets with Appearance and Motion synthesis [Project]
- [Arxiv] KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints [Project]
- [Arxiv] Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation
- [Arxiv] PVSeRF: Joint Pixel-, Voxel- and Surface-Aligned Radiance Field for Single-Image Novel View Synthesis
- [Arxiv] Block-NeRF: Scalable Large Scene Neural View Synthesis [Project]
- [Arxiv] Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation
- [Arxiv] NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes [Project]
- [Arxiv] HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video [github]
- [Arxiv] NeROIC: Neural Rendering of Objects from Online Image Collections [Project]
- [Arxiv] DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering
- [Arxiv] InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering [Project]
Before 2022
- [Arxiv] Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs [Project]
- [Arxiv] Light Field Neural Rendering [Project]
- [Arxiv] CG-NeRF: Conditional Generative Neural Radiance Fields
- [Arxiv] Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields [Project]
- [Arxiv] MoFaNeRF: Morphable Facial Neural Radiance Field
- [Arxiv] Dense Depth Priors for Neural Radiance Fields from Sparse Input Views
- [Arxiv] NeRF-SR: High-Quality Neural Radiance Fields using Super-Sampling [Project]
- [Arxiv] RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs [Project]
- [Arxiv] NeRFReN: Neural Radiance Fields with Reflections [Project]
- [Arxiv] NeuSample: Neural Sample Field for Efficient View Synthesis [Project]
- [Arxiv] Urban Radiance Fields [Project]
- [Arxiv] GeoNeRF: Generalizing NeRF with Geometry Priors [Project]
- [Arxiv] NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images [Project]
- [Arxiv] VaxNeRF: Revisiting the Classic for Voxel-Accelerated Neural Radiance Field [github]
- [Arxiv] Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction [github]
- [Arxiv] LOLNeRF: Learn from One Look
- [Arxiv] Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [Project]
- [NeurIPS2021] Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [github]
- [Arxiv] PERF: Performant, Explicit Radiance Fields
- [Arxiv] Plenoxels: Radiance Fields without Neural Networks [Project]
- [NeurIPS2021] Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering [Project]
- [ICCV2021] CodeNeRF: Disentangled Neural Radiance Fields for Object Categories [github]
- [ICCV2021] Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering [Project]
- [ICCV2021] Differentiable Surface Rendering via Non-Differentiable Sampling
- [ICCV2021] Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis [Project]
- [Arxiv] Fast and Explicit Neural View Synthesis
- [Arxiv] Depth-supervised NeRF: Fewer Views and Faster Training for Free [Project] [pytorch]
- [Arxiv] A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields [Project]
- [Arxiv] NeRF in detail: Learning to sample for view synthesis
- [Arxiv] NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown Illumination [Project]
- [Arxiv] Neural Trajectory Fields for Dynamic Novel View Synthesis
- [Arxiv] Editing Conditional Radiance Fields [Project]
- [CVPR2021] Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes
- [Arxiv] GNeRF: GAN-based Neural Radiance Field without Posed Camera
- [Arxiv] BARF: Bundle-Adjusting Neural Radiance Fields [Project]
- [Arxiv] MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo
- [CVPR2021] Neural Lumigraph Rendering [Project]
- [Arxiv] Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields
- [Arxiv] KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
- [Arxiv] FastNeRF: High-Fidelity Neural Rendering at 200FPS
- [CVPR2021] NeX: Real-time View Synthesis with Neural Basis Expansion [Project]
- [Arxiv] DONeRF: Towards Real-Time Rendering of Neural Radiance Fields using Depth Oracle Networks [Project]
- [Arxiv] NeRF--: Neural Radiance Fields Without Known Camera Parameters [Project]
Before 2021
- [Arxiv] pixelNeRF: Neural Radiance Fields from One or Few Images [Project]
- [Arxiv] NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis [Project]
- [Arxiv] Neural Radiance Flow for 4D View Synthesis and Video Processing [Project]
- [Arxiv] Deformable Neural Radiance Fields [Project]
- [Arxiv] DeRF: Decomposed Radiance Fields
- [Arxiv] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
About Human Body
- [Arxiv] NPC: Neural Point Characters from Video [Project]
- [Arxiv] Normal-guided Garment UV Prediction for Human Re-texturing
- [Arxiv] Sketch2Cloth: Sketch-based 3D Garment Generation with Unsigned Distance Fields
- [Arxiv] PointAvatar: Deformable Point-based Head Avatars from Videos [Project]
- [Arxiv] PhoMoH: Implicit Photorealistic 3D Models of Human Heads
- [Arxiv] 3DHumanGAN: Towards Photo-Realistic 3D-Aware Human Image Generation
- [Arxiv] Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion [Project]
- [Arxiv] Generating Holistic 3D Human Motion from Speech [Project]
- [Arxiv] MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis [Project]
- [Arxiv] RANA: Relightable Articulated Neural Avatars [Project]
- [Arxiv] Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance Generation
- [Arxiv] One-shot Implicit Animatable Avatars with Model-based Priors [Project]
- [Arxiv] PhysDiff: Physics-Guided Human Motion Diffusion Model [Project]
- [Arxiv] Instant Volumetric Head Avatars [Project]
- [Arxiv] EVA3D: Compositional 3D Human Generation from 2D Image Collections [Project]
- [ECCV2022] Compositional Human-Scene Interaction Synthesis with Semantic Control [Project]
- [ECCV2022] Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis [Project]
- [CVPR2022] Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing [Project]
- [ECCV2022] DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras [Project]
- [CVPR2022] Capturing and Inferring Dense Full-Body Human-Scene Contact [Project]
- [Arxiv] Realistic One-shot Mesh-based Head Avatars [Project]
- [CVPR2022] SmartPortraits: Depth Powered Handheld Smartphone Dataset of Human Portraits for State Estimation, Reconstruction and Synthesis
- [Arxiv] DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image [Project]
- [CVPR2022] Structured Local Radiance Fields for Human Avatar Modeling
- [CVPR2022] ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations
- [Arxiv] AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling [Project]
Before 2022
- [Arxiv] The Wanderings of Odysseus in 3D Scenes [Project]
- [Arxiv] Putting People in their Place: Monocular Regression of 3D People in Depth [github]
- [Arxiv] Tracking People by Predicting 3D Appearance, Location & Pose [Project]
- [Arxiv] Adversarial Parametric Pose Prior
- [NeurIPS2021] Garment4D: Garment Reconstruction from Point Cloud Sequences [Project]
- [Arxiv] MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image [github]
- [Arxiv] Total Scale: Face-to-Body Detail Reconstruction from Sparse RGBD Sensors
- [Arxiv] GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras [Project]
- [3DV2021] LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [Project]
- [Arxiv] A Lightweight Graph Transformer Network for Human Mesh Reconstruction from 2D Human Pose
- [Arxiv] MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation [github]
- [Arxiv] Multi-Person 3D Motion Prediction with Multi-Range Transformers [Project]
- [Arxiv] DD-NeRF: Double-Diffusion Neural Radiance Field as a Generalizable Implicit Body Representation
- [Arxiv] Creating and Reenacting Controllable 3D Humans with Differentiable Rendering
- [Arxiv] Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation
- [BMVC2021] AniFormer: Data-driven 3D Animation with Transformer [Project]
- [ACMMM2021] VoteHMR: Occlusion-Aware Voting Network for Robust 3D Human Mesh Recovery from Partial Point Clouds
- [Arxiv] Playing for 3D Human Recovery [Project]
- [ICCV2021] Learning to Regress Bodies from Images using Differentiable Semantic Rendering [Project]
- [Arxiv] ICON: Implicit Clothed humans Obtained from Normals [github]
- [ICCV2021] Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild [Project]
- [Arxiv] SPEC: Seeing People in the Wild with an Estimated Camera [Project]
- [NeurIPS2021] Tracking People with 3D Representations [github]
- [Arxiv] A Skeleton-Driven Neural Occupancy Representation for Articulated Hands
- [Arxiv] GraFormer: Graph Convolution Transformer for 3D Pose Estimation [github]
- [ICCV2021] Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
- [ICCV2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation [github]
- [ICCV2021] 3D Human Texture Estimation from a Single Image with Transformers
- [ICCV2021] DensePose 3D: Lifting Canonical Surface Maps of Articulated Objects to the Third Dimension
- [Arxiv] SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes [Project]
- [ICCV2021] Probabilistic Modeling for Human Mesh Recovery [Project]
- [ICCV2021] Unsupervised Dense Deformation Embedding Network for Template-Free Shape Correspondence
- [ACMMM2021] DC-GNet: Deep Mesh Relation Capturing Graph Convolution Network for 3D Human Shape Reconstruction
- [SiggraphAsia2019] Neural State Machine for Character-Scene Interactions [github]
- [ICCV2021] Learning Motion Priors for 4D Human Body Capture in 3D Scenes [Project]
- [Arxiv] Deep Virtual Markers for Articulated 3D Shapes
- [ICCV2021] Gravity-Aware Monocular 3D Human-Object Reconstruction [Project]
- [ICCV2021] Learning Anchored Unsigned Distance Functions with Gradient Direction Alignment for Single-view Garment Reconstruction
- [Arxiv] D3D-HOI: Dynamic 3D Human-Object Interactions from Videos [github]
- [ICCV2021] Stochastic Scene-Aware Motion Prediction [Project] [github]
- [ICCV2021] ARCH++: Animation-Ready Clothed Human Reconstruction Revisited
- [ICCV2021] EventHPE: Event-based 3D Human Pose and Shape Estimation
- [ACMMM2021] Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition [github]
- [ACMMM2021] Skeleton-Contrastive 3D Action Representation Learning [github]
- [Arxiv] Learning Local Recurrent Models for Human Mesh Recovery
- [Arxiv] H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction [Project]
- [Arxiv] Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds [github]
- [Arxiv] MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images [Project]
- [Arxiv] Deep3DPose: Realtime Reconstruction of Arbitrarily Posed Human Bodies from Single RGB Images
- [Arxiv] THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers
- [Arxiv] Bridge the Gap Between Model-based and Model-free Human Reconstruction
- [Arxiv] Neural Actor: Neural Free-view Synthesis of Human Actors with Pose Control
- [Arxiv] Scene-aware Generative Network for Human Motion Synthesis
- [Arxiv] Human Motion Prediction Using Manifold-Aware Wasserstein GAN
- [CVPR2021] Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors [Project]
- [Arxiv] TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [Project]
- [CVPR2021] We are More than Our Joints: Predicting how 3D Bodies Move [Project]
- [CVPR2021] LEAP: Learning Articulated Occupancy of People [Project]
- [Arxiv] 3DCrowdNet: 2D Human Pose-Guided 3D Crowd Human Pose and Shape Estimation in the Wild
- [CVPR2021] SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements [Project]
- [Arxiv] Action-Conditioned 3D Human Motion Synthesis with Transformer VAE [Project]
- [Arxiv] Dynamic Surface Function Networks for Clothed Human Bodies [github]
- [Arxiv] Neural Articulated Radiance Field [github]
- [Arxiv] Mesh Graphormer
- [CVPR2021] SimPoE: Simulated Character Control for 3D Human Pose Estimation [Project]
- [Arxiv] TRAJEVAE - Controllable Human Motion Generation from Trajectories [Project]
- [CVPR2021] Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors [Project]
- [CVPR2021] Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction [Project]
- [CVPR2021] Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction [github]
- [Arxiv] Probabilistic 3D Human Shape and Pose Estimation from Multiple Unconstrained Images in the Wild
- [Arxiv] 3D Human Pose Estimation with Spatial and Temporal Transformers [pytorch]
- [CVPR2021] Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks
- [Arxiv] DanceNet3D: Music Based Dance Generation with Parametric Motion Transformer
- [Arxiv] Aggregated Multi-GANs for Controlled 3D Human Motion Prediction [Project]
- [AAAI2021] PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos
- [Arxiv] NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras
- [CVPR2021] SMPLicit: Topology-aware Generative Model for Clothed People [Project]
- [CVPR2021] HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation [pytorch]
- [Arxiv] Single-Shot Motion Completion with Transformer [Project]
- [EG2021] Walk2Map: Extracting Floor Plans from Indoor Walk Trajectories
- [Arxiv] Forecasting Characteristic 3D Poses of Human Actions
- [Arxiv] Capturing Detailed Deformations of Moving Human Bodies
- [Arxiv] A-NeRF: Surface-free Human 3D Pose Refinement via Neural Rendering [Project]
- [Arxiv] Learn to Dance with AIST++: Music Conditioned 3D Dance Generation [Project]
- [Arxiv] S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
- [Arxiv] PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation
- [Arxiv] Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans [Project]
- [Arxiv] Chasing the Tail in Monocular 3D Human Reconstruction with Prototype Memory
- [3DV2020] PLACE: Proximity Learning of Articulation and Contact in 3D Environments [Project]
- [ICCV2019] Resolving 3D Human Pose Ambiguities with 3D Scene Constraints [Project]
Before 2021
- [ICCV2021] Monocular, One-stage, Regression of Multiple 3D People [github]
- [ECCV2020] History Repeats Itself: Human Motion Prediction via Motion Attention [pytorch]
- [ECCV2020] 3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning [Project]
- [Arxiv] Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes [Project]
- [Arxiv] End-to-End Human Pose and Mesh Reconstruction with Transformers
- [Arxiv] Human Mesh Recovery from Multiple Shots [Project]
- [NeurIPS2020] 3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data [Project]
- [Arxiv] Holistic 3D Human and Scene Mesh Estimation from Single View Images
- [Arxiv] Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video
- [Arxiv] Pose2Pose: 3D Positional Pose-Guided 3D Rotational Pose Prediction for Expressive 3D Human Pose and Mesh Estimation
- [Arxiv] NeuralAnnot: Neural Annotator for in-the-wild Expressive 3D Human Pose and Mesh Training Sets
- [Arxiv] 4D Human Body Capture from Egocentric Video via 3D Scene Grounding [Project]
- [Arxiv] Populating 3D Scenes by Learning Human-Scene Interaction [Project]
- [ECCV2020] Long-term Human Motion Prediction with Scene Context [Project]
- [Arxiv] Vid2Actor: Free-viewpoint Animatable Person Synthesis from Video in the Wild [Project]
- [Arxiv] ANR: Articulated Neural Rendering for Virtual Avatars
- [Arxiv] Generating 3D People in Scenes without People [Project]
- [ICCV2019] Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense
- [CVPR2019] Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments [Project]
- [TOG2016] PiGraphs: Learning Interaction Snapshots from Observations [Project]
General Methods
- [CVPR2023] Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers [github]
- [Arxiv] HexPlane: A Fast Representation for Dynamic Scenes [Project]
- [Arxiv] Joint Representation Learning for Text and 3D Point Cloud
- [Arxiv] Ponder: Point Cloud Pre-training via Neural Rendering
- [Arxiv] 3D Point Cloud Pre-training with Knowledge Distillation from 2D Images
- [Arxiv] Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? [Project]
- [Arxiv] Attentive Mask CLIP
- [Arxiv] Synthetic-to-Real Domain Generalized Semantic Segmentation for 3D Indoor Point Clouds
- [Arxiv] Frozen CLIP Model is Efficient Point Cloud Backbone
- [Arxiv] Continuous diffusion for categorical data
- [Arxiv] EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
- [Arxiv] Neural Density-Distance Fields [Project]
- [Arxiv] Understanding Masked Image Modeling via Learning Occlusion Invariant Feature
- [Arxiv] Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer [Project]
- [Arxiv] Masked Surfel Prediction for Self-Supervised Point Cloud Learning [github]
- [Arxiv] Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training [github]
- [Arxiv] 3D-Aware Video Generation [Project]
- [Arxiv] Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space [Project]
- [Arxiv] Masked Frequency Modeling for Self-Supervised Visual Pre-Training [Project]
- [Arxiv] GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds [Project]
- [Arxiv] Diffusion Models for Video Prediction and Infilling [Project]
- [Arxiv] MaskViT: Masked Visual Pre-Training for Video Prediction [Project]
- [Arxiv] Random Walks for Adversarial Meshes
- [ICLR2022] Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework [github]
- [CVPR2022] Rethinking Semantic Segmentation: A Prototype View [github]
- [Arxiv] How to Understand Masked Autoencoders
- [ICLR2022] QuadTree Attention for Vision Transformers [github]
- [Arxiv] Contrastive Neighborhood Alignment
Before 2022
- [Arxiv] Domain Adaptation on Point Clouds via Geometry-Aware Implicits
- [ICCV2021] Progressive Seed Generation Auto-encoder for Unsupervised Point Cloud Learning
- [Arxiv] Variance-Aware Weight Initialization for Point Convolutional Neural Networks
- [Arxiv] Learning to Detect Every Thing in an Open World [Project]
- [Arxiv] Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling [Project]
- [Arxiv] CpT: Convolutional Point Transformer for 3D Point Cloud Processing
- [Arxiv] Swin Transformer V2: Scaling Up Capacity and Resolution [github]
- [Arxiv] TransMix: Attend to Mix for Vision Transformers [github]
- [Arxiv] Self-supervised GAN Detector [github]
- [NeurIPS2021] Residual Relaxation for Multi-view Representation Learning
- [ICCV2021] Video Autoencoder: self-supervised disentanglement of static 3D structure and motion [Project]
- [NeurIPS2021] SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization [Project]
- [Arxiv] Efficient Geometry-aware 3D Generative Adversarial Networks [Project]
- [Arxiv] Self-attention Does Not Need $O(n^2)$ Memory
- [Arxiv] CAP-Net: Correspondence-Aware Point-view Fusion Network for 3D Shape Analysis
- [Arxiv] PointMixer: MLP-Mixer for Point Cloud Understanding
- [NeurIPS2021] Blending Anti-Aliasing into Vision Transformer
- [ICCV2021] Learning Inner-Group Relations on Point Clouds
- [Arxiv] Point-Voxel Transformer: An Efficient Approach To 3D Deep Learning
- [Siggraph2021] SP-GAN: Sphere-Guided 3D Shape Generation and Manipulation [Project] [github]
- [ICCV2021] GraphFPN: Graph Feature Pyramid Network for Object Detection
- [Arxiv] CKConv: Learning Feature Voxelization for Point Cloud Analysis
- [ICCV2021] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers [pytorch]
- [Arxiv] Volume Rendering of Neural Implicit Surfaces
- [CVPR2021] Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations
- [Arxiv] DeepMesh: Differentiable Iso-Surface Extraction
- [Arxiv] Neural Marching Cubes
- [Arxiv] Geometry-Consistent Neural Shape Representation with Implicit Displacement Fields
- [Arxiv] Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering
- [ICML2021] Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline [pytorch]
- [Arxiv] Deep Medial Fields
- [Arxiv] Subdivision-Based Mesh Convolution Networks [Jittor]
- [Arxiv] VA-GCN: A Vector Attention Graph Convolution Network for learning on Point Clouds [pytorch]
- [Arxiv] Aggregating Nested Transformers
- [Arxiv] Rethinking the Design Principles of Robust Vision Transformer [pytorch]
- [Siggraph2021] Acorn: Adaptive Coordinate Networks for Neural Scene Representation
- [Arxiv] Walk in the Cloud: Learning Curves for Point Clouds Shape Analysis [Project]
- [Arxiv] Pay Attention to MLPs
- [Arxiv] ResMLP: Feedforward networks for image classification with data-efficient training
- [Arxiv] RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition
- [Arxiv] MLP-Mixer: An all-MLP Architecture for Vision
- [Arxiv] Vector Neurons: A General Framework for SO(3)-Equivariant Networks
- [CVPR2021] MongeNet: Efficient Sampler for Geometric Deep Learning [Project]
- [Arxiv] Point Cloud Learning with Transformer
- [Arxiv] Dual Transformer for Point Cloud Analysis
- [Arxiv] AttWalk: Attentive Cross-Walks for Deep Mesh Analysis
- [Arxiv] Learning from 2D: Pixel-to-Point Knowledge Transfer for 3D Pretraining
- [Arxiv] Field Convolutions for Surface CNNs
- [Arxiv] Rethinking Spatial Dimensions of Vision Transformers [pytorch] 🔥
- [CVPR2021] PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds [pytorch]
- [Arxiv] Concentric Spherical GNN for 3D Representation Learning
- [Arxiv] High-Performance Large-Scale Image Recognition Without Normalization
- [Arxiv] Generative Models as Distributions of Functions
- [Arxiv] Point-set Distances for Learning Representations of 3D Point Clouds
- [Arxiv] Compressed Object Detection
- [Arxiv] A linearized framework and a new benchmark for model selection for fine-tuning
- [Arxiv] The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions
- [Arxiv] Self-Supervised Pretraining of 3D Features on any Point-Cloud [pytorch]
- [3DV2020] Learning Rotation-Invariant Representations of Point Clouds Using Aligned Edge Convolutional Neural Networks
Before 2021
- [ICCV2019] Efficient Learning on Point Clouds with Basis Point Sets [pytorch]
- [CVPR2019] On the Continuity of Rotation Representations in Neural Networks [pytorch]
- [Arxiv] Diffusion is All You Need for Learning on Surfaces
- [Arxiv] SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization
- [3DV2020] Rotation-Invariant Point Convolution With Multiple Equivariant Alignments
- [Arxiv] One Point is All You Need: Directional Attention Point for Feature Learning
- [Arxiv] PCT: Point Cloud Transformer
- [Arxiv] Hausdorff Point Convolution with Geometric Priors
- [Arxiv] MARNet: Multi-Abstraction Refinement Network for 3D Point Cloud Analysis [Github]
- [Arxiv] Point Transformer
- [Arxiv] Learning geometry-image representation for 3D point cloud generation
- [Arxiv] Deeper or Wider Networks of Point Clouds with Self-attention?
- [NeurIPS2020] Primal-Dual Mesh Convolutional Neural Networks [pytorch]
- [NeurIPS2020] Rational neural networks [tensorflow]
- [NeurIPS2020] Exchangeable Neural ODE for Set Modeling [Project]
- [NeurIPS2020] SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks [Project]
- [NeurIPS2020] NVAE: A Deep Hierarchical Variational Autoencoder [pytorch]
- [NeurIPS2020] Implicit Graph Neural Networks [pytorch]
- [NeurIPS2020] The Autoencoding Variational Autoencoder [pytorch]
- [Arxiv] PointManifold: Using Manifold Learning for Point Cloud Classification
- [Arxiv] RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder
- [Arxiv] Pre-Training by Completing Point Clouds [pytorch]
- [NeurIPS2020] Rotation-Invariant Local-to-Global Representation Learning for 3D Point Cloud
- [Arxiv] IF-Defense: 3D Adversarial Point Cloud Defense via Implicit Function based Restoration [pytorch]
- [Arxiv] DV-ConvNet: Fully Convolutional Deep Learning on Point Clouds with Dynamic Voxelization and 3D Group Convolution
- [Arxiv] Spatial Transformer Point Convolution
- [Arxiv] Minimal Adversarial Examples for Deep Learning on 3D Point Clouds
- [BMVC2020] Black Magic in Deep Learning: How Human Skill Impacts Network Training
- [ECCV2020] PointMixup: Augmentation for Point Clouds [Code]
- [ECCV2020] DR-KFS: A Differentiable Visual Similarity Metric for 3D Shape Reconstruction
- [Arxiv] Unsupervised 3D Learning for Shape Analysis via Multiresolution Instance Discrimination
- [Arxiv] Global Context Aware Convolutions for 3D Point Cloud Understanding
- [ECCV2020] Shape Adaptor: A Learnable Resizing Module [pytorch]
- [ACMMM2020] Differentiable Manifold Reconstruction for Point Cloud Denoising [pytorch]
- [ECCV2020] Discrete Point Flow Networks for Efficient Point Cloud Generation
- [Siggraph2020] Neural Subdivision
- [Arxiv] PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
- [Arxiv] Accelerating 3D Deep Learning with PyTorch3D
- [Arxiv] Natural Graph Networks
- [ECCV2020] Progressive Point Cloud Deconvolution Generation Network [github]
- [Arxiv] Point Set Voting for Partial Point Cloud Analysis
- [Arxiv] PointMask: Towards Interpretable and Bias-Resilient Point Cloud Processing
- [Arxiv] Fully Convolutional Mesh Autoencoder using Efficient Spatially Varying Kernels
- [Arxiv] A Closer Look at Local Aggregation Operators in Point Cloud Analysis [github]
- [NeurIPS2020] Implicit Neural Representations with Periodic Activation Functions [pytorch] 🔥
- [Arxiv] Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks
- [Arxiv] Local-Area-Learning Network: Meaningful Local Areas for Efficient Point Cloud Analysis
- [Arxiv] TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations
- [Arxiv] MeshWalker: Deep Mesh Understanding by Random Walks
- [Arxiv] MOPS-Net: A Matrix Optimization-driven Network for Task-Oriented 3D Point Cloud Downsampling
- [Arxiv] DPDist: Comparing Point Clouds Using Deep Point Cloud Distance
- [CVPR2020] PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling
- [AAAI2020] Shape-Oriented Convolution Neural Network for Point Cloud Analysis
- [Arxiv] Joint Supervised and Self-Supervised Learning for 3D Real-World Challenges
- [Arxiv] LightConvPoint: Convolution for Points [pytorch]
- [Arxiv] Variational Auto-Decoder [pytorch]
- [Arxiv] Generative PointNet: Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification
- [CVPR2020] DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes [pytorch]
- [CVPR2020] RPM-Net: Robust Point Matching using Learned Features [github]
- [CVPR2020] Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
- [CVPR2020] PointGMM: a Neural GMM Network for Point Clouds
- [Arxiv] Dynamic ReLU
- [CVPR2020] SampleNet: Differentiable Point Cloud Sampling [pytorch]
- [Arxiv] Defense-PointNet: Protecting PointNet Against Adversarial Attacks
- [CVPR2020] FPConv: Learning Local Flattening for Point Convolution [pytorch]
- [SIGGRAPH2019] MeshCNN: A Network with an Edge [pytorch] 🔥 ⭐
- [ICCV2019] Total Denoising: Unsupervised Learning of 3D Point Cloud Cleaning [tensorflow]
- [ICCV2019] PU-GAN: a Point Cloud Upsampling Adversarial Network 🔥
- [CVPR2019] Relation-Shape Convolutional Neural Network for Point Cloud Analysis [pytorch] 🔥
- [CVPR2019] Patch-based Progressive 3D Point Set Upsampling [tensorflow] [pytorch] 🔥
- [TOG2019] Dynamic Graph CNN for Learning on Point Clouds [Project] 🔥 ⭐
- [ECCV2018] EC-Net: an Edge-aware Point set Consolidation Network [project page]
- [CVPR2018] PU-Net: Point Cloud Upsampling Network ⭐ 🔥
- [Arxiv] PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
- [ICLR2017] Deep Learning with Sets and Point Clouds
- [NeurIPS2017] Deep Sets
- [Siggraph2006] Designing with Distance Fields
Others (inc. Networks in Classification, Matching, Registration, Alignment, Depth, Normal, Pose, Keypoints, etc.)
- [Arxiv] Temporally Consistent Online Depth Estimation Using Point-Based Fusion [Project]
- [CVPR2023] Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [Project]
- [Arxiv] Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models [github]
- [Arxiv] Pix2Video: Video Editing using Image Diffusion [Project]
- [Arxiv] Cross-domain Compositing with Pretrained Diffusion Models [Project]
- [Arxiv] 3D-aware Conditional Image Synthesis [Project]
- [CVPR2022] Focal Length and Object Pose Estimation via Render and Compare [github]
- [CVPR2022] Kubric: A scalable dataset generator
Before 2022
- [Arxiv] Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation
- [Arxiv] Toward Practical Self-Supervised Monocular Indoor Depth Estimation
- [Arxiv] PartImageNet: A Large, High-Quality Dataset of Parts [github]
- [Arxiv] AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions
- [Arxiv] Benchmarking Detection Transfer Learning with Vision Transformers
- [Arxiv] Panoptic Segmentation: A Review [github]
- [NeurIPS2021] Sparse Steerable Convolutions: An Efficient Learning of SE(3)-Equivariant Features for Estimation and Tracking of Object Poses in 3D Space [github]
- [Arxiv] Attention Mechanisms in Computer Vision: A Survey
- [Arxiv] Leveraging Geometry for Shape Estimation from a Single RGB Image [github]
- [Arxiv] Deep Point Set Resampling via Gradient Fields [github]
- [Arxiv] Efficient 3D Deep LiDAR Odometry [github]
- [NeurIPS2021] 3DP3: 3D Scene Perception via Probabilistic Programming
- [NeurIPS2021] CoFiNet: Reliable Coarse-to-fine Correspondences for Robust Point Cloud Registration [github]
- [BMVC2021] Cascading Feature Extraction for Fast Point Cloud Registration
- [Arxiv] Pseudo Supervised Monocular Depth Estimation with Teacher-Student Network
- [BMVC2021] Multi-Stream Attention Learning for Monocular Vehicle Velocity and Inter-Vehicle Distance Estimation
- [Arxiv] Occlusion-Robust Object Pose Estimation with Holistic Representation [github]
- [BMVC2021] Depth-only Object Tracking
- [3DV2021] Self-Supervised Monocular Scene Decomposition and Depth Estimation
- [Arxiv] Deep Point Cloud Normal Estimation via Triplet Learning
- [3DV2021] Attention meets Geometry: Geometry Guided Spatial-Temporal Attention for Consistent Self-Supervised Monocular Depth Estimation
- [CORL2021] LENS: Localization enhanced by NeRF synthesis
- [3DV2021] PLNet: Plane and Line Priors for Unsupervised Indoor Depth Estimation [github]
- [Arxiv] Unsupervised Pose-Aware Part Decomposition for 3D Articulated Objects
- [ICCV2021] PCAM: Product of Cross-Attention Matrices for Rigid Registration of Point Clouds [Project]
- [ICCV2021] Excavating the Potential Capacity of Self-Supervised Monocular Depth Estimation
- [ICCV2021] StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation
- [IROS2021] KDFNet: Learning Keypoint Distance Field for 6D Object Pose Estimation
- [ICCV2021] Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation [github]
- [Arxiv] Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation [Project]
- [ICCV2021] Deep Hough Voting for Robust Global Registration
- [Arxiv] You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors [Project]
- [ICCV2021] A Robust Loss for Point Cloud Registration
- [Arxiv] Geometry-Aware Self-Training for Unsupervised Domain Adaptation on Object Point Clouds
- [IROS2021] Category-Level 6D Object Pose Estimation via Cascaded Relation and Recurrent Reconstruction Networks [Project] [github]
- [ICCV2021] StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation [github]
- [ICCV2021] SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation
- [ICCV2021] Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation
- [ICCV2021] AdaFit: Rethinking Learning-based Normal Estimation on Point Clouds [Project]
- [Arxiv] DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes
- [ICCV2021] Towards Interpretable Deep Networks for Monocular Depth Estimation [github]
- [Arxiv] UPDesc: Unsupervised Point Descriptor Learning for Robust Registration
- [IROS2021] BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models [github]
- [Arxiv] RigNet: Repetitive Image Guided Network for Depth Completion
- [Arxiv] DCL: Differential Contrastive Learning for Geometry-Aware Depth Synthesis
- [ACMMM2021] BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation [Project] [github]
- [Arxiv] Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation
- [ICCV2021] HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration [Project] [pytorch]
- [Arxiv] Score-Based Point Cloud Denoising
- [Arxiv] HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor
- [Arxiv] Learn to Learn Metric Space for Few-Shot Segmentation of 3D Shapes
- [Arxiv] EdgeConv with Attention Module for Monocular Depth Estimation
- [ICML2021] Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold [Project]
- [ICRA2021] An Adaptive Framework For Learning Unsupervised Depth Completion [github] [github]
- [ICRA2021] TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction [github]
- [Siggraph2021] Orienting Point Clouds with Dipole Propagation
- [CVPR2021] The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth
- [Arxiv] Fully Convolutional Line Parsing [pytorch]
- [CVPR2021] Depth Completion using Plane-Residual Representation
- [Arxiv] Domain Adaptive Monocular Depth Estimation With Semantic Information
- [CVPR2021] Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries [github]
- [Arxiv] Local Metrics for Multi-Object Tracking
- [Arxiv] Full Surround Monodepth from Multiple Cameras
- [CVPR2021] RGB-D Local Implicit Function for Depth Completion of Transparent Objects [Project]
- [CVPR2021] Learning Camera Localization via Dense Scene Matching [pytorch]
- [Arxiv] LSG-CPD: Coherent Point Drift with Local Surface Geometry for Point Cloud Registration
- [ICRA2021] PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN
- [Arxiv] Learning Fine-Grained Segmentation of 3D Shapes without Part Labels
- [CVPR2021] Skeleton Merger: an Unsupervised Aligned Keypoint Detector
- [CVPR2021] Beyond Image to Depth: Improving Depth Prediction using Echoes
- [CVPR2021] FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [Project]
- [CVPR2021] Self-supervised Geometric Perception
- [Arxiv] StablePose: Learning 6D Object Poses from Geometrically Stable Patches
- [Arxiv] A Parameterised Quantum Circuit Approach to Point Set Matching
- [Arxiv] Adjoint Rigid Transform Network: Self-supervised Alignment of 3D Shapes
- [Arxiv] Video Transformer Network
- [ICLR2021] NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [pytorch]
- [Arxiv] NBDT: Neural-Backed Decision Tree [pytorch]
- [Arxiv] AdaBins: Depth Estimation using Adaptive Bins [pytorch]
- [Arxiv] Unsupervised Monocular Depth Reconstruction of Non-Rigid Scenes
- [Arxiv] CorrNet3D: Unsupervised End-to-end Learning of Dense Correspondence for 3D Point Clouds
Before 2021
- [NeurIPS2019] PRNet: Self-Supervised Learning for Partial-to-Partial Registration [pytorch]
- [Arxiv] iNeRF: Inverting Neural Radiance Fields for Pose Estimation [Project]
- [Arxiv] Boosting Monocular Depth Estimation with Lightweight 3D Point Fusion
- [Arxiv] 3D Registration for Self-Occluded Objects in Context
- [Arxiv] Continuous Surface Embeddings
- [Arxiv] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
- [Arxiv] MVTN: Multi-View Transformation Network for 3D Shape Recognition
- [Arxiv] PREDATOR: Registration of 3D Point Clouds with Low Overlap
- [Arxiv] Deep Magnification-Arbitrary Upsampling over 3D Point Clouds
- [Arxiv] Occlusion Guided Scene Flow Estimation on 3D Point Clouds
- [NeurIPS2020] An Analysis of SVD for Deep Rotation Estimation
- [EG2020W] SHREC 2020 track: 6D object pose estimation
- [ACCV2020] Best Buddies Registration for Point Clouds
- [3DV] A New Distributional Ranking Loss With Uncertainty: Illustrated in Relative Depth Estimation
- [BMVC2020] View-consistent 4D Light Field Depth Estimation
- [BMVC2020] Neighbourhood-Insensitive Point Cloud Normal Estimation Network [Project]
- [ECCV2020] DeepGMR: Learning Latent Gaussian Mixture Models for Registration [Project]
- [ECCV2020] Motion Capture from Internet Videos [Project]
- [ECCV2020] Depth Completion with RGB Prior
- [ECCV2020] 6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference
- [Arxiv] Self-Supervised Learning of Point Clouds via Orientation Estimation
- [SIGGRAPH2020] SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images [Project]
- [ECCV2020] Learning Stereo from Single Images [github]
- [Arxiv] Learning Long-term Visual Dynamics with Region Proposal Interaction Networks [Project]
- [ECCV2020] Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes [Project]
- [ECCV2020] Unsupervised Shape and Pose Disentanglement for 3D Meshes
- [Arxiv] PVSNet: Pixelwise Visibility-Aware Multi-View Stereo Network
- [ECCV2020] P2Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation
- [CVPR2020] Learning multiview 3D point cloud registration [pytorch]
- [CVPR2020] Feature-metric Registration: A Fast Semi-supervised Approach for Robust Point Cloud Registration without Correspondences
- [Siggraph2020] Consistent Video Depth Estimation
- [Arxiv] Deep Feature-preserving Normal Estimation for Point Cloud Filtering
- [Arxiv] Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction
- [CVPR2020] Towards Better Generalization: Joint Depth-Pose Learning without PoseNet [pytorch]
- [Arxiv] Monocular Camera Localization in Prior LiDAR Maps with 2D-3D Line Correspondences
- [Arxiv] Adversarial Texture Optimization from RGB-D Scans
- [Arxiv] SAPIEN: A SimulAted Part-based Interactive ENvironment
- [CVPR2020] G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
- [Arxiv] On Localizing a Camera from a Single Image
- [Arxiv] DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares
- [CVPR2020] KFNet: Learning Temporal Camera Relocalization using Kalman Filtering
- [Arxiv] Neural Contours: Learning to Draw Lines from 3D Shapes
- [Arxiv] 3dDepthNet: Point Cloud Guided Depth Completion Network for Sparse Depth and Single Color Image
- [Arxiv] Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets
- [CVPR2020] End-to-End Learning Local Multi-view Descriptors for 3D Point Clouds
- [Arxiv] PnP-Net: A hybrid Perspective-n-Point Network
- [CVPR2020] MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision
- [CVPR2020] D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
- [ICIP2020] Triangle-Net: Towards Robustness in Point Cloud Classification
- [ICRA2020] Robust 6D Object Pose Estimation by Learning RGB-D Features
- [Arxiv] Predicting Sharp and Accurate Occlusion Boundaries in Monocular Depth Estimation Using Displacement Fields
- [Arxiv] Single Image Depth Estimation Trained via Depth from Defocus Cues [pytorch]
- [Arxiv] DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling
- [Arxiv] Target-less registration of point clouds: A review
- [Arxiv] Quaternion Equivariant Capsule Networks for 3D point clouds
- [Arxiv] Category-Level Articulated Object Pose Estimation
- [Arxiv] A Quantum Computational Approach to Correspondence Problems on Point Sets
- [Arxiv] DeepSFM: Structure From Motion Via Deep Bundle Adjustment
- [Arxiv] P2GNet: Pose-Guided Point Cloud Generating Networks for 6-DoF Object Pose Estimation
- [ICCV2019] Learning Local RGB-to-CAD Correspondences for Object Pose Estimation
- [ICCV2019] Joint Embedding of 3D Scan and CAD Objects [dataset]
- [ICLR2019] BA-Net: Dense Bundle Adjustment Networks [tensorflow]
- [ICCV2019] GP2C: Geometric Projection Parameter Consensus for Joint 3D Pose and Focal Length Estimation in the Wild
- [ICCV2019] Closed-Form Optimal Two-View Triangulation Based on Angular Errors
- [ICCV2019] Polarimetric Relative Pose Estimation
- [ICCV2019] End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans
- [ICCV2019] Deep Non-Rigid Structure from Motion
- [CVPR2019] On the Continuity of Rotation Representations in Neural Networks [pytorch]
- [Arxiv] Deep Interpretable Non-Rigid Structure from Motion [tensorflow]
- [Arxiv] IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks [dataset]
- [CVPR2019] Scan2CAD: Learning CAD Model Alignment in RGB-D Scans [pytorch] 🔥
- [3DV2019] Location Field Descriptors: Single Image 3D Model Retrieval in the Wild
- [CVPR2016] Marr Revisited: 2D-3D Alignment via Surface Normal Prediction [caffe]
Survey, Resources and Tools
- [Arxiv] Teaching CLIP to Count to Ten
- [Arxiv] ControlNet
- [Arxiv] T2I-Adapter
- [Arxiv] OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation [Project]
- [Arxiv] SDFStudio: A Unified Framework for Surface Reconstruction [Project]
- [Arxiv] Objaverse: A Universe of Annotated 3D Objects [Project]
- [Arxiv] Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild [Project]
- [NeurIPS2021] ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data [github]
- [Dataset] ReplicaCAD [Project]
- [PhDthesis] Synthesizing Photorealistic Images with Deep Generative Learning
- [ICCVW2021] V2X-Sim: A Virtual Collaborative Perception Dataset for Autonomous Driving [Project]
- [Arxiv] TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and Grasping [Project]
- [Arxiv] A Survey of Neural Trojan Attacks and Defenses in Deep Learning
- [Arxiv] Tiny Object Tracking: A Large-scale Dataset and A Baseline [github]
- [Arxiv] A survey of top-down approaches for human pose estimation
- [Arxiv] A Survey on RGB-D Datasets
- [Arxiv] Avoiding Overfitting: A Survey on Regularization Methods for Convolutional Neural Networks
Before 2022
- [Arxiv] iSeg3D: An Interactive 3D Shape Segmentation Tool
- [Arxiv] Benchmarking Pedestrian Odometry: The Brown Pedestrian Odometry Dataset (BPOD) [Project]
- [Arxiv] PandaSet: Advanced Sensor Suite Dataset for Autonomous Driving [Project]
- [Arxiv] Few-Shot Object Detection: A Survey
- [Arxiv] Paris-CARLA-3D: A Real and Synthetic Outdoor Point Cloud Dataset for Challenging Tasks in 3D Mapping [Project]
- [Arxiv] PyTorchVideo: A Deep Learning Library for Video Understanding [Project]
- [Arxiv] DIML/CVL RGB-D Dataset: 2M RGB-D Images of Natural Indoor and Outdoor Scenes [Project]
- [Arxiv] A Review on Human Pose Estimation
- [ICCV2021] BuildingNet: Learning to Label 3D Buildings [Project]
- [ICCV2021] Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans [Project]
- [Arxiv] Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI
- [Arxiv] MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis [Project]
- [Arxiv] UrbanScene3D: A Large Scale Urban Scene Dataset and Simulator [Project]
- [Arxiv] SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving [Project]
- [Arxiv] A Survey on Human-aware Robot Navigation
- [Arxiv] One Million Scenes for Autonomous Driving: ONCE Dataset [Project]
- [Arxiv] 3D Object Detection for Autonomous Driving: A Survey
- [Arxiv] The Oxford Road Boundaries Dataset
- [CVPR2021] 3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding
- [Arxiv] 3DB: A Framework for Debugging Computer Vision Models [github]
- [Arxiv] NViSII: A Scriptable Tool for Photorealistic Image Generation [github]
- [Dataset] Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
- [Survey] 3D Semantic Scene Completion: a Survey
- [Survey] Deep Learning based 3D Segmentation: A Survey
- [Survey] A comprehensive survey on point cloud registration
- [Survey] Domain Generalization: A Survey
- [Dataset] SUM: A Benchmark Dataset of Semantic Urban Meshes
- [Survey] Attention Models for Point Clouds in Deep Learning: A Survey
- [Benchmark] H3D: Benchmark on Semantic Segmentation of High-Resolution 3D Point Clouds and textured Meshes from UAV LiDAR and Multi-View-Stereo [Project]
- [Survey] Dynamic Neural Networks: A Survey
- [Survey] Online Continual Learning in Image Classification: An Empirical Survey
- [Survey] Deep Learning for Visual Tracking: A Comprehensive Survey
- [Survey] Occlusion Handling in Generic Object Detection: A Review
- [Survey] Curriculum Learning: A Survey
- [Github] Awesome Neural Radiance Fields
- [Survey] Neural Volume Rendering: NeRF And Beyond
- [Survey] Transformers in Vision: A Survey
- [Survey] Efficient Transformers: A Survey
- [Survey] Semantics for Robotic Mapping, Perception and Interaction: A Survey
- [Survey] Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy
Before 2021
- [Dataset] The Replica Dataset: A Digital Replica of Indoor Spaces [github]
- [IROS2021] iGibson 1.0: a Simulation Environment for Interactive Tasks in Large Realistic Scenes [Project]
- [Dataset] Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations [Github]
- [Survey] Skeleton-based Approaches based on Machine Vision: A Survey
- [Survey] Deep Learning-Based Human Pose Estimation: A Survey [Github]
- [Dataset] Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding [Github]
- [Survey] A Review and Comparative Study on Probabilistic Object Detection in Autonomous Driving [Github]
- [Dataset] RELLIS-3D Dataset: Data, Benchmarks and Analysis [Github]
- [Arxiv] Motion Prediction on Self-driving Cars: A Review
- [Github] TESSE: Unity-based simulator to enable research in perception, mapping, learning, and robotics
- [Survey] A Survey on Visual Transformer
- [Survey] A Survey on Contrastive Self-supervised Learning
- [Survey] A Survey of Surface Reconstruction from Point Clouds
- [Dataset] Torch-Points3D: A Modular Multi-Task Framework for Reproducible Deep Learning on 3D Point Clouds [Project]
- [Thesis] Learning to Reconstruct and Segment 3D Objects
- [Survey] An Overview Of 3D Object Detection
- [Survey] A Brief Review of Domain Adaptation
- [Dataset] Announcing the Objectron Dataset
- [Tutorial] Video Action Understanding: A Tutorial
- [Arxiv] Fusion 360 Gallery: A Dataset and Environment for Programmatic CAD Reconstruction [Page]
- [Survey] Multi-Task Learning with Deep Neural Networks: A Survey
- [Survey] Deep Learning for 3D Point Cloud Understanding: A Survey
- [Thesis] Computational Analysis of Deformable Manifolds: From Geometric Modeling to Deep Learning
- [Arxiv] F*: An Interpretable Transformation of the F-measure
- [Dataset] Gibson Database of 3D Spaces
- [BMVC2020] Black Magic in Deep Learning: How Human Skill Impacts Network Training
- [Arxiv] PyTorch Metric Learning
- [Arxiv] RGB-D Salient Object Detection: A Survey [Project]
- [Arxiv] AiRound and CV-BrCT: Novel Multi-View Datasets for Scene Classification [Project]
- [CVPR2020] OASIS: A Large-Scale Dataset for Single Image 3D in the Wild [Project]
- [Arxiv] 3D-FUTURE: 3D FUrniture shape with TextURE
- [Arxiv] 3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics [Project] [Link]
- [Arxiv] Differentiable Rendering: A Survey
- [Arxiv] Visual Relationship Detection using Scene Graphs: A Survey
- [Arxiv] Polarization Human Shape and Pose Dataset
- [Arxiv] IDDA: a large-scale multi-domain dataset for autonomous driving [Project page]
- [CVPR2020] RoboTHOR: An Open Simulation-to-Real Embodied AI Platform [Project page]
- [EG2020] State of the Art on Neural Rendering
- [IJCAI-PRICAI2020] 3D-FUTURE: 3D FUrniture shape with TextURE
- [Arxiv] Toronto-3D: A Large-scale Mobile LiDAR Dataset for Semantic Segmentation of Urban Roadways
- [Arxiv] KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations
- [Arxiv] A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications
- [Arxiv] From Seeing to Moving: A Survey on Learning for Visual Indoor Navigation (VIN)
- [Arxiv] DIODE: A Dense Indoor and Outdoor DEpth Dataset [dataset]
- [Github] Various GANs with Pytorch.
- [Arxiv] SemanticPOSS: A Point Cloud Dataset with Large Quantity of Dynamic Instances [dataset]
- [CVM] A Survey on Deep Geometry Learning: From a Representation Perspective
- [Arxiv] A survey on Semi-, Self- and Unsupervised Techniques in Image Classification
- [Arxiv] fastai: A Layered API for Deep Learning
- [Arxiv] AU-AIR: A Multi-modal Unmanned Aerial Vehicle Dataset for Low Altitude Traffic Surveillance [dataset]
- [Arxiv] VIRTUAL KITTI 2 [dataset]
- [Arxiv] Tutorial on Variational Autoencoders
- [Arxiv] Review: deep learning on 3D point clouds
- [Arxiv] Image Segmentation Using Deep Learning: A Survey
- [CVPR2018] Pixels, Voxels, and Views: A Study of Shape Representations for Single View 3D Object Shape Prediction
- [Arxiv] Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey
- [Arxiv] MCMLSD: A Probabilistic Algorithm and Evaluation Framework for Line Segment Detection
- [Arxiv] Deep Learning for 3D Point Clouds: A Survey
- [Arxiv] A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images
- [Arxiv] A Survey on Deep Learning Architectures for Image-based Depth Reconstruction
- [Arxiv] secml: A Python Library for Secure and Explainable Machine Learning
- [Arxiv] Bundle Adjustment Revisited
- [ICCV2019] Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement
- [Arxiv] SIFT Meets CNN: A Decade Survey of Instance Retrieval
- [ICCV2019] Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data [tensorflow]
- [Arxiv] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks [dataset]
- [Arxiv] Imbalance Problems in Object Detection: A Review [repository]
- [IJCV] Deep Learning for Generic Object Detection: A Survey
- [Arxiv] Differentiable Visual Computing (Ph.D. thesis)
- [BMVC2018] InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset [dataset]
- [ICCV2017] The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes [dataset] [script]
- ⭐ [Arxiv] SynthCity: A large scale synthetic point cloud [dataset]
- [Github] Mesh Voxelization (SDFs or Occupancy grids)
- [Github] SDFGen (to generate grid-based signed distance field (level set))
- [Github] Blender renderer for Python
- [Github] Volumetric TSDF Fusion of RGB-D Images in Python
- [Github] Volumetric TSDF Fusion of Multiple Depth Maps
- [Github] PyFusion
- [Github] PyRender
- [Github] PyMCubes
- [Github] Watertight and Simplified Meshes through TSDF Fusion (Python tool for obtaining watertight meshes using TSDF fusion.)
- [Github] Several tools for working with signed distance functions (SDFs).
- [Github] 3DMatch Toolbox
- [stackoverflow] Computing truncated signed distance function (TSDF) from a point cloud
- [Github] voxblox: A library for flexible voxel-based mapping, mainly focusing on truncated and Euclidean signed distance fields.
- [Github] Discregrid: A static C++ library for the generation of discrete functions on a box-shaped domain. This is especially suited for the generation of signed distance fields.
- [Github] awesome-voxel: Voxel resources for coders
- [Github] gvdb-voxels: Sparse volume compute and rendering on NVIDIA GPUs
- [Github] pyntcloud: A Python library for working with 3D point clouds.
- [Github] Open3D: A Modern Library for 3D Data Processing
- [Github] mesh_to_sdf: Calculate signed distance fields for arbitrary meshes
- [Github] Detecting & Penalizing Mesh Intersections
- [CVPR2021] Picasso: A CUDA-based Library for Deep Learning over 3D Meshes [Github]
- [Github] A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications
- [Arxiv] Shuffler: A Large Scale Data Management Tool for Machine Learning in Computer Vision
- [Arxiv] PyGAD: An Intuitive Genetic Algorithm Python Library [Github]
- [ICRA2014] A Benchmark for RGB-D Visual Odometry, 3D Reconstruction and SLAM [Project]
- [CVPR2016] SceneNet: Understanding Real World Indoor Scenes With Synthetic Data [Project]