An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering
University of Novi Sad  University of Cambridge
Citation
In addition to AIAI 2021, our paper appears in the Springer book "Artificial Intelligence Applications and Innovations": link
Please cite our paper if you find this code useful for your research.
@article{zubic2021effective,
title={An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering},
author={Zubi{\'c}, Nikola and Li{\`o}, Pietro},
journal={arXiv preprint arXiv:2103.03390},
year={2021}
}
Prerequisites
- Download the code:
Clone the repository with the following command:
git clone https://github.com/NikolaZubic/2dimageto3dmodel.git
- Open the project in a Conda environment (Python 3.7).
- Install packages:
conda install pytorch torchvision torchaudio cudatoolkit=11.0 -c pytorch
Then git clone the Kaolin library into the root (2dimageto3dmodel) folder, check out the commit given below, and run the following commands:
cd kaolin
git checkout e7e513173b
python setup.py install
pip install --no-dependencies nuscenes-devkit opencv-python-headless scikit-learn joblib pyquaternion cachetools
pip install packaging
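After installation, a quick sanity check can confirm that the packages are importable. The following is a minimal sketch and not part of the repository: the helper name `check_packages` is hypothetical, and the module names listed (for example `cv2` for opencv-python-headless, `sklearn` for scikit-learn, and `nuscenes` for nuscenes-devkit) are assumed to be the import names of the distributions installed above.

```python
import importlib.util

# Import names for the packages installed above. The mapping from
# pip/conda distribution names to module names is an assumption.
REQUIRED_MODULES = [
    "torch", "torchvision", "kaolin", "nuscenes", "cv2",
    "sklearn", "joblib", "pyquaternion", "cachetools", "packaging",
]


def check_packages(module_names):
    """Return a dict mapping each top-level module name to True if it is importable."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


def print_report(module_names=REQUIRED_MODULES):
    """Print one line per module, marking any that cannot be found."""
    for name, found in check_packages(module_names).items():
        print("{:15s} {}".format(name, "ok" if found else "MISSING"))
```

Calling `print_report()` inside the activated Conda environment lists any module that still needs to be installed, without actually importing the heavy libraries.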
Run the program
Run the following commands from the root/code/ (2dimageto3dmodel/code/) directory:
python main.py --dataset cub --batch_size 16 --weights pretrained_weights_cub --save_results
for the CUB Birds Dataset.
python main.py --dataset p3d --batch_size 16 --weights pretrained_weights_p3d --save_results
for the Pascal 3D+ Dataset.
The results will be saved in the 2dimageto3dmodel/code/results/ directory.
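When both datasets are processed regularly, the two commands can be generated from one place. This is a sketch under stated assumptions rather than a script shipped with the repository: the function names are hypothetical, only the flags shown in this README are used, and the weights name is assumed to follow the `pretrained_weights_<dataset>` pattern seen in the commands above.

```python
import subprocess

DATASETS = ("cub", "p3d")  # CUB Birds and Pascal 3D+


def build_main_command(dataset, batch_size=16, save_results=True):
    """Build the main.py command line for one dataset as an argument list."""
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: {!r}".format(dataset))
    cmd = [
        "python", "main.py",
        "--dataset", dataset,
        "--batch_size", str(batch_size),
        "--weights", "pretrained_weights_{}".format(dataset),
    ]
    if save_results:
        cmd.append("--save_results")
    return cmd


def run_for_all_datasets(code_dir="code", save_results=True):
    """Run main.py once per dataset; code_dir assumes the repository root as the working directory."""
    for dataset in DATASETS:
        subprocess.run(
            build_main_command(dataset, save_results=save_results),
            cwd=code_dir,
            check=True,  # stop at the first failing run
        )
```

Passing `save_results=False` produces the same commands without `--save_results`, which is the variant used to continue training.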
Continue training
To continue the training process, run the following commands (without --save_results) from the root/code/ (2dimageto3dmodel/code/) directory:
python main.py --dataset cub --batch_size 16 --weights pretrained_weights_cub
for the CUB Birds Dataset.
python main.py --dataset p3d --batch_size 16 --weights pretrained_weights_p3d
for the Pascal 3D+ Dataset.
Generation of Pseudo-ground-truths
These reconstruction steps require a trained mesh estimation model. We can either use the provided pre-trained model or train one from scratch. The pseudo-ground-truth data for CUB birds is generated as follows:
python run_reconstruction.py --name pretrained_reconstruction_cub --dataset cub --batch_size 10 --generate_pseudogt
For Pascal 3D+ dataset:
python run_reconstruction.py --name pretrained_reconstruction_p3d --dataset p3d --optimize_z0 --batch_size 10 --generate_pseudogt
Running these commands replaces the cache directory, which contains pre-computed statistics for evaluating the Fréchet Inception Distance (FID), pose and image metadata, and the pseudo-ground-truths for each image.
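For readers unfamiliar with the metric, the pre-computed statistics are the kind of quantities the Fréchet distance compares: a mean and a covariance per image set. In general the squared distance between two Gaussians is ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2)). The sketch below is illustrative only and is not the repository's implementation: it assumes diagonal covariances, for which the trace term reduces to a sum of squared differences of standard deviations, whereas real FID uses full covariance matrices of Inception features. The function name is hypothetical.

```python
import math


def frechet_distance_diag(mean_a, var_a, mean_b, var_b):
    """Squared Frechet distance between two Gaussians with diagonal covariance.

    mean_*: per-dimension means; var_*: per-dimension variances (non-negative).
    """
    if not (len(mean_a) == len(var_a) == len(mean_b) == len(var_b)):
        raise ValueError("all inputs must have the same length")
    # ||mu_a - mu_b||^2
    mean_term = sum((ma - mb) ** 2 for ma, mb in zip(mean_a, mean_b))
    # Tr(S_a + S_b - 2 (S_a S_b)^(1/2)) for diagonal S_a, S_b
    cov_term = sum(
        (math.sqrt(va) - math.sqrt(vb)) ** 2 for va, vb in zip(var_a, var_b)
    )
    return mean_term + cov_term
```

Identical statistics give a distance of zero, and the value grows as either the means or the spreads of the two feature distributions diverge, which is why lower FID indicates generated images closer to the real ones.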
Mesh generator training from scratch
Set up the Pseudo-ground-truth data as described in the section above, then execute the following command:
python main.py --name cub_512x512_class --conditional_class --dataset cub --gpu_ids 0,1,2,3 --batch_size 32 --epochs 1000 --tensorboard
Here, we train a CUB birds model, conditioned on class labels, for 1000 epochs. FID is evaluated every 20 epochs (this interval can be changed with --evaluate_freq). Using a different number of GPUs can produce slightly different results. With --tensorboard, the results are exported to TensorBoard's log directory tensorboard_gan.
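To get a feel for how often FID is computed during such a run, the schedule can be enumerated. This is a sketch under assumptions that are not confirmed by the source: epochs are taken to be counted from 1, and an evaluation is assumed to happen whenever the epoch number is a multiple of the evaluation frequency. The function name is hypothetical.

```python
def evaluation_epochs(total_epochs=1000, evaluate_freq=20):
    """List the epochs at which FID would be evaluated under the assumptions above."""
    if evaluate_freq <= 0:
        raise ValueError("evaluate_freq must be positive")
    # Multiples of evaluate_freq up to and including total_epochs.
    return list(range(evaluate_freq, total_epochs + 1, evaluate_freq))


schedule = evaluation_epochs()
print(len(schedule), schedule[:3], schedule[-1])
```

Under these assumptions the default settings give 50 evaluations over the 1000 epochs, so raising --evaluate_freq trades finer tracking of FID for less time spent on evaluation.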
After training, we can find the best model's checkpoint with the following command:
python main.py --name cub_512x512_class --conditional_class --dataset cub --gpu_ids 0,1,2,3 --batch_size 64 --evaluate --which_epoch best
Mesh estimation model training
Use the following two commands for training from scratch:
python run_reconstruction.py --name pretrained_reconstruction_cub --dataset cub --batch_size 50 --tensorboard
python run_reconstruction.py --name pretrained_reconstruction_p3d --dataset p3d --optimize_z0 --batch_size 50 --tensorboard
TensorBoard log files are saved in tensorboard_recon.
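The run_reconstruction.py commands in this README differ only in the dataset, the batch size, and whether they train or generate pseudo-ground-truths, with --optimize_z0 appearing only for Pascal 3D+. The sketch below captures that pattern; it is not part of the repository, the function name is hypothetical, and it uses only flags that appear in this README.

```python
def build_reconstruction_command(dataset, batch_size,
                                 generate_pseudogt=False, tensorboard=False):
    """Build a run_reconstruction.py command line as an argument list."""
    if dataset not in ("cub", "p3d"):
        raise ValueError("unknown dataset: {!r}".format(dataset))
    cmd = [
        "python", "run_reconstruction.py",
        "--name", "pretrained_reconstruction_{}".format(dataset),
        "--dataset", dataset,
    ]
    if dataset == "p3d":
        # The README passes this flag only for Pascal 3D+.
        cmd.append("--optimize_z0")
    cmd += ["--batch_size", str(batch_size)]
    if generate_pseudogt:
        cmd.append("--generate_pseudogt")
    if tensorboard:
        cmd.append("--tensorboard")
    return cmd


# Training from scratch on CUB, as in the first command of this section.
print(" ".join(build_reconstruction_command("cub", 50, tensorboard=True)))
```

Calling it with `generate_pseudogt=True` and a batch size of 10 reproduces the pseudo-ground-truth generation commands instead of the training ones.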
License
MIT
Acknowledgment
This work builds on the architecture of Insafutdinov & Dosovitskiy.
Poisson Surface Reconstruction was used for Point Cloud to 3D Mesh transformation.
The GAN architecture (used for texture mapping) is a mixture of Xian's TextureGAN and Li's GAN.