Zero-1-to-3: Zero-shot One Image to 3D Object
A HuggingFace Diffusers implementation of Zero123.
Merged into Diffusers Repo here.
Usage
Pytorch 2.0 for faster training and inference.
conda create -f environment.yml
or
conda create -n zero123-hf python=3.9
conda activate zero123-hf
pip install -r requirements.txt
Install xformer properly to enable efficient transformers.
conda install xformers -c xformers
# from source
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
Run diffusers pipeline demo:
python test_zero1to3.py
Run our gradio demo for novel view synthesis:
python gradio_new.py
Training
Download Zero123's Objaverse Renderings data:
wget https://tri-ml-public.s3.amazonaws.com/datasets/views_release.tar.gz
Configure accelerator by
accelerate config
Launch training:
Follow Original Zero123, fp32, gradient checkpointing, and EMA are turned on.
accelerate launch train_zero1to3.py \
--train_data_dir /data/zero123/views_release \
--pretrained_model_name_or_path lambdalabs/sd-image-variations-diffusers \
--train_batch_size 192 \
--dataloader_num_workers 16 \
--output_dir logs \
--use_ema \
--gradient_checkpointing \
--mixed_precision no
While bf16/fp16 is also supported by running below
accelerate launch train_zero1to3.py \
--train_data_dir /data/zero123/views_release \
--pretrained_model_name_or_path lambdalabs/sd-image-variations-diffusers \
--train_batch_size 192 \
--dataloader_num_workers 16 \
--output_dir logs \
--use_ema \
--gradient_checkpointing \
--mixed_precision bf16
For monitoring training progress, we recommand wandb for its simplicity and powerful features.
wandb login
Acknowledgement
This repository is based on original Zero1to3 and popular HuggingFace diffusion framework diffusers.
Citation
If you find this work useful, a citation will be appreciated via:
@misc{zero123-hf,
Author = {Xin Kong},
Year = {2023},
Note = {https://github.com/kxhit/zero123-hf},
Title = {Zero123-hf: a diffusers implementation of zero123}
}
@misc{liu2023zero1to3,
title={Zero-1-to-3: Zero-shot One Image to 3D Object},
author={Ruoshi Liu and Rundi Wu and Basile Van Hoorick and Pavel Tokmakov and Sergey Zakharov and Carl Vondrick},
year={2023},
eprint={2303.11328},
archivePrefix={arXiv},
primaryClass={cs.CV}
}