Text2Tex: Text-driven Texture Synthesis via Diffusion Models
Introduction
We present Text2Tex, a novel method for generating high-quality textures for 3D meshes from the given text prompts. Our method incorporates inpainting into a pre-trained depth-aware image diffusion model to progressively synthesize high resolution partial textures from multiple viewpoints. To avoid accumulating inconsistent and stretched artifacts across views, we dynamically segment the rendered view into a generation mask, which represents the generation status of each visible texel. This partitioned view representation guides the depth-aware inpainting model to generate and update partial textures for the corresponding regions. Furthermore, we propose an automatic view sequence generation scheme to determine the next best view for updating the partial texture. Extensive experiments demonstrate that our method significantly outperforms the existing text-driven approaches and GAN-based methods.
Please also check out the project website here.
For additional detail, please see the Text2Tex paper:
"Text2Tex: Text-driven Texture Synthesis via Diffusion Models"
by Dave Zhenyu Chen, Yawar Siddiqui,
Hsin-Ying Lee,
Sergey Tulyakov, and Matthias Nieรner
from Technical University of Munich and Snap Research.