[Figure: gallery of results. Panel labels: Original model (Sabrewulf), Cobalt metal, Titanium metal, African chameleon, Ant, Bald eagle, Bighorn, Chromium metal, Cock, Orchid plant, Shark, Fire salamander; Original model (Kinni), Aluminum metal, Egyptian cat, Theater building, Gold, Tomato, Lily plant, Indigo bunting, Broccoli, Sycamore tree, Triceratops, Orchid plant; Original model (Apatosaurus); Original model (Devil); Original model (Nesting doll); Original model (Tiger), Banana, Kit fox, Cock, Giraffe, Gold, Gazelle, Flamingo, African chameleon, King penguin, Polar bear, Peacock, King penguin.]
In this paper, we tackle a new task of 3D object synthesis, in which a 3D model is combined with the text of another object to create a novel 3D model. However, most existing text-, image-, or 3D-to-3D methods struggle to effectively integrate multiple content sources, often producing inconsistent textures and inaccurate shapes. To overcome these challenges, we propose a straightforward yet powerful approach, Text+3D-to-3D (T33D), for generating novel and compelling 3D models. Our method first renders multi-view images and normal maps from the input 3D model, and then generates a novel, surprising 2D object using ATIH, taking the front-view image and the object text as inputs. To ensure texture consistency, we introduce texture multi-view diffusion (TMDiff), which refines the textures of the remaining multi-view RGB images based on the novel 2D object. To enhance shape accuracy, we propose shape multi-view diffusion (SMDiff), which improves the 2D shapes of both the multi-view RGB images and the normal maps, also conditioned on the novel 2D object. Finally, these outputs are used to reconstruct a complete and novel 3D model. Extensive experiments demonstrate the effectiveness of our method, yielding impressive 3D creations.
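To make the pipeline concrete, the minimal Python sketch below composes the five stages in order; the callables render_views, atih_fuse, tmdiff_refine, smdiff_refine, and reconstruct are hypothetical placeholders standing in for the renderer, ATIH, TMDiff, SMDiff, and the multi-view reconstructor, not the actual implementation.

from typing import Callable

def t33d_pipeline(
    mesh,
    object_text: str,
    render_views: Callable,    # mesh -> (multi-view RGB images, normal maps)
    atih_fuse: Callable,       # (front-view RGB, object text) -> novel 2D object
    tmdiff_refine: Callable,   # (RGB views, condition) -> texture-refined RGB views
    smdiff_refine: Callable,   # (RGB views, normals, condition) -> shape-refined views
    reconstruct: Callable,     # (RGB views, normals) -> 3D model
):
    """Compose the five T33D stages described in the abstract.

    All five callables are hypothetical placeholders; they do not
    correspond to a released API.
    """
    # 1. Render multi-view RGB images and normal maps from the input 3D model.
    rgbs, normals = render_views(mesh)

    # 2. Fuse the front-view image with the object text via ATIH to obtain
    #    the novel 2D object that conditions the later stages.
    novel_front = atih_fuse(rgbs[0], object_text)

    # 3. TMDiff: refine the textures of the remaining views, conditioned on
    #    the novel 2D object, for texture consistency.
    rgbs = tmdiff_refine(rgbs, condition=novel_front)

    # 4. SMDiff: improve the 2D shapes of the RGB views and normal maps,
    #    also conditioned on the novel 2D object, for shape accuracy.
    rgbs, normals = smdiff_refine(rgbs, normals, condition=novel_front)

    # 5. Reconstruct a complete, novel 3D model from the refined views.
    return reconstruct(rgbs, normals)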
Our TMDiff achieves better texture consistency than ATIH.
Our SMDiff achieves better shape accuracy than Era3D.
We observe that Era3D, CRM, LGM, and VFusion3D
struggle with inconsistent textures and inaccurate shapes in the generated 3D object models. In contrast, our method successfully
synthesizes novel 3D objects, such as the Cat-Cock and Tiger-Egyptian Cat shown in the first and fourth columns, respectively.
We observe that ThemeStation produces 3D object models with inconsistent textures and inaccurate shapes, whereas our method successfully generates coherent and novel 3D object syntheses.