Stability AI Launches ‘Stable Zero123’ Model that Can Generate 3D Objects from a Single Image
In Brief
Stability AI announced Stable Zero123, a generative AI model that can create 3D images from regular pictures.
Stability AI, the startup behind the image-generating AI system Stable Diffusion, has introduced its latest model, Stable Zero123. The generative AI model, trained in-house, can create 3D images from regular pictures with enhanced quality and efficiency.
According to the company, the newly launched model improves on its predecessors, Zero1-to-3 and Zero123-XL, thanks to better training datasets and techniques. Compared with those models, Stable Zero123 demonstrates a stronger understanding of how an object looks from different angles and produces higher-quality novel views.
The company’s blog post says that Stable Zero123 is based on Stable Diffusion 1.5 and uses the same amount of Video Random Access Memory (VRAM) as that model to generate one novel view. However, Stability AI made it clear that generating full 3D objects with the model demands more time and memory, and it recommends 24GB of VRAM.
An important point to note from the announcement is that the model has been made available only for non-commercial and research purposes, as the company aims to promote innovation within the scientific community.
The company announced that researchers and enthusiasts can now access Stable Zero123 on Hugging Face, facilitating experimentation and exploration of its capabilities.
Setting New Standards in 3D Image Generation
With Stable Zero123, Stability AI aims to advance the field of computer-generated imagery, providing researchers with a tool to explore the possibilities of 3D image generation. To that end, it has improved the training dataset for Stable Zero123. The model uses a filtered training dataset sourced from Objaverse that keeps only high-quality 3D objects.
The company says it rendered these objects more realistically than previous methods did.
During both training and inference, the generative AI model benefits from elevation conditioning. When the model is given an estimated camera elevation angle for the input image, it can make better-informed, higher-quality predictions.
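As a rough illustration of what passing a camera angle to the model can look like, here is a minimal Python sketch. The function name and the four-value layout are assumptions for illustration, loosely modeled on the relative-pose conditioning used in the Zero-1-to-3 line of work; the announcement does not specify the exact format Stable Zero123 uses.

```python
import math


def camera_conditioning(cond_elevation_deg, target_elevation_deg, azimuth_offset_deg):
    """Build a hypothetical 4-value camera conditioning vector.

    cond_elevation_deg:   estimated elevation of the input photo's camera
    target_elevation_deg: elevation of the novel view to generate
    azimuth_offset_deg:   horizontal rotation from the input view to the novel view
    """
    # Polar angle is measured from the vertical axis, so polar = 90 - elevation.
    cond_polar = 90.0 - cond_elevation_deg
    target_polar = 90.0 - target_elevation_deg
    azimuth = math.radians(azimuth_offset_deg)
    return [
        math.radians(target_polar - cond_polar),  # relative polar angle
        math.sin(azimuth),                        # azimuth encoded as sin/cos so
        math.cos(azimuth),                        # that 0 and 360 degrees coincide
        math.radians(cond_polar),                 # absolute angle of the input view
    ]


# Input photo taken from about 10 degrees above the object;
# request a view from the same height, rotated 90 degrees around it.
vector = camera_conditioning(10.0, 10.0, 90.0)
```

The last entry is where an elevation estimate for the input image would enter in this sketch: without it, the model would only know how far to rotate, not where the original camera was.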
Additionally, a pre-computed dataset (pre-computed latents) and an improved data loader, combined with the two improvements described above, led to a 40-fold speed-up in training efficiency compared with its predecessor, Zero123-XL.
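The idea behind pre-computed latents can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Stability AI's training code: `ToyEncoder` is a hypothetical stand-in for an expensive image encoder, and the example only counts how often the encoder runs.

```python
class ToyEncoder:
    """Hypothetical stand-in for an expensive image encoder; counts its calls."""

    def __init__(self):
        self.calls = 0

    def encode(self, image):
        self.calls += 1
        # Toy "latent": the mean of each adjacent pixel pair.
        return [(image[i] + image[i + 1]) / 2 for i in range(0, len(image) - 1, 2)]


def encoder_calls_on_the_fly(images, epochs):
    """Encode every image again in every epoch."""
    encoder = ToyEncoder()
    for _ in range(epochs):
        for image in images:
            encoder.encode(image)
    return encoder.calls


def encoder_calls_precomputed(images, epochs):
    """Encode every image once, then reuse the cached latents."""
    encoder = ToyEncoder()
    latents = [encoder.encode(image) for image in images]
    for _ in range(epochs):
        for latent in latents:
            pass  # a training step would consume the cached latent here
    return encoder.calls


images = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]]
on_the_fly = encoder_calls_on_the_fly(images, epochs=10)
cached = encoder_calls_precomputed(images, epochs=10)
```

With two images and ten epochs, the on-the-fly loop runs the encoder 20 times and the cached loop runs it twice. The sketch shows only where the savings come from; it says nothing about the size of the speed-up the company reports.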
To encourage open research in 3D object generation, Stability AI has improved the open-source code of the threestudio project to support Zero123 and Stable Zero123. A simplified version of the Stable 3D process is currently in private preview, utilizing Score Distillation Sampling (SDS) to optimize a Neural Radiance Field (NeRF) using Stable Zero123.
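For readers unfamiliar with Score Distillation Sampling, the following toy Python sketch shows the core update: add noise to the current rendering, ask a noise predictor what noise it sees, and move the rendering along the weighted difference between predicted and actual noise. Everything here is an assumption made for illustration. The "rendering" is just a list of numbers optimized directly (in practice the gradient is passed back through a renderer to the NeRF's parameters), and `toy_noise_predictor` is a hypothetical stand-in for Stable Zero123 that behaves like an ideal model whose training data is a single target.

```python
import math
import random


def toy_noise_predictor(x_t, alpha, sigma, target):
    """Ideal noise estimate for a 'dataset' that contains only `target`."""
    return [(xt - alpha * tg) / sigma for xt, tg in zip(x_t, target)]


def sds_gradient(x, target, rng):
    """One SDS gradient estimate: weight * (predicted noise - injected noise)."""
    t = rng.uniform(0.02, 0.98)                  # random noise level
    alpha = math.cos(0.5 * math.pi * t)          # signal scale (toy schedule)
    sigma = math.sin(0.5 * math.pi * t)          # noise scale (toy schedule)
    eps = [rng.gauss(0.0, 1.0) for _ in x]       # injected noise
    x_t = [alpha * xi + sigma * e for xi, e in zip(x, eps)]
    eps_hat = toy_noise_predictor(x_t, alpha, sigma, target)
    weight = sigma ** 2                          # weighting choice is an assumption
    return [weight * (eh - e) for eh, e in zip(eps_hat, eps)]


def optimize(x, target, steps=300, lr=0.5, seed=0):
    """Plain gradient descent on the toy 'rendering' using SDS gradients."""
    rng = random.Random(seed)
    for _ in range(steps):
        grad = sds_gradient(x, target, rng)
        x = [xi - lr * g for xi, g in zip(x, grad)]
    return x


target = [0.2, -0.5, 0.9]
result = optimize([0.0, 0.0, 0.0], target)
```

In this toy setting the difference between predicted and injected noise is proportional to the gap between the current values and the target, so repeated updates pull `result` toward `target`. A real pipeline replaces the toy predictor with the diffusion model's output for a given camera angle.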
However, the model is not intended for commercial use. The company emphasized that this release is exclusively for research purposes.
About The Author
Kumar is an experienced Tech Journalist with a specialization in the dynamic intersections of AI/ML, marketing technology, and emerging fields such as crypto, blockchain, and NFTs. With over 3 years of experience in the industry, Kumar has established a proven track record in crafting compelling narratives, conducting insightful interviews, and delivering comprehensive insights. Kumar's expertise lies in producing high-impact content, including articles, reports, and research publications for prominent industry platforms. With a unique skill set that combines technical knowledge and storytelling, Kumar excels at communicating complex technological concepts to diverse audiences in a clear and engaging manner.