Search results
Results from the WOW.Com Content Network
Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of Stability AI and is considered to be a part of the ongoing artificial intelligence boom.
An image conditioned on the prompt an astronaut riding a horse, by Hiroshige, generated by Stable Diffusion 3.5, a large-scale text-to-image model first released in 2022. A text-to-image model is a machine learning model which takes an input natural language description and produces an image matching that description.
AUTOMATIC1111 Stable Diffusion Web UI (SD WebUI, A1111, or Automatic1111 [3]) is an open source generative artificial intelligence program that allows users to generate images from a text prompt. [4] It uses Stable Diffusion as the base model for its image capabilities together with a large set of extensions and features to customize its output.
Stable Diffusion 3 (2024-03) [65] changed the latent diffusion model from the UNet to a Transformer model, and so it is a DiT. It uses rectified flow. It uses rectified flow. Stable Video 4D (2024-07) [ 66 ] is a latent diffusion model for videos of 3D objects.
As of August 2023, more than 15 billion images had been generated using text-to-image algorithms, with 80% of these created by models based on Stable Diffusion. [170] If AI-generated content is included in new data crawls from the Internet for additional training of AI models, defects in the resulting models may occur. [171]
[20] [21] Users retained the ownership of resulting output regardless of models used. [22] [23] The models can be used either online or locally by using generative AI user interfaces such as ComfyUI and Stable Diffusion WebUI Forge (a fork of Automatic1111 WebUI). [8] [24] An improved flagship model, Flux 1.1 Pro was released on 2 October 2024.
In August 2022, the company co-released an improved version of their Latent Diffusion Model called Stable Diffusion together with the CompVis Group at Ludwig Maximilian University of Munich and a compute donation by Stability AI. [14] [15] On December 21, 2022 Runway raised US$50 million [16] in a Series C round.
The LDM is an improvement on standard DM by performing diffusion modeling in a latent space, and by allowing self-attention and cross-attention conditioning. LDMs are widely used in practical diffusion models. For instance, Stable Diffusion versions 1.1 to 2.1 were based on the LDM architecture. [4]