Effort Seeks to Democratize Text-to-Image Generation


Without the heavy processing load of past models, this new image generation model creates images in seconds running on consumer GPUs.

Stability AI has released Stable Diffusion to both researchers and the public. The text-to-image generation model runs on consumer GPUs and creates 512×512-pixel images in seconds.

The model speeds up image generation by avoiding the heavy processing load of past models. It is released under a Creative ML OpenRAIL-M license that allows both commercial and non-commercial use. The software package also includes a safety classifier so that users can filter out undesirable outputs.

Both researchers and commercial users are encouraged to provide feedback on the image model and to note discrepancies between inputs and the final images. The organization notes that the models were trained on image-text pairs from a broad internet scrape and may therefore still reproduce some biases. With feedback, the organization is confident it can improve the model to reduce and even eliminate such biases.


The team plans for future datasets to expand generation options

The release also lays the foundation for future datasets and projects. The output will provide the basis for an open synthetic dataset for research. The team will continue to share updates as it refines the new models, and it is still accepting benchmark collaborators to work through any remaining kinks and refine output.

The ultimate goal is to reduce the processing required to build models and enable more developers to use image generation in their own projects. Patrick Esser from Runway and Robin Rombach from the Machine Vision & Learning research group at LMU Munich (formerly CompVis lab at Heidelberg University) led the release, building on their prior work on Latent Diffusion Models presented at CVPR’22. In addition, communities at EleutherAI and LAION, along with Stability AI’s generative AI team, offered full support.

The team has released support materials, including the model card and a public demonstration space.

Elizabeth Wallace

About Elizabeth Wallace

Elizabeth Wallace is a Nashville-based freelance writer with a soft spot for data science and AI and a background in linguistics. She spent 13 years teaching language in higher ed and now helps startups and other organizations explain - clearly - what it is they do.
