Stability AI has unveiled Stable Diffusion 3 Medium (SD3 Medium), a smaller yet still capable model that generates images from text. It builds on the earlier Stable Diffusion 3, which is known for its strong image generation capabilities.
Targeting Users with Limited Resources
Stable Diffusion 3 Medium targets individual users and organizations with limited resources, offering them a high-performance image generation system. It is currently available through an API via the Stable Artisan server on Discord, and the model's weights can also be downloaded from Hugging Face for non-commercial use.
Differentiating from the Previous Model
The previous version, now renamed Stable Diffusion 3 (SD3) Large, has 8 billion parameters. The newer SD3 Medium has just 2 billion parameters and runs on consumer-grade graphics cards, needing as little as 5GB of video memory. Previously, running Stable Diffusion models required high-end Nvidia graphics cards. The company still recommends a card with 16GB of video memory for optimum performance.
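The link between parameter count and video memory can be illustrated with a back-of-the-envelope calculation. The Python sketch below assumes the weights are stored as 16-bit floats (2 bytes per parameter), which is an assumption for illustration rather than a figure published by Stability AI. It counts only the weights of the models named above and ignores any other components and working memory used during generation, so real usage will be higher.

```python
# Rough estimate of weight storage only.
# The bytes-per-parameter values are assumptions, not vendor specifications.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Return the approximate size of the weights in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Parameter counts as reported in the article.
models = {"SD3 Medium": 2e9, "SD3 Large": 8e9}

for name, params in models.items():
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights at fp16")
```

Under this assumption, SD3 Medium's weights come to about 4 GB, which is consistent with the 5GB minimum quoted above, while SD3 Large's come to about 16 GB, beyond the memory of many consumer cards.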
Features and Capabilities
Despite its relatively low resource requirements, SD3 Medium offers comprehensive functionality. Its listed strengths are photorealism, typography, natural language understanding, spatial awareness in image composition, high adaptability, and support for fine-tuning. These capabilities are comparable to those of the larger, older SD3 Large.