Stability AI Launches Stable Audio Open

Stability AI, the company behind the AI-driven art generator Stable Diffusion, has unveiled a new offering: Stable Audio Open. The open model generates sounds and songs and, according to the company, was trained exclusively on royalty-free recordings.

Stable Audio Open takes a text description (e.g., “Rock beat played in a treated studio, session drumming on an acoustic kit”) and produces a recording up to 47 seconds long. Its training data consists of approximately 486,000 samples sourced from free music libraries such as Freesound and the Free Music Archive.

According to Stability AI, the model can create drum beats, instrument riffs, ambient sounds, and other production elements suitable for videos, films, and TV shows. It can also be used to “edit” existing songs or to apply the style of one song (such as smooth jazz) to another.

Stability AI wrote in a post on its corporate blog, “A key benefit of this open source release is that users can fine-tune the model on their own custom audio data. For example, a drummer could fine-tune on samples of their own drum recordings to generate new beats.”

Stable Audio Open has limitations. It cannot generate complete songs, melodies, or vocals, at least not at a satisfactory level. Stability AI says the model isn’t designed for such tasks and recommends that interested users try its premium service, Stable Audio, for these capabilities.

Moreover, Stable Audio Open is not intended for commercial use; its terms of service explicitly forbid it. It also does not perform equally well across musical styles and cultures, or with descriptions written in languages other than English. Stability AI attributes these biases to limitations in the training data.

Stability AI wrote in a description of the model, “The source of data is potentially lacking diversity and all cultures are not equally represented in the data set. The generated samples from the model will reflect the biases from the training data.”

Stability AI has been struggling to revive its business and has recently drawn controversy. Ed Newton-Rex, the company’s VP of generative audio, resigned over his disagreement with the company’s stance that training generative AI models on copyrighted works constitutes “fair use.” The release of Stable Audio Open appears to be an effort to shift the narrative while subtly promoting Stability AI’s paid products.

As music generators, including those from Stability AI, gain popularity, copyright concerns are becoming more prominent. A key issue is the use of copyrighted material, without permission, to train these models.

In May, Sony Music, which represents artists such as Billy Joel, Doja Cat, and Lil Nas X, sent a letter to 700 AI companies warning against “unauthorized use” of its content for training audio generators. And in March, Tennessee enacted the first U.S. law aimed at curbing AI abuses in music.