What is AudioCraft, Meta's new AI tool?

Meta has released a new music generator, AudioCraft, which uses artificial intelligence to create music or sound effects.

Photo: Unsplash

AudioCraft is an open-source program that creates sound effects and music from text prompts, much as AI image or video generators do with pictures and clips. AudioCraft has three models available:

MusicGen for composing music
AudioGen for creating sound effects
EnCodec to help with audio compression

MusicGen was already known among music creators and AI hobbyists. Now Meta has released the code for this model, which allows users to train it on their own music data. Understandably, ethical and legal questions arose at once, since music publishers have reported most AI-generated music as infringing intellectual property.

Video: Meta

Meta stated that it trained the default model only on music that the company owns or has licensed: 20,000 hours of audio and 400,000 recordings, along with text descriptions and metadata, all from the Meta Music Initiative Sound Collection, Shutterstock and Pond5. Meta also removed all the vocals before the release, to prevent imitation of the creators' voices.

The second model, AudioGen, is dedicated to creating ambient sounds and sound effects. It is sometimes described as diffusion-based, like most modern image generators (DALL-E 2, Stable Diffusion...), but the AudioGen that Meta published is an autoregressive transformer: it predicts compressed audio tokens one after another, guided by the text prompt. In diffusion, by contrast, the model starts from data that is entirely noise, whether audio or images, and learns to de-noise it incrementally, moving it step by step closer to the target prompt.
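The step-by-step de-noising idea can be illustrated with a toy loop. This is only a sketch under loud assumptions: the function names are hypothetical, the "target" is a known sine wave standing in for what a prompt asks for, and the update rule is simple interpolation. A real diffusion model has no access to the target and uses a trained network to estimate the noise at each step, and none of this is Meta's code.

```python
import math
import random


def toy_denoise(target, steps=50, seed=0):
    """Move a pure-noise signal toward `target` in small steps.

    Illustration only: `target` plays the role of the prompt. A real
    diffusion model replaces this oracle with a trained neural network.
    """
    rng = random.Random(seed)
    # Start from pure Gaussian noise, one value per audio sample.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    history = [x]
    for step in range(steps):
        remaining = steps - step
        # Remove an equal share of the remaining error at every step,
        # so the final step lands on the target.
        x = [xi + (ti - xi) / remaining for xi, ti in zip(x, target)]
        history.append(x)
    return history


def mean_abs_error(signal, target):
    """Average absolute distance between a signal and the target."""
    return sum(abs(s - t) for s, t in zip(signal, target)) / len(target)


# A short sine wave stands in for the "clean" audio we want to reach.
target_wave = [math.sin(2 * math.pi * i / 16) for i in range(64)]
denoise_history = toy_denoise(target_wave)
```

Comparing `mean_abs_error(denoise_history[0], target_wave)` with the same value for later entries shows the signal getting closer to the target at every step, which is the intuition behind the description above.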

In addition to effects, AudioGen was also designed to generate speech, which Meta admits could be misused to spoof voices. Despite these concerns, Meta has not, at least for now, placed specific restrictions on the ways AudioCraft can be used.

The third model, EnCodec, is a neural audio codec. The version in AudioCraft improves on Meta's previous model, so that music can be generated with fewer artifacts. Meta says it models audio sequences more efficiently and captures different levels of information from the audio waveforms in the training data, which helps when creating new audio.

Meta envisions AudioCraft as a tool for musicians and creators, who could produce new compositions without having to physically play instruments. It is also aimed at developers on a limited budget, who could use AudioCraft to create sounds for virtual worlds, and at Instagram and TikTok creators, who could generate the most fitting sounds for their posts.

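At least for now, AudioCraft cannot be used freely in commercial products: the code is released under the MIT license, but the model weights are under CC-BY-NC 4.0, which does not allow commercial use.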

How do you install and test the AudioCraft AI tool?

The code is on GitHub, and you have several options for installation. You can use the Pinokio program (https://pinokio.computer), which will more or less automatically install the AI music tool for you. Select the AudioGradio module from its library and install it (this takes a few minutes), and you will end up with a local IP address where you can test AudioCraft.

Other methods require Python, pip, Anaconda, Miniconda or similar tools to be installed first. A good, easy-to-understand guide for Miniconda was published on GitHub (https://bit.ly/GHglasba) by the user mberman84. The end result is the same: you get an IP address that you enter into your browser, and you can start experimenting.
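Once the tool is installed, it can also be driven from Python instead of the browser. The following is a minimal sketch, assuming the audiocraft Python package is installed and that the small MusicGen checkpoint `facebook/musicgen-small` can be downloaded; neither is covered by the guides above. The helper names `output_stem` and `generate_clips` are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def output_stem(out_dir, index):
    """Build the file name (without extension) for the clip at `index`."""
    return str(Path(out_dir) / f"clip_{index}")


def generate_clips(prompts, duration_s=8, model_name="facebook/musicgen-small",
                   out_dir="."):
    """Generate one audio clip per text prompt and save each as a WAV file.

    Assumes the audiocraft package is installed. The imports sit inside the
    function so this file still loads on machines without audiocraft.
    """
    from audiocraft.models import MusicGen
    from audiocraft.data.audio import audio_write

    # Download (on first use) and load the pretrained model.
    model = MusicGen.get_pretrained(model_name)
    # Length of each generated clip, in seconds.
    model.set_generation_params(duration=duration_s)
    # One waveform per prompt, returned as a batch of tensors.
    wavs = model.generate(list(prompts))

    stems = []
    for index, wav in enumerate(wavs):
        stem = output_stem(out_dir, index)
        # audio_write appends the .wav extension and normalizes loudness.
        audio_write(stem, wav.cpu(), model.sample_rate, strategy="loudness")
        stems.append(stem)
    return stems
```

Calling `generate_clips(["lo-fi beat with soft piano"])` should write a file named clip_0.wav to the current folder. The first run downloads the model, so it can take a while, and generation is slow without a GPU.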