Google’s impressive new AI system can generate music in any genre given a text description. But the company, citing the risks, has no immediate plans to release it.
Google’s system, called MusicLM, is certainly not the first AI to generate songs. Earlier attempts include Riffusion, an AI that composes music by visualizing it, as well as Dance Diffusion, Google’s own AudioLM and OpenAI’s Jukebox. But owing to technical limitations and limited training data, none has been able to produce songs that are particularly complex in composition or high in fidelity.
MusicLM is perhaps the first that can.
As detailed in an academic paper, MusicLM was trained on a dataset of 280,000 hours of music to generate songs from descriptions its creators characterize as “extremely complex” (e.g., “a captivating jazz song with an impressive saxophone solo and a solo singer” or “Berlin ’90s techno with a lot of bass and a lot of kick”).
It is hard to overstate how good the samples sound given that there is no musician or instrumentalist in the loop. MusicLM captures nuances like instrumental riffs, melodies and moods, even when fed rather long and winding descriptions.
For example, the caption for the sample below included the bit “induce the experience of getting lost in space,” and it certainly sounds that way (at least to my ears).
Here’s another sample, generated from a description beginning with the sentence “arcade game main soundtrack.” Sounds plausible, right?
MusicLM can do more than generate short clips of songs. The Google researchers show that the system can build on existing melodies, whether hummed, sung, whistled or played on an instrument. Moreover, MusicLM can take several descriptions written in sequence (e.g., “time to meditate,” “time to wake up,” “time to run,” “time to give 100%”) and create a sort of melodic “story” or narrative up to several minutes in length, well suited to a movie soundtrack.
Take the one below, generated from the sequence “electronic music played in a video game,” “meditation song played by the river,” “fire” and “fireworks.”
That’s not to suggest MusicLM is perfect. Some samples have a distorted quality, an unavoidable side effect of the generative process. And while MusicLM can technically produce vocals, including choral harmonies, they leave much to be desired. Most of the “lyrics,” sung by synthesized voices that sound like an amalgamation of several artists, range from barely coherent to pure gibberish.
Still, the Google researchers note the many ethical challenges posed by a system like MusicLM, including a tendency to incorporate copyrighted material from the training data into the generated songs. During their experiments, they found that about 1% of the music the system generated was directly replicated from the songs it was trained on, a threshold apparently high enough to discourage them from releasing MusicLM in its current state.
“We are aware of the potential risks of misuse of creative content associated with our use cases,” the paper’s co-authors wrote, “and strongly emphasize the need for more future work.”
Assuming MusicLM or a system like it one day becomes available, it seems inevitable that significant legal issues will surface. They already have, albeit around simpler AI systems. In 2020, Jay-Z’s record label filed copyright strikes against a YouTube channel, Vocal Synthesis, for using AI to create Jay-Z covers of songs such as Billy Joel’s “We Didn’t Start the Fire.” After initially removing the videos, YouTube reinstated them, finding the takedown requests “incomplete.” The legal ground for deepfaked music, however, remains murky.
A white paper authored by Eric Sunray, now a legal intern at the Music Publishers Association, argues that AI music generators like MusicLM infringe music copyright by creating “tapestries of coherent audio from the works they ingest in training,” thereby violating the reproduction right of the U.S. Copyright Act. Following Jukebox’s release, critics also questioned whether training AI models on copyrighted musical material constitutes fair use. Similar concerns have been raised about the training data used by image-, code- and text-generating AI systems, which is often scraped from the web without the creators’ knowledge.
From a user’s perspective, Waxy’s Andy Baio speculates that music generated by an AI system would be considered a derivative work, in which case only the original elements would be protected by copyright. Of course, it is unclear what counts as “original” in such music, and to use it commercially is to enter uncharted waters. The matter is simpler when generated music is used for purposes protected under fair use, such as parody or commentary, but Baio expects that courts would have to make case-by-case judgments.
It may not be long before there is some clarity on the matter. Several lawsuits making their way through the courts are likely to have a bearing on music-generating AI, including one concerning the rights of artists whose work is used to train AI systems without their knowledge or consent.