Artificial intelligence is shaping the future of music — but at what cost?
The technology is becoming ubiquitous, and it's raising serious social, ethical and political concerns
By Sarah Mackenzie, writer and video producer based in Montreal, telling stories at the intersection of music and technology.
Robot DJs. Algorithmically generated death metal. Scientifically engineered soundscapes for babies. Virtual song contests featuring unprecedented collaborations between humans and machines. Welcome to the uncanny world of music and artificial intelligence (AI), a macrocosm of melodious experiments.
Whether we know it or not, AI has become part of daily life. As soon as you ask Siri to find the nearest coffee shop or let Netflix decide what to watch next, AI has entered the decision-making process. In music, its most common application is music discovery: streaming platforms like Spotify use AI through predictive recommendation engines that spit out personalized playlists based on a user's listening habits. But that's just scratching the surface.
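Recommendation engines like this are often built on collaborative filtering: compare a listener's play history to other listeners', then surface tracks that similar users enjoy. Here is a minimal sketch of the idea, with invented listeners, play counts and track names:

```python
import numpy as np

# Toy play-count matrix: rows are listeners, columns are tracks.
# All names and numbers here are invented for illustration.
tracks = ["Track A", "Track B", "Track C", "Track D"]
plays = np.array([
    [12, 0, 5, 0],   # you
    [10, 1, 7, 0],   # listener 2 (similar taste)
    [0, 9, 0, 14],   # listener 3 (different taste)
])

def cosine(u, v):
    """Similarity between two listeners' play histories."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-9)

# Weight each other listener by similarity to you, then score
# the tracks you haven't played yet.
you, others = plays[0], plays[1:]
weights = np.array([cosine(you, o) for o in others])
scores = weights @ others
for i in np.argsort(-scores):
    if you[i] == 0:
        print(f"Recommend {tracks[i]} (score {scores[i]:.1f})")
```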
AI — a broad area of computer science that enables machines to imitate human learning and problem-solving — has recently seen rapid technical advances, triggering a wave of music production tools that let users with little prior musical experience easily create beats and write lyrics. These platforms analyze millions of songs and, through an AI process known as machine learning, internalize patterns of melody and harmony to create new music in a similar style. Silicon Valley initiatives like Google's Magenta and OpenAI's Jukebox are racing to develop the most powerful iterations of these models. Major labels are also taking the plunge: Warner Music Group is mining streaming data with its AI-driven A&R tool, Sodatone, to find fresh artists to sign to its roster.
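Production systems like Magenta and Jukebox rely on deep neural networks, but the underlying idea of learning a style's statistical patterns and then sampling new material from them can be illustrated with something as simple as a Markov chain over notes. A toy sketch, with an invented training corpus:

```python
import random
from collections import defaultdict

# Toy corpus: each melody is a list of note names (invented examples).
corpus = [
    ["C", "E", "G", "E", "C"],
    ["C", "E", "G", "A", "G", "E"],
    ["E", "G", "A", "G", "E", "C"],
]

# "Training": count which note tends to follow which.
transitions = defaultdict(list)
for melody in corpus:
    for a, b in zip(melody, melody[1:]):
        transitions[a].append(b)

# "Generation": walk the learned transition table to produce
# a new melody that statistically resembles the corpus.
random.seed(7)
note, generated = "C", ["C"]
for _ in range(7):
    note = random.choice(transitions[note])
    generated.append(note)
print(" ".join(generated))
```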
While AI can make the music-creation process more accessible and democratic, critics have raised red flags about how these new technologies undermine human labour and creativity. The increasingly intensive use of AI raises a variety of ethical, political and social concerns. Through this lens, we're left to wonder: what should AI's role in music be?
Racism and colonialism within artificial intelligence
In Montreal, transdisciplinary artist Rouzbeh Shadpey produces immersive sound projects using an AI platform that has cloned his voice. In 2017, Shadpey boldly departed from his career in medicine (in the middle of a psychiatric residency) to pursue music. As an Iranian-Canadian (his parents immigrated to Montreal shortly after he was born), he believes that coloniality and decoloniality are crucial threads to address within the arts. According to Shadpey, dismantling colonial infrastructure within artistic institutions is imperative to creating space for artists of colour to thrive.
"Our work as artists is far removed from the concrete day-to-day realities that colonial violence poses for racialized and Indigenous peoples," he said.
Shadpey's first experience with AI took place when he was studying electroacoustics at Concordia University, an academic environment he experienced as "white and patriarchal." Shadpey, who identifies as a cisgender queer man and has experienced xenophobia throughout his life, felt like an outsider. Not knowing how to code — a skill often required to manipulate AI tools — Shadpey was drawn to vocal cloning platforms, specifically Montreal's Lyrebird, which has a shallow learning curve. After he submitted a reading of a short text, his cloned voice was born. He wrote scripts for his vocal avatar to recite and integrated the sound files into his work, first presented as a multi-channel performance titled The Voice, Suspended Between Us.
In 2019, Shadpey collaborated with Queering the Map, an LGBTQ2I+ community-generated digital mapping platform that archives and geo-locates queer memories. Montreal designer and researcher Lucas LaRochelle, Queering the Map's creator, went on to develop an AI model called QT.bot. Trained on 82,000 text entries from the platform, the AI generates speculative queer futures and the environments in which they occur. For QT.bot's first output, Shadpey provided sound design for a karaoke-style video reel called "Sitting Here With You in the Future." He has also released electronic music under his GOLPESAR moniker for labels like Opal Tapes.
Shadpey is currently readying a spoken-word album recorded through his vocal avatar. The process of duplicating his voice with AI worries Shadpey, as he is providing his data to tech platforms. There's a chance his synthetic voice will be used without his knowledge and consent. Despite this possibility, he sees the exercise as an opportunity to challenge the status quo.
"We need to decolonize our mindset on what the voice is," he emphasized. "By doing this, ironically, more people will be heard, and more people will actually get voices."
AI music models are often built within a homogeneous Silicon Valley startup culture, and a lack of demographic representation in the programming process embeds biases in the resulting algorithms. Initiatives committed to diversity are on the rise, though, such as the Algorithmic Justice League, founded by Canadian computer scientist Joy Buolamwini.
But when there's a profit to be made from music created with data sets trained on existing discographies, who gets a cut? Given the industry's history of underpaying marginalized musicians, AI built on that music risks reproducing the same racialized inequities. Similarly, AI text generators are increasingly used to produce lyrics that emulate those written by humans. These platforms rely on deep learning to generate strings of words from a database of existing songs and texts. GPT-3, an advanced deep learning-based language model developed by OpenAI, is a buzzworthy example, and it is trained mostly on English data sets of texts curated by Western contributors. Lyrics generated this way are bound to amplify exclusionary practices.
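GPT-3 itself sits behind a paid API, but the same flavour of generation can be sketched with its open predecessor, GPT-2, via the Hugging Face transformers library. Everything about the output is shaped by the largely English, largely Western text the model was trained on; the prompt below is invented:

```python
from transformers import pipeline

# GPT-2 stands in here for its API-only successor, GPT-3.
generator = pipeline("text-generation", model="gpt2")

prompt = "Verse one: the city sleeps and the neon hums,"
result = generator(prompt, max_new_tokens=40, do_sample=True, temperature=0.9)
print(result[0]["generated_text"])
```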
Suzanne Kite, an Oglála Lakȟóta musician, performance artist and co-owner of label Unheard Records, asserts the need to adapt AI frameworks to include an Indigenous worldview. Kite creates computational instruments and confrontational musical performances that entangle movements of the body with sound. Her work challenges the way we form relationships with machines.
"I'm very interested in the root problem, which is, who do we extend our humanity to?" explained Kite, currently based in Tulsa, Okla.
For a recent performance at the Blackwood Gallery in Mississauga, titled Pȟehíŋ Kiŋ Líla Akhíšoke (Her Hair was Heavy), Kite built an instrument she refers to as a hair-braid interface, using Wekinator (open-source software for creating musical tools) to trigger AI-generated texts. To produce original lyrics, Kite fed Lakȟóta songs into GPT-2 (GPT-3's predecessor), whose Western makeup was unable to decipher them. There's a message to decode in the unintelligibility of the output.
"I put Lakȟóta songs in there but you can't read them," she explained. "They come out as weird vocables that seem to be mocking the music."
Kite, who is a PhD candidate at Concordia University and a 2019 Trudeau Scholar, is co-author of the essays Making Kin With Machines and The Indigenous Protocol and Artificial Intelligence. In both, she argues that Indigenous ways of creating and retaining knowledge offer a better foundation for building computer hardware. Kite points to the longstanding Lakȟóta philosophy of reciprocal action, which can be applied to the AI boom.
"I learn constantly from Lakȟóta culture and the act of gift-giving, and how that is a mechanism for doing economic good and maintaining relationships," she said. "And reciprocity," she added. "If you can follow reciprocity the whole way, it's going to be fine."
What does the future look like?
Recognizing the business opportunity, technology companies are tuning into the possibilities of AI in music. Montreal-based audio platform LANDR, founded by Pascal Pilon in 2013, uses AI to master half a million songs every month, and has touched hits by artists including Lady Gaga and Wiz Khalifa. Traditionally, musicians hand off the mastering process — the final step of post-production — to sound engineers, who use an array of processors to unify the sound and maintain sonic consistency across an album. Mastering is expensive, and LANDR wants to make the process more accessible. Its AI model analyzes an unmastered song and compares it to previously released tracks with similar production features.
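LANDR's actual models are proprietary, but the core idea of reference-based mastering, measuring a track against comparable released material and correcting toward it, can be sketched in a few lines. Here is a toy loudness match in numpy; real systems also match EQ, stereo width and dynamics:

```python
import numpy as np

def rms(audio):
    """Root-mean-square level of an audio buffer (mono float samples)."""
    return np.sqrt(np.mean(audio ** 2))

def match_loudness(unmastered, reference):
    """Scale the unmastered track so its RMS level matches the reference's."""
    gain = rms(reference) / (rms(unmastered) + 1e-12)
    return np.clip(unmastered * gain, -1.0, 1.0)  # guard against clipping

# Invented signals standing in for decoded audio files.
rng = np.random.default_rng(0)
quiet_mix = 0.1 * rng.standard_normal(44100)  # one second at 44.1 kHz
loud_ref = 0.5 * rng.standard_normal(44100)
mastered = match_loudness(quiet_mix, loud_ref)
print(f"RMS before: {rms(quiet_mix):.3f}, after: {rms(mastered):.3f}")
```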
"We're really trying to make it so that artists can focus on the creative side of things. We're removing the technical barriers," Pilon explained from his home office in Montreal.
LANDR has been criticized for the threat the platform presents to human engineers. Pilon says the company has worked hard to build trust within the industry: "There's always the perception that change is going to be threatening. But what's important is to never take away the job of the artist." This month, LANDR is releasing a new version of its mastering service that resembles the experience of working with a human engineer. After hearing the software's first pass, you can give feedback on what you'd like to hear more or less of, and how to change it — just as you would when giving directions to someone in the studio. The AI then adjusts the master accordingly.
Taking algorithms into their own hands, several boundary-pushing Canadian musicians and producers are teaming up with machines. In Calgary, Shaun Lodestar, a.k.a. HomeSick, has turned to AI to produce Isolation Tape, an audiovisual mixtape released in December 2020. Growing up taking piano lessons and later embedding himself in the West Coast electronic scene, Lodestar first dabbled in AI in late 2019 and became enthralled by the process, despite a steep learning curve. Consistent with his DIY ethos, he cut his teeth navigating pre-existing scripts on GitHub and taught himself the Python programming language. Isolation Tape draws on the musical influence of the high-BPM house genre Chicago juke, delivering a trippy concoction of visceral and mesmeric sounds, produced entirely with AI tools: Magenta generated the drum and synth patterns, while GPT-3 created original lyrical content after Lodestar fed it a collection of poetry.
Isolation Tape features just over 30 minutes of music, accompanied by ripples of frenetic visuals. In both layers, Lodestar uses the AI sound and esthetic to emulate what he described in a recent Zoom interview as "the feeling that you get in a big empty space, when it's meant to be filled with people." To create a unique audiovisual experience for "The Double," he used a pre-trained neural network (a series of algorithms that mimic the way the human brain operates) that had studied thousands of horse images, with the music serving as reactive input to the visuals.
"Over time, I was able to extrapolate the specific latent directions that correspond to whether the horse has a rider on it. I'm able to attach the kick drum to whether or not that is the case. So every time the kick drum hits, you'll see somebody appear on the horse," he explained.
New technologies have pushed art forward, facilitating the birth of musical genres from hip hop to techno. As we head toward a future in which AI will inevitably play a more prominent role, the technology's capabilities, and the intentions behind it, need to be deconstructed and understood. We need to build collective understandings and belief systems around the role technology plays in the way we create. Lodestar plans on integrating AI into his long-term practice, although there is still much to be done.
"There's going to be some really bland work made with AI for the same reason that there's a lot of bland work made by humans," he shared.
A well-oiled machine may be able to produce the perfect pop hit, but above all, it's going to take a human ear to like it.