What the future holds for sound creativity – MIT Media Lab


By Tod Machover and Charles Holbrow

In his brilliant and provocative 1966 essay, “The Prospects of Recording,” Glenn Gould proposed to elevate – forgive the pun – elevator music from a pernicious drone to a rewarding workout for the ear. In his view, the pervasive presence of background music could subversively sensitize listeners to the building blocks, structural forms and hidden meanings of music, making this art form the universal language of emotion that it was meant to be. In a not unrelated development, Gould had recently swapped the concert hall for the recording studio, a move echoed by the Beatles’ 1967 release of Sgt. Pepper’s Lonely Hearts Club Band, an album designed and produced in a multitrack recording studio and never intended to be played in concert. And while Gould’s dream of transformative elevator music never really came to fruition, it’s clear that from the 1940s to the 1960s – from Les Paul and Mary Ford’s innovative use of overdubbing in “How High the Moon,” to the birth of rock and roll with Chuck Berry’s “Maybellene” in 1955, and on to Schaeffer, Stockhausen, Gould, the Beatles and many more – a whole new art form, made possible by recording and processing sound on magnetic tape, was born.

Today we are at a similar crossroads. Music streaming – and, more generally, music distribution and networking via the Internet – has become the “elevator music” of our time, offering endless songs and sounds, all supposedly tailored to our tastes and primed to make social connections. But many of the current trends are not promising and can even be seen as degrading the potential of music. Algorithmic curation is still primitive and often serves up paler – not bolder – versions of the music we’re supposed to love. Current machine learning techniques for music generation produce generic, composerless pieces that sort of sound like something, but never sound great. And one could argue that the vast potential of the Internet as an artistic medium has not yet resulted in a new type of music, as powerfully different in form and content from what surrounds us as music on magnetic tape was from live performance. In fact, it seems that the Internet and streaming have changed everything in music except the music itself.

The key to harnessing the power of streaming to create something truly new might be to turn the ubiquity and fluidity of the medium into an advantage. Can we meaningfully allow a given piece of music to transform and evolve, with a different impact on each hearing? Can this mutability engage the imagination of artists in new ways? Can listeners – or even the entire environment – play an important collaborative role in building such a culture of “living music”? Several projects underway at the MIT Media Lab, where we work, are exploring various forms that dynamic music streaming could take.

The current paradigm – unchanged in the age of streaming – is to treat a static recording as the final, canonical version of a composition. But a mastered, unchanging, “finished” recording is in reality a limited representation of a composition. It’s not always what artists want either. John Cage and many others invented open forms to allow for multiple compositional (not just expressive) interpretations, and Pierre Boulez revised most of his pieces from year to year, often without leaving a “final” version. After the Beatles stopped recording together in the early 1970s, John Lennon told George Martin that he was not satisfied with their catalog and wanted to re-record everything the band had ever released (especially “Strawberry Fields,” apparently). And of course, before Edison’s first phonograph in 1877, each musical performance was unique by necessity and could never be repeated without variation.

When recorded music was primarily distributed on physical media, finalizing a recording was a critical step. Now that music is mainly distributed over the Internet, this constraint has been lifted. Music can now, once again, be less about the master recording and more about the dialogue between artist and medium, artist and audience, or music and the world itself. Labels and artists have started to scratch the surface of what is possible. Consider the now common pattern: an artist releases a song, and if that song starts to catch on through social media, it’s quickly followed by an acoustic version, a music video, and then countless club remixes. This is a prime example of how a recording can change after it is first published, but it’s currently the only option available within the narrow confines of popular streaming platforms.

In the future, artists will push the concept of evolving music much further. Instead of releasing a static recording, artists could release music that is dynamic, fluid and open to reinterpretation, remixing and re-imagining. This will undoubtedly flourish in many, well, streams – some of which we are currently working on.

A first example experiments with an open approach to music production. Today, conventional pop songs layer dozens, hundreds, and even thousands of different sounds. Prior to a song’s release, the relative volume level of all parts is finalized in a studio during the “mixing” process, in which the song structure, instrumentation, and any additional audio effects are locked into place, resulting in a final arrangement. In the conventional workflow, a mixing engineer is responsible for every tone, level, and effect setting for all of the separate parts. The techniques we are currently developing allow the engineer to share control of mixing and arrangement with intelligent algorithmic processes. The most obvious use of this type of music production software would be to train AI agents to do some of the simpler parts of the mixing process; for example, a software agent might learn to set the balance between the main vocal part of a song and the background. This can help a musician or engineer prepare a song for release more efficiently. But it does not enable a kind of music that is fundamentally different from the original model provided.
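To make the vocal balance example concrete, here is a minimal sketch in Python. It is not the software under development at the Lab: a learning agent is replaced by a fixed rule, and the function names and the `target_db` parameter are hypothetical. The rule measures the level of each part and computes the gain that places the vocal a chosen number of decibels above the background.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def vocal_gain_for_target(vocal, backing, target_db=3.0):
    """Linear gain that places the vocal target_db above the backing.

    Hypothetical helper: a trained agent would predict target_db
    (or the gain itself) rather than take it as a constant.
    """
    v, b = rms(vocal), rms(backing)
    if v == 0.0 or b == 0.0:
        return 1.0  # silence on either side: leave the vocal untouched
    current_db = 20.0 * math.log10(v / b)
    return 10.0 ** ((target_db - current_db) / 20.0)


def mix(vocal, backing, gain):
    """Sum the gain-adjusted vocal with the backing, sample by sample."""
    n = min(len(vocal), len(backing))
    return [gain * vocal[i] + backing[i] for i in range(n)]
```

Calling `vocal_gain_for_target(vocal, backing)` and passing the result to `mix` yields one fixed balance. An agent of the kind described above would instead choose the target from examples of finished mixes.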

The most exciting potential comes from working towards an idea where the music is not the output of such a system, but is in fact the system itself. From this perspective, we could imagine and create a whole range of musical experiences that would not fit in with today’s streaming music paradigms and techniques.

To go beyond this “smart mix” model, Charles is working on an “Evolving Media” environment, through which a musical composition evolves over time. In particular, he is creating a feedback loop that causes a recording to constantly update itself based on how it is consumed and shared on the Internet. To make this possible, he is redesigning several existing technologies, from the software we use to record, synthesize and mix music, to the cloud servers that deliver content to listeners, to the playback apps on listeners’ devices – interconnecting them all on a single iterative platform, enabling the following:

  • Notation and annotation by the artist can be grouped together as enhanced, hyperlinked liner notes.
  • Compositions can be updated or revised, either by the artists or algorithmically.
  • It becomes much more convenient for other artists to remix, cover and collaborate.
  • The system leaves behind a history of how the song evolved, a record of the songwriting process.
  • This “procedural” content could produce “infinite compositions” that evolve forever.
  • It could be that, as with Snapchat, only the current state of the evolving composition is available to listeners or collaborators and is then gone forever, making forward evolution a core – and only partially controllable – part of the composition itself.
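The feedback loop and several of the capabilities listed above can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the Evolving Media platform itself: the `EvolvingTrack` class, its fields, and the idea of reducing listener behavior to one score per stem are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EvolvingTrack:
    """Hypothetical model of a recording that keeps changing after release."""
    stem_gains: dict          # current state: stem name -> linear gain
    ephemeral: bool = False   # if True, past states are not kept
    history: list = field(default_factory=list)

    def revise(self, changes, author):
        """Apply a revision by an artist or an algorithm and log the old state."""
        if not self.ephemeral:
            self.history.append((author, dict(self.stem_gains)))
        self.stem_gains.update(changes)

    def evolve_from_listening(self, feedback, step=0.1):
        """Feedback loop: nudge each stem by a listener score in [-1, 1].

        The scores are assumed to be derived elsewhere from how the
        track is played and shared; stems without a score are unchanged.
        """
        changes = {
            stem: round(gain * (1.0 + step * feedback.get(stem, 0.0)), 6)
            for stem, gain in self.stem_gains.items()
        }
        self.revise(changes, author="algorithm")
```

With `ephemeral=False`, the `history` list plays the role of the record of how the song evolved. With `ephemeral=True`, only the current state survives, as in the Snapchat-like case.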

Another example of an evolving and collaborative composition process is the City Symphony series, developed by Tod and his colleagues in the Opera of the Future group at the MIT Media Lab. Started in collaboration with the Toronto Symphony Orchestra in 2013, these projects develop a sound portrait of a city using both “musical” and “found” sounds, and invite the creative participation of anyone who lives in the place and wishes to contribute. Using the shared experience of the place as a unifying element, the symphonies have established an unusual dialogue between very diverse members of each community, from Perth to Lucerne to Edinburgh, and from Philadelphia to Miami to Detroit, all brought together through Tod’s compositional vision. Special mobile apps have been developed for each city that allow the public to record the sounds they wish to contribute to the project. All of the sounds are geotagged and form a growing sound map of the city. The Constellation software automatically analyzes, organizes and color-codes the collected sounds, bringing them together to be mixed by anyone online with a mouse or finger. These “city mixes” are in turn uploaded to be shared and transformed, creating an ever-changing city soundscape that can be incorporated into the final symphony. Many other applications and online tools have been specially designed for each city – such as Media Scores and live online collaboration sessions – to facilitate the creative participation of the public. The next series of City Symphonies, currently in development, will expand the city model to countries, such as a first-ever collaboration between the citizens of South and North Korea, and a symphony of “global trade” for Dubai that will continue to evolve – publicly and via streaming – far into the future.

Tools are currently being developed here at the Lab to intelligently automate the establishment of meaningful sonic connections between clips collected in a massive database, and between “noise” and “musical” sounds, work that is normally done manually and is impossible to achieve on a large scale. The Media Lab’s “Cognitive Audio” project takes an even more radical approach. Musician-scientists Ishwarya Ananthabhotla and David Ramsay are working on a system that helps generate ever-changing compositions, based on the intriguing sounds around us that we might not even notice. Using cutting-edge research in psychoacoustics, auditory scene analysis, and auditory memory, their software can take hours of ambient sound recorded from the environment, then automatically select and edit the sounds that we are likely to find most interesting and most want to remember. Then, by gauging our mood using preference tests and biometric readings powered by machine learning algorithms, the system produces streaming audio experiences that turn everyday life into an emotional, personalized, musically relevant and memory-enhancing journey.


Kenneth T. Shippee
