AI in the Music Industry – Part 11: Open AI and the GPT Technology

Writer: Peter Tschmuck

Updated: Aug 2, 2024

In 2023, Open AI’s “ChatGPT” caused a sensation. For the first time, AI arrived at the heart of society and became accessible to the general public. GPT technology is not only the basis of the popular chatbot but is also used in the AI software “Jukebox”, which can compose music without human intervention and can even imitate the voices of well-known artists. In part 11 of the “AI in the Music Industry” series, we take a look behind the scenes of Open AI and GPT technology, starting with the experiment that Open AI conducted for Ars Electronica in Linz in 2019: continuing Gustav Mahler’s 10th Symphony with the help of AI.

Let us return to a world-famous unfinished symphonic work: Gustav Mahler’s 10th Symphony. Mahler conceived the symphony in five movements during his summer holidays in Toblach, South Tyrol, in 1910, but only the first movement exists in its entirety as a draft score. The other four movements, whose order the composer had not yet settled, survive as particell (short score) sketches; for the second and third movements, initial score sketches also exist. Preoccupied with other projects and his engagement in New York in the winter of 1910/11, Mahler was unable to complete the Tenth Symphony before his unexpected death on 18 May 1911.[1] Since then, there have been numerous attempts to complete it, the most famous being that of the British musician and musicologist Deryck Cooke, whose completion appeared in four versions between 1960 and 1989.[2]

So the bar was set high when the 2019 Ars Electronica Festival in Linz commissioned Open AI to continue Mahler’s 10th Symphony with its AI, MuseNet. MuseNet is a neural network that recognises harmonic and rhythmic structures in pieces of music and learns to predict the notes that follow.[3] The AI continued the first ten notes of the introductory theme of the Tenth Symphony, and the result was orchestrated by Ali Nikrang of the Ars Electronica Futurelab. The final result was presented on 6 September 2019 by the Bruckner Orchestra Linz under the direction of Markus Poschner.[4]

With this project, Open AI wanted to demonstrate the power of its artificial intelligence. MuseNet uses the same AI technology as GPT-2. GPT stands for Generative Pre-Trained Transformer and refers to a neural network that is trained on large amounts of data to write text on its own. Open AI described the first version of GPT in June 2018; it was still trained in a semi-supervised way on large amounts of text.[5] This was followed in February 2019 by GPT-2, which differed from the original version in that the training was unsupervised, and the AI no longer needed prefabricated language blocks to generate text but used statistical probabilities to determine the next word.[6] The Mahler symphony was continued in a similar way. In May 2020, Open AI presented GPT-3, based on an artificial neural network with 175 billion parameters, whereas GPT-2 was based on “only” 1.5 billion.[7] GPT-3 was also the first commercial project of Open AI, which had previously operated as a kind of research laboratory for artificial intelligence. In June 2020, a programming interface was released to give professional developers access to the software so that they could build new applications.[8] Microsoft took this opportunity to license GPT-3 exclusively, building on the $1 billion it had invested in Open AI in 2019 to co-develop Azure AI supercomputing technology.[9]
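
The idea of using statistical probabilities to determine the next word, or the next note, can be illustrated with a deliberately tiny sketch. This is not how GPT-2 or MuseNet are implemented: those are large transformer networks, whereas the example below is a simple bigram counter that only looks at the previous token. The melody and all function names are made up for illustration and are not taken from Mahler or from Open AI's code.

```python
from collections import Counter, defaultdict


def train_bigram_model(sequence):
    """Count how often each token is directly followed by each other token."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return counts


def next_token_probabilities(counts, token):
    """Turn the raw follow-up counts for `token` into probabilities."""
    followers = counts.get(token, Counter())
    total = sum(followers.values())
    return {tok: n / total for tok, n in followers.items()}


def continue_sequence(counts, seed, length):
    """Greedily append the most probable next token up to `length` times."""
    result = list(seed)
    for _ in range(length):
        followers = counts.get(result[-1])
        if not followers:
            break  # the last token was never seen with a successor
        result.append(followers.most_common(1)[0][0])
    return result


# Hypothetical training melody (made-up note names, not taken from Mahler).
melody = ["C", "D", "E", "C", "D", "G", "C", "D", "E", "F"]
model = train_bigram_model(melody)

probabilities = next_token_probabilities(model, "D")
continuation = continue_sequence(model, ["C"], 2)
print(probabilities)
print(continuation)
```

In this toy melody, "D" is followed by "E" twice and by "G" once, so the sketch continues a seed of "C" with "D" and then "E". A real language or music model replaces the count table with a neural network that conditions on a much longer context and usually samples from the probabilities rather than always taking the most likely token.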

GPT-3 was followed in March 2022 by the interim version GPT-3.5, which used more recent training data and included an internet browser. The real improvement, however, was that the AI could not only continue texts but also insert passages into existing texts.[10] GPT-3.5 gained popularity through ChatGPT, a chatbot that uses artificial intelligence to communicate with users using text and images. ChatGPT was released by Open AI in November 2022 and immediately attracted media attention, as it was the first time users could interact with an AI without prior knowledge. ChatGPT uses a dialogue format to answer questions and communicate with users.[11] It can also produce texts on challenging topics that are hard to distinguish from those written by a human. This immediately drew the attention of critics, who warned of abuse and fraud using AI. In Italy, the data protection authority even temporarily banned ChatGPT,[12] and the European police agency Europol warned that criminals could use chatbots to spread false information and carry out phishing attacks more easily.[13]

The current version, however, is GPT-4, released on 14 March 2023. GPT-4 can process not only text input but also images to generate text output. Overall, GPT-4 is more powerful than its predecessors; it was updated with new training data and serves as the basis for the new version of ChatGPT.[14] Open AI will continue developing the GPT technology over the next few years with billions of dollars in support from Microsoft, and has applied for trademark protection for GPT-5.[15] The open letter published in March 2023 and signed by Open AI co-founder Elon Musk, Apple co-founder Steve Wozniak and 33,000 other signatories so far, which calls for a moratorium on the training of AI systems more powerful than GPT-4, will not change this.[16]

You might ask what all this has to do with music, apart from the fact that ChatGPT can also be used to write song lyrics. As the MuseNet example at the beginning showed, GPT technology can also be used to compose music if the appropriate training data is available. At the end of April 2020, Open AI presented “Jukebox”. This is an artificial neural network trained with more than 1.2 million music samples to generate new pieces of music. The user only has to specify a style of music, a lyric or an artist, and the algorithm generates several sample songs. Once one is selected, Jukebox finalises the piece of music. Depending on the length of the song, this can take several hours. The final result can then be played back purely instrumentally or with vocals.[17] Examples created by the Open AI team, such as rock ’n’ roll in the style of Elvis Presley[18] or a pop song in the style of Frank Sinatra,[19] can be listened to on Soundcloud. The audio quality of the samples is still quite poor, but it is amazing how well the AI can imitate the voices of deceased stars. It’s only a matter of time before Open AI has perfected the music generation AI, just like ChatGPT’s text generation.

Endnotes

[1] The background to the composition of Mahler’s 10th Symphony is taken from Karl-Josef Müller, 1989, Mahler. Leben, Werke, Dokumente, 2nd edition, Mainz: Serie Musik, Piper-Schott, pp 403-429.

[2] Jörg Rothkamm, 2007, “The Tenth Symphony: Analysis of its Composition and ‘Performing Versions’”, in Jeremy Barham (ed.), The Cambridge Companion to Mahler, Cambridge: Cambridge University Press, pp 150–161.

[3] Open AI, “MuseNet”, n.d., accessed: 2024-04-15.

[4] Ars Electronica, “Mahler-Unfinished”, n.d., accessed: 2024-04-15.

[6] Open AI, “GPT 2 – Better language models and their implications”, February 14, 2019, accessed: 2024-04-15.

[7] GPT-3 was described in detail in a scientific paper by Open AI researchers: Tom B. Brown et al., 2020, “Language Models are Few-Shot Learners”, arxiv.org, arXiv:2005.14165, accessed: 2024-04-15.

[8] Open AI, “OpenAI licenses GPT-3 technology to Microsoft”, September 22, 2020, accessed: 2024-04-15.

[10] Open AI, “New GPT-3 capabilities: Edit & insert”, March 15, 2022, accessed: 2024-04-15.

[11] Open AI, “Introducing ChatGPT”, November 30, 2022, accessed: 2024-04-15.

[12] The Verge, “ChatGPT returns to Italy after ban”, April 28, 2023, accessed: 2024-04-15.

[13] Europol, 2023, ChatGPT – The impact of Large Language Models on Law Enforcement, a Tech Watch Flash Report from the Europol Innovation Lab, Publications Office of the European Union, Luxembourg.

[14] Open AI, “GPT 4”, March 14, 2023, accessed: 2024-04-15.

[15] Cybernews, “OpenAI files trademark application for GPT-5”, August 30, 2023, accessed: 2024-04-15.

[16] Future of Life, “Pause Giant AI Experiments: An Open Letter”, March 22, 2023, accessed: 2024-04-15.

[17] Open AI, “Jukebox”, April 30, 2020, accessed: 2024-04-15.

[18] Soundcloud, “Rock, in the style of Elvis Presley – Jukebox”, April 30, 2020, accessed: 2024-04-15.

[19] Soundcloud, “Classic Pop, in the style of Frank Sinatra – Jukebox”, April 30, 2020, accessed: 2024-04-15.
