

The First Song In The World Made Entirely Through Artificial Intelligence


Lyrics, melody, harmony, rhythm, accompaniment, arrangement and vocal performance. And not only that: blockchain, and virtual and holographic singers too.

A SIDI (Swiss Institute for Disruptive Innovation) team has succeeded in demonstrating how it is now possible to use artificial intelligence (AI) to compose, arrange and perform a new piece of music, starting from scratch and sending a single input to the system (the desired genre of the piece).

It is not just an exercise in computer technology: what they are working on could be the beginning of a new approach to the music business.

In practice, SIDI has simulated the connection of different open source and proprietary modules, each with a specific function, creating a "closed" system capable of autonomously generating a song. To this process they applied a deep learning system which, through repeated improvement cycles, constantly refines the result according to the listener's feedback.
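The feedback cycle described above can be illustrated with a minimal sketch. It is not SIDI's system: it swaps the deep learning component for plain hill climbing, and the parameter names (`tempo`, `brightness`), the 0-to-1 ranges and the simulated listener are all assumptions made up for illustration.

```python
import random


def improve(params, rate, cycles=200, step=0.1, seed=0):
    """Keep a small random change to the parameters only if the listener rates it higher."""
    rng = random.Random(seed)
    best = dict(params)
    best_score = rate(best)
    for _ in range(cycles):
        # Nudge every parameter slightly, staying inside the allowed 0..1 range.
        candidate = {
            name: min(1.0, max(0.0, value + rng.uniform(-step, step)))
            for name, value in best.items()
        }
        score = rate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def simulated_listener(p):
    """Hypothetical listener who prefers tempo near 0.7 and brightness near 0.3."""
    return -((p["tempo"] - 0.7) ** 2 + (p["brightness"] - 0.3) ** 2)


start = {"tempo": 0.2, "brightness": 0.9}
tuned, tuned_score = improve(start, simulated_listener)
```

Each cycle proposes a variation and keeps it only when the rating improves, so the score never gets worse; in a real setting the rating would come from actual listeners rather than from a formula.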


We interviewed Pietro Veragouth, creator and coordinator of the project. Mr. Veragouth, what is the purpose of this project?

It is actually the first step toward developing a completely autonomous system capable of composing and performing hyper-personalized music: music that can please anyone, in the true sense of the word.

Is a "machine" that can do this really possible?

In this first phase we have put together the pieces needed for the system to create a song out of nothing and then improve the result. It is a circular process with an intrinsic limitation: a piece cannot be objectively and universally beautiful. The system can therefore be supplied with general parameters within rather wide ranges. As I compose music as a hobby, I know how much a small variation can alter the result. On top of this comes a whole series of considerations that go beyond the ability to generate a sequence of harmonious sounds. These are psychological and cultural, but also linked to factors such as previous musical experience and the different conditions that, to a greater or lesser extent, predispose the listener to a piece.


Can you explain the concept better?

When we listen to music, different chemical reactions are triggered in our brain. When a specific segment of a song gives us a feeling of pleasure, it means that dopamine in particular has been released. Through functional magnetic resonance imaging or, although it has lower resolution, with an electroencephalograph, we can measure this activity in real time and determine which part of the piece induced the release of the neurotransmitter, whether consciously or unconsciously, in which area of the brain and to what extent.

Like a sentence in a text, the exact same sequence of notes can be found in thousands of different songs and yet produce no effect in us. This is because the sensation of pleasure, as happens with sneezing or orgasm, essentially develops in two phases: the first generates tension, which is really a state of ever-increasing discomfort, followed by a release which induces the dopamine production and the feeling of pleasure. As I said before, however, other factors influence this process, such as the fame of the artist, the fact that someone we consider an opinion leader likes the piece, the connection of a sequence of sounds to an unconscious memory previously fixed in the synapses, and so on.

We have set ourselves two objectives for the second phase of this project. The first is to demonstrate that, by applying this type of analysis to the system we have created, it is possible to create virtually perfect music for each individual listener. The second is to "open" the system to the outside, allowing general or selected users to collaborate in improving the song. This is possible because users will be able to give their feedback over the network, following a particular protocol that we are perfecting.

But if the song is created by the computer and improved by the users, who is the author? And who will own it?

This is in fact an important issue, and one we have examined closely. If a song is put on the net, even for the sole purpose of testing its quality, there is a risk that it is intentionally or even unintentionally "stolen" (many pieces remain in the memory of an artist who then unknowingly believes he is their true author). Using traditional certification systems within a circular process like the one I described earlier would be impractical. Hence the idea of using blockchain. In this way, at the very moment in which the piece is conceived (or improved), it is associated with an unalterable "proof of existence". Authorship of the piece will therefore belong to the initial author (or whoever simply starts the process), who can optionally credit the other contributors and set the license for use, for example by adopting one of the models provided by Creative Commons.
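The "proof of existence" idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the protocol SIDI uses: the record fields, the function names and the in-memory list standing in for a blockchain are all hypothetical. Each version of the song is fingerprinted with SHA-256 and linked to the hash of the previous record, so altering any earlier entry breaks the chain.

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def make_record(song: bytes, author: str, prev_hash: str, timestamp: float) -> dict:
    """Build one chain entry linking this version of the song to the previous entry."""
    body = {
        "song_hash": fingerprint(song),
        "author": author,
        "timestamp": timestamp,
        "prev_hash": prev_hash,
    }
    # Hash of the record itself; changing any field above changes this value.
    body["record_hash"] = fingerprint(json.dumps(body, sort_keys=True).encode())
    return body


def verify_chain(chain: list) -> bool:
    """Check each record's own hash and its link to the preceding record."""
    prev = "0" * 64  # placeholder hash used before the first record
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        expected = fingerprint(json.dumps(body, sort_keys=True).encode())
        if rec["record_hash"] != expected or rec["prev_hash"] != prev:
            return False
        prev = rec["record_hash"]
    return True


# Hypothetical example: an initial draft, then an improvement by a contributor.
v1 = make_record(b"first draft of the song", "initial author", "0" * 64, 1000.0)
v2 = make_record(b"improved draft of the song", "contributor", v1["record_hash"], 2000.0)
chain = [v1, v2]
```

A real deployment would publish the record hashes to a distributed ledger rather than keep them in a local list; the sketch only shows why a later change to an earlier version or its author becomes detectable.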

So what can we expect in this area in the future?

What we are actually creating is a sort of ecosystem that also embraces other technologies, such as holograms and virtual reality. With the help of a holographic projector (a technology we have been working on for some years and for which I hold two patents), it is possible to reproduce a singer or an entire band on stage and create a concert similar to a real one, an innovation that is already a success in many Asian countries. On the virtual reality front, which is shaping up to be a major trend, we are developing a module, again based on the same model, for generating virtual performers who, thanks to AI, are able to sing, dance, give interviews and interact with their fans.

Isn't the reality you envisage excessively virtual?

I know it may seem excessive but, to be honest, I think the chances of this failing to materialize, although perhaps in somewhat different terms, are extremely low. It's just a matter of time, and not much of it. Those who are immersed in innovation and in contact with the new generations need only put two and two together. To make a historical comparison, we could relate the gramophone to virtual reality (with the advent of the gramophone the performer-listener relationship was completely depersonalized) and the synthesizer of the 80s to artificial intelligence (the synthesizer almost completely replaced the figure of the musician and made the range of sounds and sound effects infinite).

The Song:

This video is a very short, very rough preview of a work in progress: the face of a virtual singer animated by AI.

Made by:

Swiss Institute for Disruptive Innovation (SIDI)

