Music Generation Via A Hidden Markov Model - Part 2

30 Jul 2024

Introduction

In Part 1 of this guide, you built a Jupyter Notebook to generate music sequences via a Hidden Markov Model (HMM). In this Part 2 guide, you will use the Signal Online MIDI Editor to listen to and interact with your generated music. As Signal is a purely web-based (and lightweight) Digital Audio Workstation (DAW), you won't need to install any software. I will also include a handful of ideas for running additional music generation experiments.

Prerequisites and Before You Get Started

To complete the guide, you will need to have:

  • Completed the Part 1 guide along with your own set of generated music MIDI files.

Using Signal

Using Signal is very straightforward for our purposes here.

Loading and Playing a MIDI File

  • Open a browser and navigate to https://signal.vercel.app/.
  • Click the large blue Launch button.
  • In the upper-left corner, click on File to open the pop-up menu.
  • Click Open. When the file explorer window opens, select your MIDI file.
  • To play your generated music, click on the blue play icon button among the bottom edge controls or hit the Space Bar.
  • To reset the track to the beginning after playback, click on the white stop icon button among the bottom edge controls.

You can click on the small horizontal - and + buttons in the lower-right corner of the interface to adjust the horizontal zoom of the MIDI data display area. Additionally, you can click on the vertical - and + buttons along the right edge of the interface to adjust the vertical zoom.

Changing the Tempo

The tempo is the number you see next to BPM along the bottom of the interface. It should be at 120 when you load your file. To change the tempo, simply click on the number field and enter a new value. Lower values result in a slower tempo and higher values result in a faster tempo.

Changing the Playback Instrument

When you load your file, Signal will initialize the playback track with an Acoustic Grand Piano as the playback instrument. To change the instrument, click on Acoustic Grand Piano. This will open a modal where you can select different instruments across different categories.

Analysis of Part 1 Guide Experiments

Assuming you ran the Part 1 notebook on the four test observation sequences that I defined (and used the same Schubert score that I used in that write-up), you will have four MIDI files that you can play and analyze using Signal. To reiterate from the Part 1 guide, you can access the experimental results via Kaggle using the Music Generation with GiantMIDI-Piano dataset, and specifically the hmm_experiments folder within that dataset. The hmm_experiments folder contains four sub-folders that include the results of the four experiments using a Hidden Markov Model (HMM) for music generation. Your results should be the same as those in the corresponding sub-folders if you followed the Part 1 guide as written.

While the following analysis is somewhat subjective - given that musical taste is subjective - here are my thoughts and insights:

  • Observation Sequence 1: I was encouraged by the beginning of the generated music based on this observation sequence. It started out sounding very musical but got "weird" around the 5th measure.

  • Observation Sequence 2: I was happy with the overall result from this observation sequence. It has something of a dramatic feel to it and I think it could easily be used with a longer composition.

  • Observation Sequence 3: The generated music from this sequence was "ok", but struck me in a somewhat neutral way - I didn't think it was all that bad, nor did I think it was all that good.

  • Observation Sequence 4: The result from this observation sequence was the worst of the batch in my opinion. It is discordant and not very musical. It is the kind of dissonant sequence you might hear in a B-grade horror film.

Overall, I found the results from the first and second observation sequences to be most appealing. At the risk of stating the obvious, I created the 4 test observation sequences in the Part 1 guide randomly. So, I had no idea what the HMM would output.

Additional Experiments

The four test sequences in the Part 1 guide, along with the decision to use Schubert's Fantasie in C major as our musical corpus, are just a starting point. There are any number of ways to modify the model and to run new experiments:

Different Sequences

The most obvious modification is to define your own sequence, which varies both in length and composition from what was included in the Part 1 guide. Bear in mind that a longer sequence multiplies more probabilities together, so the values in the Viterbi lattice shrink toward zero as the sequence grows.
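To make the shrinking-probability point concrete, here is a minimal sketch in plain Python. It does not reproduce the Part 1 notebook; the function names and the per-step probability of 0.01 are hypothetical, chosen only to show how a long product of small numbers collapses to zero in floating point while a sum of logarithms stays usable.

```python
import math

def path_probability(step_probs):
    """Multiply per-step probabilities directly (prone to underflow)."""
    prob = 1.0
    for p in step_probs:
        prob *= p
    return prob

def path_log_probability(step_probs):
    """Sum log-probabilities instead; the result stays finite."""
    return sum(math.log(p) for p in step_probs)

# Hypothetical example: 200 steps, each with probability 0.01.
steps = [0.01] * 200
direct = path_probability(steps)       # underflows to 0.0
in_logs = path_log_probability(steps)  # roughly -921.03

print(direct, in_logs)
```

If you run into this with very long observation sequences, one common remedy is to store log-probabilities in the lattice and add them rather than multiply. Since the logarithm preserves ordering, the highest-scoring path is unchanged.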

Different Input Data

The second most obvious modification is to change the input data to the model. You might experiment with scores all written in the same musical key and time signature, along with scores written in different musical keys and/or time signatures. For example, I did some additional experimentation using the 3 Johann S. Bach compositions in the GiantMIDI-Piano/midis_for_evaluation/giantmidi-piano folder of the GiantMIDI-Piano GitHub repository as input data:

  • Bach_Prelude_and_Fugue_in_A-flat_major_BWV862_gCL5Zvnt0TU_a.mid

  • Bach_Prelude_and_Fugue_in_F-sharp_major_BWV_858_lJCpUW1Q1yc_a.mid

  • Bach_Prelude_and_Fugue_in_G-sharp_minor_BWV_863_9tezjkEkzW4.mid

As seen from the filenames, these 3 musical scores are all in different musical keys. I used the same test observation sequences from the Part 1 guide with these scores. One of the results was particularly interesting, especially when I used one of my commercial (i.e., not freely available, unfortunately) synthesizers for playback.

As mentioned in the Training Dataset sub-section of the Part 1 guide, I've uploaded 10,841 of the original dataset's 10,855 MIDI files to Kaggle. The Kaggle dataset can be accessed here. The 14 missing files are due to errors I encountered when unzipping the original dataset archive. Also, please note that the filenames in the Kaggle dataset differ from those in the original GiantMIDI-Piano dataset. It was necessary to change the filenames to overcome upload errors caused by illegal characters in several of the original filenames.

Adding the volume Property to Music Element Objects

You can modify the extract_musical_elements function defined in Step 2.7 of the Part 1 guide to include the volume property and value for note and chord elements. Dynamics are everything in music. For example, you will almost certainly never hear a classical pianist playing a composition at a single "volume". He or she will play some notes/chords softer and others louder - with many degrees of softness and loudness. In fact, the best pianists in the world can press the keys with enough precision and control to generate almost any volume that they want. If that doesn't quite make sense, just trust me that it is an incredibly difficult thing to do and takes years of practice and training to master. :-) By adding the volume property to notes and chords, you will increase the size of the musical vocabulary and, thus, the number of calculations required to build the HMM. However, it may also produce more musical results.
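One way to limit the growth in vocabulary is to group volumes into a few dynamic levels rather than keeping every distinct value. The sketch below is an illustration only: it assumes volume is available as a MIDI velocity (0 to 127) and that elements are represented as string tokens such as "C4". The bucket boundaries, the "@" separator, and both function names are hypothetical and are not taken from the Part 1 notebook.

```python
# Hypothetical dynamic levels; the boundaries are illustrative, not a standard.
DYNAMIC_BUCKETS = [
    (0, 31, "pp"),
    (32, 63, "p"),
    (64, 95, "f"),
    (96, 127, "ff"),
]

def velocity_to_dynamic(velocity):
    """Map a MIDI velocity (0-127) to a coarse dynamic label."""
    if not 0 <= velocity <= 127:
        raise ValueError(f"velocity out of MIDI range: {velocity}")
    for low, high, label in DYNAMIC_BUCKETS:
        if low <= velocity <= high:
            return label

def with_volume(element, velocity):
    """Append a dynamic label to an element token, e.g. 'C4' -> 'C4@ff'."""
    return f"{element}@{velocity_to_dynamic(velocity)}"

print(with_volume("C4", 100))       # C4@ff
print(with_volume("C4.E4.G4", 40))  # C4.E4.G4@p
```

With four buckets, each note or chord can appear in at most four variants, which keeps the increase in vocabulary size bounded.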

Manipulating Emission Matrix Probabilities

The emission matrix probabilities were defined logically:

  • The probability of a musical rest element emitting <REST> is 1 and 0 otherwise.

  • The probability of a musical note element emitting <NOTE> is 1 and 0 otherwise.

  • The probability of a musical chord element emitting <CHORD> is 1 and 0 otherwise.

These probabilities could be modified for specific musical elements to generate more interesting results from the HMM. For example, in the case of a given musical rest element, you could shave some probability mass from <REST> and assign it to <NOTE> and <CHORD>. You could apply the same idea to specific musical note and chord elements. In doing so, the musical elements of the hidden sequence generated by the HMM might not map "cleanly" to the observation sequence, but the musical result might nonetheless be an interesting one.
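As a rough sketch of shaving probability mass, the function below builds one emission row in which the "natural" observation keeps most of the probability and the remainder is split evenly between the other two. The three observation symbols come from the guide; the function name, the leak parameter, and the dictionary representation of a row are assumptions made for illustration and may not match how the Part 1 notebook stores its emission matrix.

```python
OBSERVATIONS = ("<REST>", "<NOTE>", "<CHORD>")

def soften_emission(primary, leak=0.1):
    """Build an emission row giving 1 - leak to `primary` and
    splitting `leak` evenly among the other observation symbols."""
    if primary not in OBSERVATIONS:
        raise ValueError(f"unknown observation: {primary}")
    if not 0.0 <= leak < 1.0:
        raise ValueError("leak must be in [0, 1)")
    others = [obs for obs in OBSERVATIONS if obs != primary]
    row = {obs: leak / len(others) for obs in others}
    row[primary] = 1.0 - leak
    return row

# Hypothetical row for a rest element: most mass stays on <REST>.
rest_row = soften_emission("<REST>", leak=0.1)
print(rest_row)
```

Setting leak to 0 recovers the original one-hot row, so you can apply the change to only a few selected elements and leave the rest untouched.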

Choosing the Most Musical Path Over the Best Path

The HMM, as it is defined, chooses the best (i.e., highest-probability) path from the Viterbi lattice. However, the logic could be modified to choose the most musical path. For example, a note or chord that doesn't belong to a particular musical key could be rejected as the Viterbi lattice is being read out. In such a scenario, a "better" choice can be made with respect to conditions set by the experimenter.
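The key-based rejection idea might look something like the sketch below. It is a simple per-step filter, not a modified Viterbi algorithm: given the candidates at one step, it picks the highest-scoring one whose pitches all belong to the chosen key and falls back to the overall best if none qualify. The token format ("C4", "F#4", "C4.E4.G4", with anything else treated as pitchless) and all function names are hypothetical and would need adapting to the element representation used in the Part 1 notebook.

```python
import re

# Pitch classes of C major; swap in another set for a different key.
C_MAJOR = {"C", "D", "E", "F", "G", "A", "B"}

# Hypothetical note token: letter, optional '#' or '-' (flat), then octave.
NOTE_PATTERN = re.compile(r"^([A-G][#-]?)\d+$")

def pitch_classes(element):
    """Collect pitch classes from a token such as 'C4' or 'C4.E4.G4'."""
    classes = set()
    for part in element.split("."):
        match = NOTE_PATTERN.match(part)
        if match:
            classes.add(match.group(1))
    return classes

def in_key(element, scale=C_MAJOR):
    """True if every pitch in the element belongs to the scale.
    Tokens with no pitches (e.g. rests) always pass."""
    return pitch_classes(element) <= scale

def pick_musical(candidates, scale=C_MAJOR):
    """candidates: list of (element, score) pairs for one step.
    Return the best-scoring in-key element, else the best overall."""
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    for element, _ in ranked:
        if in_key(element, scale):
            return element
    return ranked[0][0]

step = [("F#4", -1.0), ("C4.E4.G4", -2.0), ("REST", -3.0)]
print(pick_musical(step))  # C4.E4.G4
```

Because the filter only looks at one step at a time, the resulting path is no longer guaranteed to be the highest-probability path overall; that trade-off is the point of the experiment.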

Conclusion

I hope you found this experiment using an HMM to generate music to be fun and interesting. As I mentioned at the outset of the Part 1 guide, you are obviously not doing anything here that approaches the capabilities of systems like Suno or Udio. But, as a musician myself, I do think a relatively simple system like what you have built here can be useful in terms of generating musical ideas and perhaps helping you jumpstart your creativity with respect to musical projects.

Also, it's likely you've already read one or two stories of clever users "tricking" applications like Suno and Udio into revealing how they were trained. Unsurprisingly, the answer appears to be that they were trained on large quantities of copyrighted content, which suggests that these technologies, despite being absolutely amazing, may represent plagiarism at scale.

All of the music in the GiantMIDI-Piano dataset is in the public domain. As such, you can feel confident that any musical ideas you might generate against the corpus are uniquely yours and do not infringe on the rights of other artists. As always, thank you for reading, and happy building! :-)