
Music segmentation

Introduction

Music segmentation can be framed as a change point detection task and can therefore be carried out with ruptures. Roughly, it consists of finding the temporal boundaries of meaningful sections, e.g. the intro, verse, chorus and outro of a song. It is an important task in music information retrieval.

The adopted approach is summarized as follows:

  • the original sound is transformed into an informative (multivariate) representation;
  • mean shifts are detected in this new representation using a dynamic programming approach.
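The second step can be illustrated without any audio. The block below is a minimal sketch, assuming a least-squares (mean-shift) cost and a known number of change points. The names `segment_cost` and `detect_mean_shifts` are hypothetical helpers written for this illustration; they are not part of ruptures, and the library's own implementation may differ.

```python
import numpy as np


def segment_cost(cumsum, cumsum_sq, start, end):
    """Sum of squared deviations from the mean on signal[start:end]."""
    n = end - start
    s = cumsum[end] - cumsum[start]
    return (cumsum_sq[end] - cumsum_sq[start]) - s @ s / n


def detect_mean_shifts(signal, n_bkps):
    """Return the n_bkps change points (plus the signal length) that
    minimize the total within-segment squared error."""
    signal = np.asarray(signal, dtype=float)
    if signal.ndim == 1:
        signal = signal[:, None]  # treat a 1D input as (n_samples, 1)
    n, dim = signal.shape
    # cumulative sums give each segment cost in constant time
    cumsum = np.concatenate((np.zeros((1, dim)), np.cumsum(signal, axis=0)))
    cumsum_sq = np.concatenate(([0.0], np.cumsum((signal**2).sum(axis=1))))
    n_segments = n_bkps + 1

    # best[k, t]: minimal cost of splitting signal[:t] into k segments
    best = np.full((n_segments + 1, n + 1), np.inf)
    last = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for k in range(1, n_segments + 1):
        for t in range(k, n + 1):
            for s in range(k - 1, t):
                cost = best[k - 1, s] + segment_cost(cumsum, cumsum_sq, s, t)
                if cost < best[k, t]:
                    best[k, t] = cost
                    last[k, t] = s  # start of the k-th segment

    # backtrack from the end of the signal to recover the segment ends
    bkps, t = [], n
    for k in range(n_segments, 0, -1):
        bkps.append(int(t))
        t = last[k, t]
    return sorted(bkps)


# toy bivariate signal with mean shifts at samples 20 and 35
toy = np.concatenate(
    [np.zeros((20, 2)), np.full((15, 2), 3.0), np.full((25, 2), -1.0)]
)
print(detect_mean_shifts(toy, n_bkps=2))  # [20, 35, 60]
```

As in ruptures, the last returned index is the number of samples. This triple loop is cubic in the signal length for a fixed number of segments, so it is only meant to show the recursion, not to be run on long signals.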

In this example, we use the well-known tempogram representation, which is based on the onset strength envelope of the input signal, and captures tempo information [Grosche2010].
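To give an intuition of why a tempogram captures tempo, here is a simplified sketch of the underlying idea: a windowed (local) autocorrelation of an onset strength envelope. The function `toy_tempogram` and the synthetic envelope are assumptions made for this illustration only. The function is not the librosa implementation, which should be preferred on real audio.

```python
import numpy as np


def toy_tempogram(onset_envelope, win_length):
    """Local autocorrelation of an onset envelope, one column per frame."""
    envelope = np.asarray(onset_envelope, dtype=float)
    n_frames = envelope.size
    # zero-pad so that every frame has a full window centered on it
    padded = np.pad(envelope, win_length // 2, mode="constant")
    window = np.hanning(win_length)
    tempogram = np.zeros((win_length, n_frames))
    for frame in range(n_frames):
        chunk = padded[frame : frame + win_length] * window
        # autocorrelation for lags 0 .. win_length - 1
        autocorr = np.correlate(chunk, chunk, mode="full")[win_length - 1 :]
        if autocorr[0] > 0:
            tempogram[:, frame] = autocorr / autocorr[0]
    return tempogram


# synthetic envelope: one onset every 8 frames, then one every 5 frames
envelope = np.zeros(200)
envelope[0:100:8] = 1.0
envelope[100::5] = 1.0

tempogram = toy_tempogram(envelope, win_length=64)

# dominant non-zero lag (in frames) in the middle of each half
print(np.argmax(tempogram[1:, 40]) + 1)  # 8
print(np.argmax(tempogram[1:, 150]) + 1)  # 5
```

The dominant lag changes when the onset period changes, so a tempo change in the music shows up as a shift in the columns of the representation. That shift is what the change point detection step looks for.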

To load and manipulate sound data, we use the librosa package [McFee2015].

Setup

First, we make the necessary imports.

import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np
from IPython.display import Audio, display

import ruptures as rpt  # our package

We can also define a utility function.

def fig_ax(figsize=(15, 5), dpi=150):
    """Return a (matplotlib) figure and ax objects with given size."""
    return plt.subplots(figsize=figsize, dpi=dpi)

Load the data

A number of music files are available in librosa; see the librosa documentation for a complete list. In this example, we choose the Dance of the Sugar Plum Fairy from The Nutcracker by Tchaikovsky.

We can listen to the music as well as display the sound envelope.

duration = 30  # in seconds
signal, sampling_rate = librosa.load(librosa.ex("nutcracker"), duration=duration)

# listen to the music
display(Audio(data=signal, rate=sampling_rate))

# look at the envelope
fig, ax = fig_ax()
ax.plot(np.arange(signal.size) / sampling_rate, signal)
ax.set_xlim(0, signal.size / sampling_rate)
ax.set_xlabel("Time (s)")
_ = ax.set(title="Sound envelope")