CANTE: Software for automatic transcription of flamenco singing.

Given the lack of scores in flamenco music, studies often rely on scant, labour-intensive manual transcriptions. CANTE is a tool for automatic transcription of flamenco singing, developed in the context of the COFLA research project. The software extracts a symbolic note representation of the singing voice melody from a cappella or accompanied flamenco recordings.

The resulting automatic transcriptions are essential to a number of related music information retrieval (MIR) tasks, such as melodic similarity characterisation, similarity-based style recognition, singer identification, and melodic pattern retrieval. They can also aid a broad variety of musicological studies.


N. Kroher and E. Gómez (in press): Automatic Transcription of Flamenco Singing from Polyphonic Music Recordings. IEEE Transactions on Audio Speech and Language Processing.


CANTE is available as a standalone command line tool for Mac OS X (beta version) and as a Python module (recommended).

Python module

CANTE is available as an open source Python module: Download the latest distribution.
The source code is available on GitHub.


  • CANTE depends on numpy/scipy.
  • Melody extraction uses the MELODIA algorithm and requires the Essentia Python bindings. If you wish to use a different pitch tracker, you can provide a .csv file as input instead.
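
For readers who want to supply their own pitch track, the following minimal sketch writes an f0 file in the two-column layout described in the usage notes (time in seconds, pitch in Hz, 128-sample hop at 44.1 kHz). The function name write_f0_csv is hypothetical and not part of CANTE. The comma delimiter and the choice to stamp the first frame at 0 s are assumptions, so compare the result against a file exported from your own tools before relying on it.

```python
import csv

SAMPLE_RATE = 44100  # Hz, the sample rate the usage notes require
HOP_SIZE = 128       # samples, the hop size the usage notes require


def write_f0_csv(path, f0_hz, hop_size=HOP_SIZE, sample_rate=SAMPLE_RATE):
    """Write one (time [s], pitch [Hz]) row per analysis frame.

    Assumptions (not confirmed by the CANTE documentation):
    frame i is stamped at i * hop_size / sample_rate, and the
    columns are comma-separated. Frames given as None are written
    as 0, which marks them as unvoiced.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for i, pitch in enumerate(f0_hz):
            time_s = i * hop_size / sample_rate
            value = 0.0 if pitch is None else float(pitch)
            writer.writerow([f"{time_s:.6f}", f"{value:.3f}"])
```

For an audio file test.wav, the pitch track would be written to test.csv in the same folder, for example write_f0_csv("test.csv", [220.0, None, 440.0]).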


  • Download and unpack the latest distribution.
  • Navigate to the root folder and run: python install


  • Import the module: import cante
  • Basic usage syntax:
    cante.transcribe(filename, acc=True, f0_file=False, recursive=False)
  • Parameters:
    filename: path to the input file or folder.
    acc: True if accompaniment is expected, False for a cappella recordings.
    f0_file: True if a .csv file containing the fundamental frequency is provided.
    recursive: True for folder recursion.
  • The algorithm creates a .csv file containing the estimated note events corresponding to the singing voice melody, where each row corresponds to a note event as follows:
    note onset [seconds], note duration [seconds], MIDI pitch value;
  • Input is a .wav audio file with a sample rate of 44.1 kHz and a bit depth of 16 bits. Otherwise an error is raised.
  • If an f0 file is provided, its filename should match that of the audio file, i.e. for test.wav, a file named test.csv should be located in the same folder. The required format matches the output of Sonic Visualiser and Sonic Annotator: the first column contains the time instants in seconds and the second column holds the corresponding pitch values in Hz. Zero or negative pitch values indicate unvoiced frames. Hop size is restricted to 128 samples for a sample rate of 44.1 kHz.
  • In recursive mode, the algorithm transcribes all .wav files in the provided folder path.
  • For accompanied recordings (i.e. vocals + guitar), an additional contour filtering stage is applied. In this case, set acc=True.
  • Three basic use cases are provided in the ./examples folder.
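
The input and output conventions listed above can be checked with a few lines of standard-library Python. The sketch below is not part of CANTE: check_wav_format, midi_to_hz and read_notes are hypothetical helper names, and the reader assumes that the output .csv is comma-separated with no header row.

```python
import csv
import wave


def check_wav_format(path):
    """Return True if the file is a 44.1 kHz, 16-bit PCM .wav file."""
    with wave.open(path, "rb") as wav:
        # A sample width of 2 bytes corresponds to 16 bits.
        return wav.getframerate() == 44100 and wav.getsampwidth() == 2


def midi_to_hz(midi_pitch):
    """Convert a MIDI pitch number to Hz (A4 = MIDI 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((midi_pitch - 69.0) / 12.0)


def read_notes(path):
    """Read note events as (onset_s, duration_s, midi_pitch) tuples.

    Assumption: comma-separated columns in the order given above,
    one note event per row, no header row.
    """
    notes = []
    with open(path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 3:
                continue  # skip blank or incomplete lines
            onset, duration, pitch = (float(v) for v in row[:3])
            notes.append((onset, duration, pitch))
    return notes
```

A typical session would first call check_wav_format on the recording, then run cante.transcribe(filename, acc=False) for an a cappella file using the signature listed above, and finally pass the resulting .csv to read_notes. The name of the output file is not stated here, so use whatever path the tool actually produced.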

Command line tool

We currently provide a beta version of the command line tool for Mac OS X. The .zip contains a README file explaining the standard usage. Download.

The source code is available on GitHub.