Category Archives: DSP

Audio Feature Extraction Toolboxes

I have recently been working on an evaluation of audio feature extraction toolboxes, and have had a paper accepted to DAFx on the subject. While there are a range of ways to analyse and evaluate each feature extraction toolbox, computation time is an effective evaluation metric, especially as the MIR community looks at larger and larger data sets. 16.5 hours of audio (8.79 GB) was analysed, and the MFCCs were extracted using eight different feature extraction toolboxes. The computation time for every toolbox was captured, and can be seen below.

Toolbox       Time (s)
Aubio         742
Essentia      252
jAudio        840
Librosa       3216
LibXtract     395
Marsyas       526
MIR Toolbox   1868
YAAFE         211
The MFCCs were chosen as they are implemented in nine of the ten toolboxes considered, and so provide a good basis for comparing computational efficiency. All MFCCs were calculated with a 512-sample window size and a 256-sample hop size. The input audio covers a variety of sample rates and bit depths, to confirm that each tool can handle variable input file formats. The test was run on a MacBook Pro with a 2.9 GHz i7 processor and 8 GB of RAM.
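To illustrate what each toolbox is computing, here is a minimal sketch of the standard MFCC pipeline (frame, window, power spectrum, mel filterbank, log, DCT) in plain NumPy. This is not code from any of the toolboxes tested; the 512-sample window and 256-sample hop match the experiment above, while the 26 mel bands and 13 coefficients are assumed, commonly used defaults.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """Minimal MFCC sketch: frame -> window -> power spectrum -> mel -> log -> DCT."""
    # Frame the signal and apply a Hann window
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([signal[i * hop: i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Power spectrum of each frame
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Triangular mel filterbank, equally spaced on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):
            fbank[m - 1, k] = (k - left) / max(centre - left, 1)
        for k in range(centre, right):
            fbank[m - 1, k] = (right - k) / max(right - centre, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)
    # DCT-II decorrelates the log-mel energies; keep the first n_mfcc coefficients
    n = np.arange(n_mels)
    dct = np.cos(np.pi / n_mels * (n[:, None] + 0.5) * np.arange(n_mfcc)[None, :])
    return log_mel @ dct

# Example: one second of a 440 Hz tone at 44.1 kHz
sr = 44100
t = np.arange(sr) / sr
coeffs = mfcc(np.sin(2 * np.pi * 440 * t), sr)
print(coeffs.shape)  # (171, 13): 171 frames x 13 coefficients
```

Every toolbox in the benchmark performs some variant of these steps, though the window function, filterbank shape, and liftering details differ between implementations, which is one reason their outputs (and run times) are not directly interchangeable.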

More information will be available in my upcoming paper “An Evaluation of Audio Feature Extraction Toolboxes” which will be published at DAFx-15 later this year.

Hoxton Owl

The Hoxton Owl is a programmable guitar effects pedal built around an ARM Cortex-M4 chip. The pedal is fully programmable, allowing users to create any custom patch they require.

Recently, I have been developing some basic patches for the Owl in C, making use of my DSP knowledge. They can be found in the Owl patch library – http://hoxtonowl.com/patch-library/.

The Owl is a stable, reliable and fun piece of hardware that gives users access to a practically unlimited range of effects, which can be bespoke designed for any specific application.

AES on Intelligent Music Production

Yesterday, the AES presented a workshop on Intelligent Music Production. The day started with a great discussion of the current state of the art in Intelligent Music Production, provided by Josh Reiss, with strong indications of where future research is heading. Hyunkook Lee presented some interesting work on 3D placement of sources in a mix, and on separating tracks based on the perceived inherent height of different frequency bands. Brecht De Man discussed his PhD work on subjective evaluation of music mixing: his path to understanding how people go about producing their preferred mix of a song, and how that mix is perceived by others.

Following this, Sean Enderby provided an energetic talk on the SAFE tools produced at BCU for attaching semantic terms to presets for a range of audio effect plugins. Alessandro Palladini from Music Group UK presented their current work on “Smart Audio Effects for Live Audio Mixing”, which included interesting work on side-chained, parameter-reduced effects, and new methods and tools for mix engineers in both studio and live music scenarios. Their research focuses on providing an intuitive set of tools that remain perceptually relevant. Alex Wilson presented his work on how participants mix a song in a very simplified mix simulation, and how the starting positions impact the final mix that participants produce.

Videos of all the presentations are available here: http://www.semanticaudio.co.uk/media/

Teaching Electronics

Today I was working down at Sutton High School, teaching basic electronics to high school pupils. They learnt how to wire up an Arduinitar, an Arduino-based electric synthesis guitar with analogue and digital control. This is part of Queen Mary's outreach programme.

More information on the Arduinitar is available here: http://www.eecs.qmul.ac.uk/~andrewm/arduinitar.html

Audio Feature Extraction

Over the past few weeks, I have been working on evaluating a range of audio feature extraction tools. When I first started this project, I thought everyone uses the same features, so this should be easy – I was wrong.

I evaluated ten different feature extraction tools:

  • Aubio
  • Essentia
  • jAudio
  • Librosa
  • LibXtract
  • Marsyas
  • Meyda
  • MIR Toolbox
  • Timbre Toolbox
  • YAAFE
I discovered that, from this set of toolboxes, only three features are common to all of them:
spectral centroid, spectral rolloff, and signal energy.
In fact, of over 250 unique features, just 30 are present in more than half of the feature extraction tools.
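Those three universally shared features are all simple to compute, which may explain their ubiquity. As a rough sketch, here is how each can be calculated for a single windowed frame in plain NumPy. This is an illustrative implementation, not code from any of the toolboxes above; in particular, the 85% rolloff threshold is an assumed, commonly used default, and individual toolboxes differ in windowing and normalisation details.

```python
import numpy as np

def frame_features(frame, sr, rolloff_pct=0.85):
    """Spectral centroid, spectral rolloff, and energy for one audio frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / sr)
    # Spectral centroid: magnitude-weighted mean frequency
    centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-10)
    # Spectral rolloff: frequency below which rolloff_pct of spectral energy lies
    cumulative = np.cumsum(spectrum ** 2)
    rolloff = freqs[np.searchsorted(cumulative, rolloff_pct * cumulative[-1])]
    # Signal energy: sum of squared samples of the raw (unwindowed) frame
    energy = np.sum(frame ** 2)
    return centroid, rolloff, energy

# Example: a 512-sample frame of a 1 kHz tone at 44.1 kHz
sr = 44100
t = np.arange(512) / sr
c, r, e = frame_features(np.sin(2 * np.pi * 1000 * t), sr)
```

For the 1 kHz test tone, both the centroid and the rolloff land near 1 kHz, as expected for a single dominant spectral peak. Even for features this simple, different toolboxes make different choices (magnitude vs. power weighting for the centroid, rolloff threshold, windowed vs. raw energy), so values are only directly comparable within a single tool.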