Category Archives: Programming

AES61 Audio for Games

The 61st International Conference of the Audio Engineering Society, on Audio for Games, took place in London from 10 to 12 February. This was the fifth edition of the Audio for Games conference, which features a mixture of invited talks and academic paper sessions. Traditionally a biennial event, the conference was organised again in 2016 by popular demand, following a very successful fourth edition in 2015.

Christian Heinrichs presented work from his doctoral research with Andrew McPherson, discussing Digital Foley and introducing FoleyDesigner, which allows for effectively using human gestures to control sound effects models.

I presented a paper on weapon sound synthesis in the Synthesis and Sound Design paper session, and my colleague William Wilkinson presented work on mammalian growls; both papers can be found in the conference proceedings.

Furthermore, Xavier Serra and Frederic Font presented the Audio Commons project, and how the creative industries could benefit from and get access to content with liberal licenses.

As well as presenting work at this conference, I was involved as the technical coordinator and webmaster for the Audio for Games community.

More information about the conference can be found on the conference website.

DAFx Awards

During the DAFx conference dinner, awards for the best papers were announced, including a number of honourable mentions and a second-place award, as posted on the DAFx website.

DAFx Day 2

Photo by Jørn Adde © Trondheim kommune

Day two of the DAFx conference at NTNU in Trondheim opened with Marije Baalman's keynote on the range of hardware and software audio effects and synthesisers available to artists, and how different artists utilise them. The talk focused primarily on the small embedded systems that artists use, such as the Arduino, BeagleBone Black and Raspberry Pi. Later in the day, some excellent work was presented, including:

Audio Feature Extraction Toolboxes

The features available within ten audio feature extraction toolboxes are surveyed, and a list of all unique features is compiled. Each tool is compared against this total list of unique features, and is also evaluated on its coverage of the MPEG-7 and Cuidado standard feature sets. Since the relative importance of audio features is heavily context-dependent, comparing each toolbox's compliance with the MPEG-7 and Cuidado standards provides a meaningful measure. The results can be seen below.
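The coverage comparison described above amounts to a set intersection between each toolbox's feature inventory and a standard's feature list. Here is a minimal sketch of that calculation; the feature names and inventories below are hypothetical placeholders for illustration, not the actual lists from the survey.

```python
# Hypothetical feature inventories (placeholders, not the real survey data).
toolbox_features = {
    "Essentia": {"mfcc", "spectral_centroid", "spectral_rolloff",
                 "energy", "pitch"},
    "Aubio": {"mfcc", "spectral_centroid", "energy"},
}

# Hypothetical stand-in for a standard feature set such as MPEG-7.
mpeg7 = {"spectral_centroid", "mfcc", "energy", "harmonicity"}

def coverage(features, standard):
    """Percentage of the standard's features that a toolbox implements."""
    return 100.0 * len(features & standard) / len(standard)

for name, feats in toolbox_features.items():
    print(f"{name}: {coverage(feats, mpeg7):.2f}% of standard")
```

The same function, applied with the union of all toolboxes' features as the "standard", yields the total-unique-features coverage column.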

Toolbox         Total No. Features   Features in MPEG-7   Features in Cuidado
YAAFE           10.45%               37.50%               44.44%
MIRToolbox      20.56%               87.50%               85.19%
Essentia        52.26%               100.00%              94.44%
LibXtract       21.60%               87.50%               74.07%
Meyda           6.27%                37.50%               20.37%
Librosa         11.50%               37.50%               35.19%
Marsyas         5.23%                25.00%               18.52%
jAudio          13.94%               31.25%               35.19%
TimbreToolbox   8.71%                37.50%               74.07%
Aubio           3.83%                31.25%               18.52%


The accuracy of these audio features is also evaluated.

Further information and detailed analyses will be presented in my upcoming paper:

David Moffat, David Ronan and Joshua D. Reiss, “An Evaluation of Audio Feature Extraction Toolboxes,” In Proc. 18th International Conference on Digital Audio Effects (DAFx-15), November 2015, to appear.


Audio Feature Extraction Toolboxes

I have recently been working on the evaluation of audio feature extraction toolboxes, and have had a paper accepted to DAFx on the subject. While there is a range of ways to analyse and evaluate each feature extraction toolbox, computational time can be an effective evaluation metric, especially as people within the MIR community look at larger and larger data sets. 16.5 hours of audio (8.79 GB) was analysed, and MFCCs were extracted using eight different feature extraction toolboxes. The computation time for every toolbox was captured, and can be seen in the graph below.

Toolbox       Time (s)
Aubio         742
Essentia      252
jAudio        840
Librosa       3216
LibXtract     395
Marsyas       526
MIR Toolbox   1868
YAAFE         211
MFCCs were used as they are a computational method that exists within nine of the ten toolboxes, and so should provide a good basis for comparing computational efficiency. The MFCCs were all calculated with a 512-sample window size and a 256-sample hop size. The input audio was at a variety of sample rates and bit depths, to ensure that each feature extraction tool can handle variable input file formats. The test was run on a MacBook Pro with a 2.9 GHz i7 processor and 8 GB of RAM.
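The benchmark methodology can be sketched as follows: frame the signal with the same 512-sample window and 256-sample hop, time a per-frame analysis, and record the elapsed wall-clock time. To keep the sketch self-contained, a plain magnitude spectrum stands in for the toolboxes' actual MFCC implementations.

```python
import time
import numpy as np

def frame_count(n_samples, win=512, hop=256):
    """Number of full analysis frames for a signal of n_samples."""
    return 1 + (n_samples - win) // hop

def timed_analysis(x, win=512, hop=256):
    """Time a simple framed spectral analysis (stand-in for MFCCs)."""
    start = time.perf_counter()
    frames = [np.abs(np.fft.rfft(x[i:i + win]))
              for i in range(0, len(x) - win + 1, hop)]
    elapsed = time.perf_counter() - start
    return elapsed, len(frames)

x = np.random.randn(8000)          # one second of noise at 8 kHz
elapsed, n = timed_analysis(x)
print(f"{n} frames analysed in {elapsed:.4f} s")
```

In the actual benchmark, the call inside the timer would be each toolbox's own MFCC routine (e.g. a library's MFCC function configured with the same window and hop), run over the full 16.5-hour data set.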

More information will be available in my upcoming paper “An Evaluation of Audio Feature Extraction Toolboxes” which will be published at DAFx-15 later this year.

Hoxton Owl

The Hoxton Owl is a programmable guitar effects pedal built around an ARM Cortex-M4 chip. The pedal is fully programmable, allowing users to create any custom patch they require.

Recently, I have been developing some basic patches in C for the Owl, drawing on my DSP knowledge; they can be found in the Owl patch library.

The Owl is a stable, reliable and fun piece of hardware that gives users an effectively unlimited range of effects, each of which can be bespoke-designed for a specific application.

Audio Feature Extraction

Over the past few weeks, I have been working on evaluating a range of audio feature extraction tools. When I first started this project, I thought everyone used the same features, so this would be easy. I was wrong.

I evaluated ten different feature extraction tools:

  • Aubio
  • Essentia
  • jAudio
  • Librosa
  • LibXtract
  • Marsyas
  • Meyda
  • MIR Toolbox
  • Timbre Toolbox
  • YAAFE
I discovered that, from this set of toolboxes, only three features are common to all of them:
spectral centroid, spectral rolloff and signal energy.
In fact, of over 250 unique features, just 30 are present in more than half of the feature extraction tools.
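Those three shared features are straightforward to compute directly. The sketch below uses the common textbook definitions (individual toolboxes may differ in windowing, normalisation, or the rolloff percentage): the centroid is the magnitude-weighted mean frequency, the rolloff is the frequency below which 85% of the spectral power lies, and energy is the sum of squared samples.

```python
import numpy as np

def spectral_centroid(x, sr):
    """Magnitude-weighted mean frequency of one frame."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    return np.sum(freqs * mag) / np.sum(mag)

def spectral_rolloff(x, sr, pct=0.85):
    """Frequency below which pct of the spectral power lies."""
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
    cum = np.cumsum(power)
    return freqs[np.searchsorted(cum, pct * cum[-1])]

def energy(x):
    """Signal energy: sum of squared samples."""
    return np.sum(x ** 2)

# A 1 kHz sine at 8 kHz sample rate over 512 samples falls exactly
# on FFT bin 64, so all three features have known values.
sr, n = 8000, 512
t = np.arange(n) / sr
x = np.sin(2 * np.pi * 1000 * t)
print(spectral_centroid(x, sr), spectral_rolloff(x, sr), energy(x))
```

For a pure tone on an exact bin, the centroid and rolloff both sit at the tone's frequency, and the energy of a unit-amplitude sine over whole periods is n/2.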