Category Archives: Research

Dereverberation

My MSc project, on dereverberation applied to microphone bleed reduction, has been accepted for publication.

I implemented existing research in reverb removal and combined it with a method for microphone interference reduction. In any multiple-source environment there will be interference between microphones, as pictured below.

Research at Queen Mary University allows this interference to be reduced in real time, and my project was to improve on this by additionally removing natural acoustic reverberation in real time, to assist with the microphone bleed reduction.

This work will be published at the AES conference on DREAMS (Dereverberation and Reverberation of Audio, Music, and Speech).

David Moffat and Joshua D. Reiss. “Dereverberation and its application to the blind source separation problem”. In Proc. Audio Engineering Society Conference: 60th International Conference: DREAMS (Dereverberation and Reverberation of Audio, Music, and Speech). Audio Engineering Society, 2016. to appear.

Audio Feature Extraction Toolboxes

The features available within ten audio feature extraction toolboxes are presented, and a list of unique features is compiled. Each tool is then compared against this total list of unique features, and also evaluated on its feature coverage relative to the MPEG-7 and Cuidado standard feature sets. The relative importance of audio features is heavily context-dependent, so to provide a meaningful measure, each toolbox is compared on its compliance with the MPEG-7 and Cuidado standards. The results can be seen below.

Toolbox          % of unique features   MPEG-7 coverage   Cuidado coverage
YAAFE            10.45%                 37.50%            44.44%
MIRToolbox       20.56%                 87.50%            85.19%
Essentia         52.26%                 100.00%           94.44%
LibXtract        21.60%                 87.50%            74.07%
Meyda            6.27%                  37.50%            20.37%
Librosa          11.50%                 37.50%            35.19%
Marsyas          5.23%                  25.00%            18.52%
jAudio           13.94%                 31.25%            35.19%
TimbreToolbox    8.71%                  37.50%            74.07%
Aubio            3.83%                  31.25%            18.52%
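As a rough illustration of how these coverage figures are derived, the sketch below computes coverage as the proportion of a reference feature set that a toolbox implements. The feature names are hypothetical placeholders, not the actual MPEG-7, Cuidado, or toolbox feature lists used in the paper.

```python
# Sketch of the coverage calculation described above.
# Feature names are illustrative placeholders only.

def coverage(toolbox_features, reference_features):
    """Percentage of the reference feature set implemented by a toolbox."""
    implemented = set(toolbox_features) & set(reference_features)
    return 100.0 * len(implemented) / len(reference_features)

mpeg7 = {"spectral_centroid", "spectral_spread", "spectral_flatness",
         "harmonic_ratio", "log_attack_time", "audio_power"}

example_toolbox = {"spectral_centroid", "spectral_flatness", "mfcc", "zcr"}

print(f"MPEG-7 coverage: {coverage(example_toolbox, mpeg7):.2f}%")
```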

 

The accuracy of these audio features is presented here: https://github.com/craffel/mir_eval

Further information and detailed analyses will be presented in my upcoming paper:

David Moffat, David Ronan and Joshua D. Reiss, “An Evaluation of Audio Feature Extraction Toolboxes,” In Proc. 18th International Conference on Digital Audio Effects (DAFx-15), November 2015, to appear.

 

Audio Feature Extraction Toolboxes

I have recently been working on the evaluation of audio feature extraction toolboxes, and I have had a paper accepted to DAFx on the subject. While there are a range of ways to analyse and evaluate each feature extraction toolbox, computation time can be an effective evaluation metric, especially as the MIR community looks at larger and larger data sets. 16.5 hours of audio (8.79 GB) was analysed, and the MFCCs extracted using eight different feature extraction toolboxes. The computation time for each toolbox was captured, and can be seen in the table below.

Toolbox       Time (s)
Aubio         742
Essentia      252
jAudio        840
Librosa       3216
LibXtract     395
Marsyas       526
MIR Toolbox   1868
YAAFE         211
MFCCs were used as they are implemented in nine of the ten toolboxes, and so should provide a good basis for comparing computational efficiency. The MFCCs were all calculated with a 512-sample window size and a 256-sample hop size. The input audio was at a variety of sample rates and bit depths, to confirm that each feature extraction tool can handle variable input file formats. The test was run on a MacBook Pro with a 2.9 GHz i7 processor and 8 GB of RAM.
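As a minimal sketch of how such a timing measurement could be made with one of the benchmarked toolboxes (librosa), the snippet below extracts MFCCs with the window and hop sizes stated above and times the call. The audio file path is a placeholder; this is not the actual benchmarking script used for the paper.

```python
# Minimal sketch: timing MFCC extraction with librosa.
# "example.wav" is a placeholder input file.
import time
import librosa

y, sr = librosa.load("example.wav", sr=None)  # keep the file's native sample rate

start = time.time()
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_fft=512, hop_length=256)
elapsed = time.time() - start

print(f"Extracted {mfccs.shape[1]} MFCC frames in {elapsed:.2f} s")
```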

More information will be available in my upcoming paper “An Evaluation of Audio Feature Extraction Toolboxes” which will be published at DAFx-15 later this year.

AES Workshop on Intelligent Music Production

The 8th September 2015 sees the Audio Engineering Society UK Midlands Section presenting a workshop on Intelligent Music Production at Birmingham City University.

As ever, C4DM have a strong presence at this workshop, as two of the six presented talks are by current C4DM members. Ryan Stables, the event organiser, and others at the Digital Media Technology (DMT) Lab in Birmingham City University are currently collaborating with C4DM on the Semantic Audio Feature Extraction (SAFE) project. More information on this project can be found here

Josh Reiss will present a summary of the current state of the art in Intelligent Music Production, highlighting current research directions and the implications of this technology. Brecht De Man will present some of his PhD results in perceptual evaluation of music production as he attempts to understand how mix engineers carry out their work. Further to this, Alex Wilson, previously a visiting student at C4DM for six months, will be presenting his recently published work from the Sound and Music Computing Conference on navigating the mix space.

More information on the workshop, including abstracts and registration, can be found here http://www.aes-uk.org/forthcoming-meetings/aes-midlands-workshop-on-intelligent-music-production/.

Listening In The Wild

Today, 28th August 2015, C4DM presented a one-day workshop entitled Listening In The Wild, organised by Dan Stowell, Bob Sturm and Emmanouil Benetos.

The morning session presented a range of research, including sound event detection using NMF and DTW techniques, understanding detectability variations across species and habitats, and animal vocalisation synthesis through probabilistic models.

The post-lunch session saw discussion of vocal modelling and analysis, working towards understanding how animals produce their characteristic sounds. This was followed by further discussion of NMF, and then work on using bird song as part of a musical composition.

The poster session included work on auditory scene analysis, bird population vocalisation variations, CHiME: a sound source recognition dataset, technology assisted animal population size measures, bird identification through the use of identity vectors, and DTW for bird song dissimilarity.

Further information on the presenters and posters is available here

Upcoming Events

There are a range of interesting and exciting upcoming events in the field of audio technology, including:

Listening in the Wild – A machine listening workshop hosted at Queen Mary University on the 25th of June. This will discuss how animals and machines can listen to complex soundscapes. More information here: http://www.eecs.qmul.ac.uk/events/view/listening-in-the-wild-animal-and-machine-hearing-in-multisource-environment

Intelligent Music Production – A workshop presented at Birmingham City University on the 8th September on the current state of the art in audio production technology, perception and future implications. Details are here: http://www.aes-uk.org/forthcoming-meetings/aes-midlands-workshop-on-intelligent-music-production/

Both of these events are free to attend, and promise to look very exciting indeed.

Visiting Researcher

Over the past month, I have been working closely with visiting researcher Luca Turchet [http://www.lucaturchet.it/].

We have been working on perceptual evaluation of synthesised footstep sounds. In the experiment we ran, participants put on shoes with sensors mounted in them. The sound of different floor surfaces and shoe types is then synthesised and played back through noise-blocking headphones, and the participants are asked to shape its spectral content with the aid of some very basic audio filters.

The intended outcome is to identify the extent to which different participants will vary the spectral characteristics of their footsteps.
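As a rough illustration of the kind of basic spectral shaping involved, the sketch below applies a single low-pass filter whose cutoff a listener might adjust. The file names, cutoff value, and filter choice are placeholder assumptions for illustration; this is not the synthesis or filtering code used in the experiment.

```python
# Illustrative sketch: shaping the spectral content of a footstep sample
# with a simple participant-adjustable low-pass filter.
import numpy as np
from scipy.signal import butter, lfilter
from scipy.io import wavfile

sr, footstep = wavfile.read("footstep.wav")      # placeholder input file
footstep = footstep.astype(np.float64)

cutoff_hz = 2000.0                               # participant-adjustable parameter
b, a = butter(2, cutoff_hz / (sr / 2), btype="low")
shaped = lfilter(b, a, footstep, axis=0)         # filter along time axis

wavfile.write("footstep_shaped.wav", sr,
              np.clip(shaped, -32768, 32767).astype(np.int16))
```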

Further updates on this research to follow.

Publication

It was announced today that my paper “Web Audio Evaluation Tool: A Browser-Based Listening Test Environment” has been accepted for the Sound and Music Computing Conference taking place in August.

The paper presents a Web Audio API based tool that provides users with a simple interface for constructing and running perceptual audio evaluation experiments. Being browser based, it requires no proprietary software and is accessible to anyone with a computer. It can be run locally with no internet connection, or hosted on a server, in which case the XML results are posted back to that server.
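As a minimal sketch of the server side of that hosted workflow, the snippet below accepts posted XML results and writes each submission to disk. The route, port, and file-naming scheme are assumptions for illustration, not the tool's actual interface.

```python
# Minimal sketch: an endpoint that receives XML results posted by a
# browser-based listening test and saves each submission to a file.
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

class ResultHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        xml_body = self.rfile.read(length)
        filename = f"result_{int(time.time())}.xml"   # one file per submission
        with open(filename, "wb") as f:
            f.write(xml_body)
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"saved")

if __name__ == "__main__":
    HTTPServer(("", 8000), ResultHandler).serve_forever()
```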

This framework will give me an infrastructure on which to base all of the perceptual evaluation experiments I will be undertaking in the next month or so.