Category Archives: Paper

AES61 Audio for Games

The 61st International Conference of the Audio Engineering Society, on Audio for Games, took place in London from 10 to 12 February. This was the fifth edition of the Audio for Games conference, which features a mixture of invited talks and academic paper sessions. Traditionally a biennial event, the conference was organised again in 2016 by popular demand, following a very successful fourth edition in 2015.

Christian Heinrichs presented work from his doctoral research with Andrew McPherson, discussing Digital Foley and introducing FoleyDesigner, which allows for effectively using human gestures to control sound effects models.

I presented a paper on weapon sound synthesis in the Synthesis and Sound Design paper session, and my colleague William Wilkinson presented work on mammalian growls; both papers can be found in the conference proceedings.

Furthermore, Xavier Serra and Frederic Font presented the Audio Commons project, and how the creative industries could benefit from, and get access to, content with liberal licences.

Along with presenting work at this conference, I was also involved as the technical coordinator and webmaster for the Audio for Games community.

More information about the conference can be found on the conference website.

DAFx Awards

During the DAFx conference dinner, the awards for the best papers were announced.

Honourable Mentions:

Second Place

Winner

As posted on the DAFx website – http://www.ntnu.edu/dafx15/

DAFx Day 3

Day three of the Digital Audio Effects Conference (DAFx15) began with an excellent introduction to and summary of wave digital filters and digital waveguides by Kurt Werner and Julius O. Smith from CCRMA, in which the current state of the art in physical modelling of nonlinearities was presented and some potential avenues for future exploration were discussed. Further work following on from this was then presented.

DAFx Day 2

Photo from http://www.ntnu.edu/web/dafx15/dafx15 by Jørn Adde © Trondheim kommune

Day two of the DAFx conference at NTNU in Trondheim opened with Marije Baalman's keynote on the range of hardware and software audio effects and synthesisers available to artists, and how different artists utilise these effects. The talk focused primarily on the small embedded systems that artists use, such as the Arduino, BeagleBone Black and Raspberry Pi. Later in the day, some excellent work was presented, including:

DAFx Conference 2015

The DAFx conference began with a tutorial day, where Peter Svensson provided a fantastic summary of the State of the Art in sound field propagation modelling and virtual acoustics.

Slide from DAFx 15 Day 1

During lunch, as it was getting dark, the snow started, which unfortunately blocked our view of the Northern Lights that afternoon. Øyvind Brandtsegg and Trond Engum then discussed cross-adaptive digital audio effects and their creative use in live performance. They referenced existing work at Queen Mary as part of the state of the art, and then presented NTNU's current work on cross-adaptive audio effects. The workshop day was rounded off with Xavier Serra discussing the Audio Commons project and the use of open audio content.

 

The 139th Convention of the Audio Engineering Society in New York City

The weekend saw the 139th Convention of the Audio Engineering Society at the Javits Convention Center in New York City. The annual American AES Convention is the world's main event for all things audio, spanning a wide range of topics including loudspeaker design, music production, hearing aids, game audio and perception, and featuring a huge trade show, in contrast to its less industry-heavy annual European counterpart.

A handful of C4DM delegates (Joshua D. Reiss, György Fazekas, Thomas Wilmering, David Moffat, David Ronan, and Brecht De Man) were each involved in multiple sessions.

Papers

T. Wilmering, G. Fazekas, A. Allik and M. B. Sandler, “Audio Effects Data on the Semantic Web” [Download paper]

D. Ronan, B. De Man, H. Gunes and J. D. Reiss, “The Impact of Subgrouping Practices on the Perception of Multitrack Music Mixes” [Download paper]

Dave Ronan also presented, at the Student Design Exhibition, a physical model of a sitar based on a dynamic delay line and the Karplus-Strong model.
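The Karplus-Strong algorithm underpinning such a model can be sketched in a few lines. This is a generic illustration of the technique with arbitrary parameters, not the actual sitar design presented at the exhibition:

```python
import numpy as np

def karplus_strong(freq=110.0, sr=44100, dur=1.0, decay=0.996):
    """Basic Karplus-Strong plucked-string synthesis."""
    n = int(sr / freq)                      # delay-line length sets the pitch
    delay = np.random.uniform(-1, 1, n)     # noise burst excitation
    out = np.empty(int(sr * dur))
    for i in range(len(out)):
        out[i] = delay[i % n]
        # averaging lowpass filter in the feedback loop damps high frequencies,
        # so the initial noise decays into a string-like tone
        delay[i % n] = decay * 0.5 * (delay[i % n] + delay[(i + 1) % n])
    return out

tone = karplus_strong()
```

A "dynamic" delay line, as mentioned above, would additionally vary the delay length over time to bend the pitch.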

Workshops and tutorials

Workshop W20: “Perceptual Evaluation of High Resolution Audio” (Joshua D. Reiss (chair), Bob Katz, George Massenburg and Bob Schulein)

Tutorial T21: “Advances in Semantic Audio and Intelligent Music Production” (Ryan Stables (chair), Joshua D. Reiss, Brecht De Man and Thomas Wilmering)

Workshop W26: “Application of Semantic Audio Analysis to the Music Production Workflow” (György Fazekas (co-chair), Ryan Stables (co-chair), Jay LeBoeuf and Bryan Pardo)

Other events

Brecht De Man and Dave Moffat were responsible for organising the entire Student and Career Development track as Chair and Vice Chair of the Student Delegate Assembly (Europe and International Regions). These events included a student party (this edition at NYU's James L. Dolan Music Recording Studio), the Student Recording Competition, the Student Design Competition, and a very successful edition of the Education and Career Fair.

Dave Ronan represented Queen Mary at the latter, discussing the various taught and research courses with an emphasis on the new MSc in Sound and Music Computing and handing out a lot of QM swag.

Committees

High Resolution Audio Technical Committee: Josh

Semantic Audio Analysis Technical Committee: György and Thomas

Education Committee: Dave Moffat and Brecht

Josh also serves as a member of the Board of Governors of the AES.


Upcoming AES events with a C4DM presence

AES UK Analogue Compression – Theory and Practice at British Grove Studios, London, UK (12 November 2015) Members only
Organised by Brecht and 2014-2015 MSc student Charlie Slee

AES UK Audio Signal Processing with E-Textiles at Anglia Ruskin University, Cambridge, UK (26 November 2015)
By Becky Stewart (PhD graduate and visiting lecturer)

60th Conference on Dereverberation and Reverberation of Audio, Music, and Speech (DREAMS) in Leuven, Belgium (3-5 February 2016)
Several C4DM papers, including:
David Moffat and Joshua D. Reiss. “Dereverberation and its application to the blind source separation problem”. In Proc. Audio Engineering Society Conference: 60th International Conference: DREAMS (Dereverberation and Reverberation of Audio, Music, and Speech). Audio Engineering Society, February 2016.

61st Conference on Audio for Games in London, UK (10-12 February 2016)
Brecht and Dave on committee; C4DM papers submitted

140th Convention of the Audio Engineering Society in Paris, France (4-7 June 2016)
If you are attending as a student (undergraduate, master, PhD), please get in touch with Brecht or Dave, and consider submitting a project to the Student Design Competition or Student Recording Competition to receive feedback from industry experts and prizes.


For any questions about the Audio Engineering Society regarding e.g. membership, publications, and local events, please contact Brecht (Chair of the Student Delegate Assembly, Chair of the London UK Student Section, and Committee Member of the British Section) or Dave (Vice Chair of the Student Delegate Assembly).

Dereverberation

My MSc project, on dereverberation applied to microphone bleed reduction, has been accepted for publication.

I implemented existing research in reverb removal and combined it with a method for microphone interference reduction. In any environment with multiple sources, there will be interference between the opposing microphones, as pictured below.

Research at Queen Mary University of London allows this interference to be reduced in real time; my project was to improve on this by additionally removing natural acoustic reverberation in real time, to assist with the microphone bleed reduction.
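A toy illustration of the bleed problem, assuming instantaneous (non-reverberant) mixing with a known, invented mixing matrix; the actual research addresses the much harder case where the bleed paths are reverberant and unknown:

```python
import numpy as np

t = np.arange(0, 1, 1 / 8000)
s1 = np.sin(2 * np.pi * 220 * t)   # source 1 (e.g. vocalist)
s2 = np.sin(2 * np.pi * 330 * t)   # source 2 (e.g. guitar)

# each microphone captures its own source at full level plus some
# bleed from the other source (off-diagonal terms, values made up)
A = np.array([[1.0, 0.3],
              [0.25, 1.0]])
mics = A @ np.vstack([s1, s2])

# with an instantaneous, known mixing matrix, inverting it removes the bleed;
# reverberation turns each entry of A into a filter, breaking this simple fix
recovered = np.linalg.inv(A) @ mics
```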

This work will be published at the AES conference on DREAMS (Dereverberation and Reverberation of Audio Music and Speech).

David Moffat and Joshua D. Reiss. “Dereverberation and its application to the blind source separation problem”. In Proc. Audio Engineering Society Conference: 60th International Conference: DREAMS (Dereverberation and Reverberation of Audio, Music, and Speech). Audio Engineering Society, February 2016, to appear.

Audio Feature Extraction Toolboxes

The features available within ten audio feature extraction toolboxes are surveyed, and a combined list of unique features is compiled. Each tool is compared against this total list, and also evaluated on its feature coverage with respect to the MPEG-7 and Cuidado standard feature sets. The relative importance of audio features is heavily context-dependent, so comparing each toolbox's compliance with the MPEG-7 and Cuidado standards provides a more meaningful measure. The results can be seen below.

Toolbox         Total No. Features   Features in MPEG-7   Features in Cuidado
YAAFE           10.45%               37.50%               44.44%
MIRToolbox      20.56%               87.50%               85.19%
Essentia        52.26%               100.00%              94.44%
LibXtract       21.60%               87.50%               74.07%
Meyda            6.27%               37.50%               20.37%
Librosa         11.50%               37.50%               35.19%
Marsyas          5.23%               25.00%               18.52%
jAudio          13.94%               31.25%               35.19%
TimbreToolbox    8.71%               37.50%               74.07%
Aubio            3.83%               31.25%               18.52%
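The coverage figures above amount to a set intersection between each toolbox's feature list and a standard's feature list. A minimal sketch, using hypothetical feature names rather than the real MPEG-7 or Cuidado sets:

```python
# hypothetical feature name lists, for illustration only
standard = {"spectral centroid", "spectral spread", "harmonic ratio", "audio power"}
toolbox = {"spectral centroid", "mfcc", "zero crossing rate", "audio power"}

# coverage: what fraction of the standard's features the toolbox implements
coverage = len(toolbox & standard) / len(standard) * 100
print(f"{coverage:.2f}%")  # prints 50.00%
```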

 

The accuracy of these audio features is presented here: https://github.com/craffel/mir_eval

Further information and detailed analyses will be presented in my upcoming paper:

David Moffat, David Ronan and Joshua D. Reiss, “An Evaluation of Audio Feature Extraction Toolboxes,” In Proc. 18th International Conference on Digital Audio Effects (DAFx-15), November 2015, to appear.

 

Audio Feature Extraction Toolboxes

I have recently been working on an evaluation of audio feature extraction toolboxes, and have had a paper on the subject accepted to DAFx. While there is a range of ways to analyse each feature extraction toolbox, computation time can be an effective evaluation metric, especially as people within the MIR community look at larger and larger data sets. 16.5 hours of audio (8.79 GB) was analysed, and the MFCCs extracted, using eight different feature extraction toolboxes. The computation time for each toolbox was captured, and can be seen below.

Toolbox      Time (s)
Aubio           742
Essentia        252
jAudio          840
Librosa        3216
LibXtract       395
Marsyas         526
MIRToolbox     1868
YAAFE           211
MFCCs were used as they are a feature that exists within nine of the ten given toolboxes, and so should provide a good basis for comparing computational efficiency. The MFCCs were all calculated with a 512-sample window size and a 256-sample hop size. The input audio was at a variety of sample rates and bit depths, to confirm that each feature extraction tool could handle variable input file formats. The test was run on a MacBook Pro with a 2.9 GHz i7 processor and 8 GB of RAM.
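For reference, the MFCC pipeline that these toolboxes implement (frame, window, power spectrum, mel filterbank, log, DCT) can be sketched directly in NumPy. This is a simplified illustration with the window and hop sizes used in the test, not the implementation of any particular toolbox:

```python
import numpy as np

def mfcc(signal, sr=44100, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """Minimal MFCC: frame -> window -> power spectrum -> mel -> log -> DCT."""
    # frame the signal and apply a Hann window
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] for i in range(n_frames)])
    frames = frames * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

    # triangular mel filterbank, equally spaced on the mel scale
    hz_to_mel = lambda f: 2595 * np.log10(1 + f / 700)
    mel_to_hz = lambda m: 700 * (10 ** (m / 2595) - 1)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    mel_energy = np.log(power @ fbank.T + 1e-10)

    # DCT-II decorrelates the log mel energies; keep the first n_mfcc coefficients
    k = np.arange(n_mels)
    dct = np.cos(np.pi / n_mels * (k + 0.5)[None, :] * np.arange(n_mfcc)[:, None])
    return mel_energy @ dct.T

sig = np.sin(2 * np.pi * 440 * np.arange(44100) / 44100)  # 1 s test tone
coeffs = mfcc(sig)  # one 13-coefficient vector per frame
```

Differences in exactly how each toolbox implements these stages (filterbank shape, normalisation, liftering) are part of why their outputs, and not just their run times, can differ.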

More information will be available in my upcoming paper “An Evaluation of Audio Feature Extraction Toolboxes” which will be published at DAFx-15 later this year.

Publication

It was announced today that my paper “Web Audio Evaluation Tool: A Browser-Based Listening Test Environment” has been accepted for the Sound and Music Computing Conference taking place in August.

The paper presents a Web Audio API based tool that provides users with a simple interface for constructing and running perceptual audio evaluation experiments. Because it is browser based, there is no requirement for proprietary software, and it is accessible to any individual with a computer. It can be run locally with no internet connection, or hosted on a server, in which case it posts its XML output back to the server.
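On the server side, collecting the results then reduces to parsing the posted XML. A minimal sketch, using a made-up result schema purely for illustration (the tool defines its own output format):

```python
import xml.etree.ElementTree as ET

# hypothetical XML result document; the real schema is defined by the tool
sample = """<result>
  <rating fragment="A">0.8</rating>
  <rating fragment="B">0.3</rating>
</result>"""

def parse_results(xml_text):
    """Map each rated fragment to its numeric rating."""
    root = ET.fromstring(xml_text)
    return {r.get("fragment"): float(r.text) for r in root.findall("rating")}

ratings = parse_results(sample)
```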

This framework will give me an infrastructure on which to base all of the perceptual evaluation experiments I will be undertaking in the next month or so.