Category Archives: Paper

Digital Audio Effects Conference 2018 (DAFx18)

Part of C4DM recently attended the 21st DAFx conference, hosted in ‘the Venice of Portugal’, Aveiro. Aveiro is an excellent and picturesque town in which to host a conference, and everyone there enjoyed it.

The conference started with a day of tutorials, all led by experts in their respective fields.
Julian Storer from ROLI presented a discussion on the use of his JUCE framework for developing plugins and DSP tools. The JUCE framework is continually growing, based on the needs of its user base and the internal processes of ROLI. Shahan Nercessian from iZotope then presented a whirlwind tour of machine learning tools and techniques and how they can be applied to audio. Much of the advice was derived from the image processing field, and there was a heavy focus on deep learning and how it can be used. A tutorial on digital audio filter design was then presented by Vesa Välimäki of Aalto University, Helsinki. The field of digital filters is a significant one, described as “almost everything in audio, everything can be viewed as a filter”, yet he gave a clear and concise presentation of DSP from the fundamentals. Perception and virtual reality were then covered in Catarina Mendonça’s presentation, which discussed the cognitive factors relevant to VR audio.
Our own Josh Reiss opened the first day of the main conference with a keynote talk on disruptive innovation in audio production. This talk gave an overview of a series of work performed by the Audio Engineering Research Group that can influence, and has influenced, the field of audio production, from the development of Intelligent Audio Production to questioning the utility of High Resolution Audio.

The day continued with the academic track. Papers were presented on a range of topics, and our selected highlights are

Thursday saw “Confessions of a plug-in Junkie” by David Farmer, who rigorously presented his approach to how he uses and buys plugins. This gave a useful insight from a user perspective, which is often not fully considered. For example, a one-week free demo is useless, as nobody will download it unless they have an exact use case, which in a busy work environment is somewhat rare.

A range of academic work was then presented, including my paper

Yvan Grabit from Steinberg opened Friday with a keynote talk on the VST standard audio plugin format, which led into another day of inspiring academic work, including:

Training CNNs to Apply Linear Audio Effects
Modal Modelling of Room Impulse Responses
Perceptual Latent Space Control of Timbre through CNNs
The DAFx conference this year contained some high quality work, in an excellent location. DAFx will be hosted by Birmingham City University, UK, in 2019.

Objective Evaluation of Synthesised Environmental Sounds

Having recently attended DAFx for my fourth year, I presented my paper on Objective Evaluation of Synthesised Environmental Sounds.


The basic premise of the paper is that we can computationally measure how similar two sounds are using an objective metric. This objective metric can be evaluated using an iterative resynthesis approach, and a given similarity score can be evaluated through comparison to human perception.
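As a rough illustration of what an objective similarity metric can look like, the sketch below compares two signals by the distance between their average log-magnitude spectra. This is an assumption made purely for illustration and is not the metric used in the paper; the function names (`log_spectrum`, `spectral_distance`) and the frame parameters are hypothetical.

```python
import numpy as np

def log_spectrum(signal, frame_size=1024, hop=512):
    """Average log-magnitude spectrum over Hann-windowed frames."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_size:
        # Zero-pad short signals so at least one full frame exists.
        signal = np.pad(signal, (0, frame_size - len(signal)))
    window = np.hanning(frame_size)
    frames = [signal[i:i + frame_size] * window
              for i in range(0, len(signal) - frame_size + 1, hop)]
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    # Small constant avoids log(0) for silent bins.
    return np.log(magnitudes.mean(axis=0) + 1e-9)

def spectral_distance(a, b):
    """RMS difference between two average log spectra (0 means identical spectra)."""
    diff = log_spectrum(a) - log_spectrum(b)
    return float(np.sqrt(np.mean(diff ** 2)))

if __name__ == "__main__":
    sr = 16000  # hypothetical sample rate
    t = np.arange(sr) / sr
    reference = np.sin(2 * np.pi * 440 * t)      # stand-in for a recorded sample
    close_match = np.sin(2 * np.pi * 445 * t)    # stand-in for a good synthesis
    noise = np.random.default_rng(0).standard_normal(sr)  # stand-in for a poor one
    for name, candidate in [("reference", reference),
                            ("close match", close_match),
                            ("noise", noise)]:
        print(name, spectral_distance(reference, candidate))
```

A lower score means the candidate is spectrally closer to the reference. Whether such a score is meaningful still has to be checked against listening test results.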



I hope this made sense, but if not please get in touch and I would be happy to explain further. The paper will be available on the DAFx Website shortly.

Sound Synthesis – Are we there yet?

TL;DR. Yes

At the beginning of my PhD, I began to read the sound effect synthesis literature, and I quickly discovered that there was little to no standardisation or consistency in the evaluation of sound effect synthesis models, particularly in relation to the sounds they produce. Surely one of the most important aspects of a synthetic system is whether it can artificially produce a convincing replacement for what it is intended to synthesise. We could have the most intractable and relatable sound model in the world, but if it does not sound anything like it is intended to, then will any sound designers or end users ever use it?

There are many different methods for measuring how effective a sound synthesis model is. Jaffe proposed evaluating synthesis techniques for music based on ten criteria. However, only two of the ten criteria actually consider any sounds made by the synthesiser.

This is crazy! How can anyone know what synthesis method can produce a convincingly realistic sound?

So, we performed a formal evaluation study, where a range of different synthesis techniques were compared in a range of different situations. Some synthesis techniques are indistinguishable from a recorded sample, in a fixed medium environment. In short: yes, we are there yet. There are sound synthesis methods that sound more realistic than high quality recorded samples. But there is clearly so much more work to be done…

For more information, read the paper here

Propeller Sound Synthesis at Audio Mostly

Recently, at the Audio Mostly 2017 conference, my work with Rod Selfridge and Josh Reiss on Propeller Sound Synthesis was published. As well as being published at the conference, I was on the conference organising committee, as the webmaster and a member of the music team. More information is available here on the Intelligent Sound Engineering Blog, and an example of the propeller synthesis is available on YouTube.

Sound Effects Taxonomy

At the upcoming International Conference on Digital Audio Effects, I will be presenting my recent work on creating a sound effects taxonomy using unsupervised learning. A link to the paper can be found here.

A taxonomy of sound effects is useful for a range of reasons. Sound designers often spend considerable time searching for sound effects. Classically, sound effects are arranged based on keyword tagging and on what caused the sound to be created. For example, a recording of bacon cooking would have the name “BaconCook”, the tags “Bacon Cook, Sizzle, Open Pan, Food” and be placed in the category “cooking”. However, most sound designers know that the sound of frying bacon can sound very similar to the sound of rain (see this TED talk for more info), but rain is in an entirely different folder, in a different section of the SFx Library.

Our approach is to analyse the raw content of the audio files in the sound effects library, and allow a computer to determine which sounds are similar, based on the actual sonic content of the sound sample. As such, the sounds of rain and frying bacon will be placed much closer together, allowing a sound designer to quickly and easily find related sounds.
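To make the idea concrete, here is a minimal sketch of grouping sounds by content rather than by tag. It is not the feature set or the learning method from the paper: the three features, the simple agglomerative clustering, the function names (`content_features`, `group_by_content`) and the synthetic stand-ins for "rain" and "frying bacon" are all assumptions chosen for illustration.

```python
import numpy as np

def content_features(signal, sample_rate):
    """Describe a sound by spectral centroid, spectral flatness and zero-crossing rate."""
    signal = np.asarray(signal, dtype=float)
    power = np.abs(np.fft.rfft(signal)) ** 2 + 1e-12
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    centroid = np.sum(freqs * power) / np.sum(power)
    flatness = np.exp(np.mean(np.log(power))) / np.mean(power)
    zero_crossings = np.mean(np.abs(np.diff(np.sign(signal))) > 0)
    # Normalise the centroid by the Nyquist frequency so all features lie in [0, 1].
    return np.array([centroid / (sample_rate / 2), flatness, zero_crossings])

def group_by_content(features, n_groups):
    """Agglomerative clustering: repeatedly merge the two groups with the closest centroids."""
    features = np.asarray(features, dtype=float)
    groups = [[i] for i in range(len(features))]
    while len(groups) > n_groups:
        centroids = [features[g].mean(axis=0) for g in groups]
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                d = np.linalg.norm(centroids[a] - centroids[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups

if __name__ == "__main__":
    sr = 8000  # hypothetical sample rate
    t = np.arange(sr) / sr
    rng = np.random.default_rng(1)
    # Synthetic stand-ins, not real recordings.
    library = {
        "rain": rng.standard_normal(sr),
        "frying bacon": rng.standard_normal(sr) * (0.3 + 0.7 * np.abs(np.sin(2 * np.pi * 4 * t))),
        "tone": np.sin(2 * np.pi * 220 * t),
        "hum": np.sin(2 * np.pi * 110 * t) + 0.5 * np.sin(2 * np.pi * 220 * t),
    }
    names = list(library)
    feats = np.array([content_features(library[n], sr) for n in names])
    for group in group_by_content(feats, n_groups=2):
        print([names[i] for i in group])
```

With these stand-ins, the two noise-like sounds should end up in one group and the two tonal sounds in the other, regardless of what the files are called or how they are tagged.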

A full rundown of the work is available on the Intelligent Sound Engineering Blog.

High Resolution Audio

For as long as digital audio has existed, there have been discussions about sampling rate and bit depth. I have heard countless arguments between people about analogue vs. digital, 96 kHz vs. 44.1 kHz, and 24-bit vs. 16-bit.

After numerous experiments and publications, discussions and tests on the subject, we seem to be getting towards the truth. In the June AES Journal, a new meta-study on high resolution audio promises to identify what the biggest failings are in our experimental methods, how we can progress with research in this field and, finally, what the results of years of research in the field are.

Intelligent Sound Engineering Blog

AES Journal Paper (Open Access)

AES61 Audio for Games

The 61st International Conference of the Audio Engineering Society on Audio for Games took place in London from 10 to 12 February. This is the fifth edition of the Audio for Games conference, which features a mixture of invited talks and academic paper sessions. Traditionally a biennial event, the conference was organised again in 2016 by popular demand, following a very successful 4th edition in 2015.

Christian Heinrichs presented work from his doctoral research with Andrew McPherson, discussing Digital Foley and introducing FoleyDesigner, which allows for effectively using human gestures to control sound effects models.

I presented a paper on weapon sound synthesis in the Synthesis and Sound Design paper session, and my colleague William Wilkinson presented work on mammalian growls. Both papers can be found in the conference proceedings.

Furthermore, Xavier Serra and Frederic Font presented the Audio Commons project and how the creative industries could benefit from and get access to content with liberal licenses.

Along with presenting work at this conference, I was also involved as the technical coordinator and webmaster for the Audio for Games community.

More information about the conference can be found on the conference website.

DAFx Awards

During the DAFx conference dinner, awards for the best papers were announced. Honourable Mentions:

Second Place


As posted on the DAFx website –

DAFx Day 3

Day three of the Digital Audio Effects Conference (DAFx15) began with an excellent introduction and summary of Wave Digital Filters and Digital Waveguides by Kurt Werner and Julius O. Smith from CCRMA, in which the current state of the art in physical modelling of nonlinearities was presented and some potential avenues for future exploration were discussed. Following on from this work was discussed