At the beginning of my PhD, I began to read the sound effect synthesis literature, and I quickly discovered that there was little to no standardisation or consistency in evaluation of sound effect synthesis models – particularly in relations to the sounds they produce. Surely one of the most important aspects of a synthetic system, is whether it can artifically produce a convincing replacement for what it is intended to synthesize. We could have the most intractable and relatable sound model in the world, but if it does not sound anything like it is intended to, then will any sound designers or end users ever use it?
There are many different methods for measuring how effective a sound synthesis model is. Jaffe proposed evaluating synthesis techniques for music based on ten criteria. However, only two of the ten criteria actually consider any sounds made by the synthesiser.
This is crazy! How can anyone know what synthesis method can produce a convincingly realistic sound?
So, we performed a formal evaluation study, where a range of different synthesis techniques where compared in a range of different situations. Some synthesis techniques are indistinguishable from a recorded sample, in a fixed medium environment. In short – Yes, we are there yet. There are sound synthesis methods that sound more realistic than high quality recorded samples. But there is clearly so much more work to be done…
Recently, at the Audio Mostly 2017 conference, my work with Rod Selfridge and Josh Reiss was published on Propellor Sound Synthesis. I was both published at the conference, on the conference organising committee, as a the webmaster and a member of the music team. More information is available here on the Intelligent Sound Engineering Blog, and an example of the propellor synthesis is available on youtube.
A taxonomy of sound effects is useful for a range of reasons. Sound designers often spend considerable time searching for sound effects. Classically, sound effects are arranged based on some key word tagging, and based on what caused the sound to be created – such as bacon cooking would have the name “BaconCook”, the tags “Bacon Cook, Sizzle, Open Pan, Food” and be placed in the category “cooking”. However, most sound designers know that the sound of frying bacon can sound very similar to the sound of rain (See this TED talk for more info), but rain is in an entirely different folder, in a different section of the SFx Library.
Our approach, is to analyse the raw content of the audio files in the sound effects library, and allow a computer to determine which sounds are similar, based on the actual sonic content of the sound sample. As such, the sounds of rain and frying bacon will be placed much closer together, allowing a sound designer to quickly and easily find related sounds that relate to each other.
Both Rod Selfridge and I are honoured to win the Best Paper Award, at the Sound and Music Computing Conference for our Paper on “Real-time Physical Model for Synthesis of Sword Swing Sound” – The paper is available here
For as long as digital audio has existed, there have been discussions as to sampling rate and bit depth. I have heard countless arguments between people of Analogue vs. Digital, 96kHz vs. 44.1kHz, 24 bit vs 16bit.
After numerous experiments and publications, discussions and tests on the subject, we seem to be getting towards the truth. In the June AES Journal, a new meta study on high resolution audio promises to identify what the biggest failing are in our experimental methods, how we can progress with research in this field and finally, what are the results of years of research in the field.
The 61th International Conference of the Audio Engineering Society on Audio for Games took place in London from 10 to 12 February. This is the fifth edition of the Audio for Games conference which features a mixture of invited talks and academic paper sessions. Traditionally a biennial event, by popular demand the conference was organised in 2016 again following a very successful 4th edition in 2015.
Christian Heinrichs presented work from his doctoral research with Andrew McPherson, discussing Digital Foley and introducing FoleyDesigner, which allows for effectively using human gestures to control sound effects models.
Day three of the Digital Audio Effects Conference (DAFx15) began with an excellent introduction and summary of Wave Digital filters and Digital Wave Guides by Kurt Werner and Julius O. Smith from CCRMA, in which the current state of the art in physical modelling no nonlinearities was presented and some potential avenues for future exploration was discussed. Following on from this work was discussed
Day two of DAFx conference in Trondheim NTNU opened with Marije Baalmans keynote on the range of hardware and software audio effects and synthesisers are available to artists, and how different artists utilise these effects. This talk was focused primarily on small embedded systems that artists use, such as Arduino, Beaglebone Black and Raspberry Pi. Later in the day, some excellent work including:
The DAFx conference began with a tutorial day, where Peter Svensson provided a fantastic summary of the State of the Art in sound field propagation modelling and virtual acoustics.
During lunch, as it was getting dark, the snow started, which unfortunately blocked our view on the Northern Lights that afternoon. Øyvind Brandtsegg & Trond Engum then discussed Cross adaptive digital audio effects and their creative use in live performance. He referenced existing work at Queen Mary as some of the state of the art in existing work, and then presented NUTU’s current work on Cross Adaptive Audio Effects. The workshop day was rounded off with Xavier Serra discussing the Audio Commons project and use of open audio content.