My thesis has been submitted. It was submitted back in June, but it took me some time to get back to this. The title is “Perceptual Evaluation of Synthesised Sound Effects”, and a summary is available below.
At the beginning of my PhD, I began to read the sound effect synthesis literature, and I quickly discovered that there was little to no standardisation or consistency in the evaluation of sound effect synthesis models, particularly in relation to the sounds they produce. Surely one of the most important aspects of a synthesis system is whether it can artificially produce a convincing replacement for what it is intended to synthesise. We could have the most intractable and relatable sound model in the world, but if it does not sound anything like it is intended to, will any sound designers or end users ever use it?
There are many different methods for measuring how effective a sound synthesis model is. Jaffe proposed evaluating synthesis techniques for music based on ten criteria. However, only two of the ten criteria actually consider any sounds made by the synthesiser.
This is crazy! How can anyone know which synthesis method can produce a convincingly realistic sound?
So, we performed a formal evaluation study, where a range of different synthesis techniques were compared in a range of different situations. Some synthesis techniques are indistinguishable from a recorded sample in a fixed medium environment. In short: yes, we are there yet. There are sound synthesis methods that sound more realistic than high quality recorded samples. But there is clearly so much more work to be done…
Recently, at the Audio Mostly 2017 conference, my work with Rod Selfridge and Josh Reiss on propeller sound synthesis was published. As well as being published at the conference, I was on the conference organising committee, as the webmaster and a member of the music team. More information is available here on the Intelligent Sound Engineering Blog, and an example of the propeller synthesis is available on YouTube.
At the upcoming International Conference on Digital Audio Effects, I will be presenting my recent work on creating a sound effects taxonomy using unsupervised learning. A link to the paper can be found here.
A taxonomy of sound effects is useful for a range of reasons. Sound designers often spend considerable time searching for sound effects. Classically, sound effects are arranged by keyword tagging and by what caused the sound to be created. For example, a recording of bacon cooking would have the name “BaconCook”, the tags “Bacon Cook, Sizzle, Open Pan, Food” and be placed in the category “cooking”. However, most sound designers know that the sound of frying bacon can sound very similar to the sound of rain (see this TED talk for more info), but rain is in an entirely different folder, in a different section of the SFx Library.
Our approach is to analyse the raw content of the audio files in the sound effects library, and allow a computer to determine which sounds are similar, based on the actual sonic content of the sound sample. As such, the sounds of rain and frying bacon will be placed much closer together, allowing a sound designer to quickly and easily find sounds that relate to each other.
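To make the idea of content-based similarity concrete, here is a minimal toy sketch in Python. It is not the method from the paper: the signals are synthetic stand-ins (plain noise for rain and frying bacon, sine tones for the other two), the two features (zero-crossing rate and crest factor) are chosen only because they are easy to compute, and all names and categories are hypothetical. The point is just that a sound's nearest neighbour is picked from the audio itself, ignoring the folder it is filed under.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; arbitrary choice for this toy example


def noise(n, gain, rng):
    """Uniform white noise, a crude stand-in for rain or sizzling."""
    return [gain * rng.uniform(-1.0, 1.0) for _ in range(n)]


def tone(n, freq, gain):
    """A pure sine tone, a crude stand-in for a hum or a beep."""
    return [gain * math.sin(2 * math.pi * freq * i / SAMPLE_RATE) for i in range(n)]


def features(signal):
    """Return (zero-crossing rate, crest factor) for one signal."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(signal) - 1)
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    crest = max(abs(x) for x in signal) / rms
    return (zcr, crest)


def nearest(name, feats):
    """Name of the sound whose feature vector is closest to `name`."""
    target = feats[name]
    others = (k for k in feats if k != name)
    return min(others, key=lambda k: math.dist(target, feats[k]))


rng = random.Random(0)  # fixed seed so the example is repeatable
n = SAMPLE_RATE  # one second of audio per sound

# Hypothetical library: name -> (tag-based category, audio samples)
library = {
    "rain": ("weather", noise(n, 0.3, rng)),
    "bacon_frying": ("cooking", noise(n, 0.6, rng)),
    "engine_hum": ("vehicles", tone(n, 100.0, 0.5)),
    "alarm_beep": ("alarms", tone(n, 1000.0, 0.5)),
}

# The features are computed from the samples only; the category is ignored.
feats = {name: features(samples) for name, (_, samples) in library.items()}

for name, (category, _) in library.items():
    print(f"{name} (filed under '{category}') is closest to {nearest(name, feats)}")
```

A real system would use far richer audio features and a proper unsupervised learning method over the whole library, but the principle is the same: distance is measured between the sounds, not between their tags.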
A full rundown of the work is available on the Intelligent Sound Engineering Blog.
It has been quite a while since I have posted, but I hope to resolve that shortly, with a number of academic papers being published this summer.
In the meantime, there is some discussion over the use of sound effects in post production, and the fundamental fact that many things you hear as part of a soundscape are not the original recorded sound. This is one of the fundamental justifications for my PhD, and it is very well explained in this TED Talk: