Digital Music Research Network 2019

DMRN, the annual workshop on digital music research, was once again hosted at Queen Mary this year. I attended representing the University of Plymouth, my new employer, and we presented a poster on RadioMe, our new research project.

There were a number of interesting presentations at DMRN, including discussions on source separation, musical loop extraction, and the impact of audio effects on musical similarity metrics. However, for me the star of the day was Cynthia Liem’s keynote talk on ways of understanding and accessing data that are totally different to common practice. It demonstrated the importance of ensuring we are measuring what we intend to measure, especially in a musical context.

Intelligent Music Mixing

There are a number of approaches in Intelligent Music Production – as discussed in my recent journal paper (here) – but the key aspects to consider are:

  • How can I interact with the intelligent musical tool?
  • How does it understand what I am doing?
  • How can I get it to change the sounds?

Until now, most intelligent music production has been about automating the parameters of a single audio effect – e.g. the cut-off frequency of an EQ filter – but each individual aspect of the audio processing will clearly impact every other part of the mix. Why should we be restricted to traditional audio signal processing chains? Why should we rely on an EQ, which was first designed 70 years ago? Surely, with all the AI and intelligent systems in the world, computers can find much better – or even just more interesting – ways of interacting with and shaping audio.
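As a toy illustration of what "automating the parameters of a single audio effect" means in practice, here is a minimal sketch in Python that adapts a high-pass filter’s cut-off frequency to the spectral content of the incoming signal. The analysis rule (placing the cut-off an octave below the spectral centroid) and all function names are illustrative assumptions on my part, not a description of any particular published system.

```python
# Toy sketch: adapt one EQ parameter (a high-pass cut-off) to the audio itself.
# The centroid-based rule below is an assumption for illustration only.
import numpy as np
from scipy import signal

def spectral_centroid(x, sr):
    """Spectral centroid (Hz) of a mono signal."""
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12)

def auto_highpass(x, sr):
    """Place a second-order high-pass filter one octave below the centroid."""
    cutoff = np.clip(spectral_centroid(x, sr) / 2.0, 20.0, sr / 2.0 - 1.0)
    sos = signal.butter(2, cutoff, btype="highpass", fs=sr, output="sos")
    return signal.sosfilt(sos, x), cutoff

if __name__ == "__main__":
    sr = 44100
    t = np.arange(sr) / sr
    # Test signal: a low tone plus a bright tone.
    x = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)
    y, cutoff = auto_highpass(x, sr)
    print(f"Chosen cut-off: {cutoff:.1f} Hz")
```

Even this trivial example hints at the bigger question above: once the computer is choosing the parameter, there is no reason the processing has to stay inside a traditional EQ at all.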

AES 145th Convention

I have recently returned from the AES 145th Convention, where there were a number of interesting and inspiring talks: new dynamics audio effects for the Android platform, and further understanding of the importance of phase in source separation – particularly relevant since the state of the art in source separation is primarily deep learning, which often suffers from inherent phase problems.

There was an interesting paper from BCU, entitled “Investigation into the Effects of Subjective Test Interface Choice on the Validity of Results”, which promises to improve our understanding of how a listening test interface can influence the results.

Overall, for me, the highlight was seeing the launch of the new source separation company AudioSourceRE. They produce state-of-the-art VST plugins for source separation, allowing vocal removal, harmonic-percussive separation, and reverb removal, all in one very cost-effective package. Derry, the academic who founded the company, is one of the world’s leading experts in source separation, and his product does not disappoint.

More information can be found here: https://www.audiosourcere.com/

SoundStack 2018

Last month was SoundStack 2018. This three-day workshop on 3D and spatial sound was an excellent beginner’s introduction to 3D sound, as well as a comprehensive overview of ambisonics and 3D sound design, mixing and production.

The workshop, organised entirely by Angela McArthur, included highlights such as head-tracked HRTFs and ambisonics, use of Max/MSP and the SPAT spatial audio system, and a talk and workshop with Call and Response, a 3D sound collective based in Deptford, London. People travelled from all over the country to attend, and it was thoroughly enjoyable.

Digital Audio Effects Conference 2018 (DAFx18)

Members of C4DM recently attended the 21st DAFx conference, hosted in ‘the Venice of Portugal’, Aveiro, Portugal. Aveiro is an excellent and picturesque town in which to host a conference, and everyone there enjoyed it.

The conference started with a day of tutorials, all led by experts in their respective fields. Julian Storer from ROLI (juce.com) presented a discussion on the use of his JUCE framework for developing plugins and DSP tools; the JUCE framework is continually growing, based on the needs of its user base and the internal processes of ROLI. Shahan Nercessian from iZotope (izotope.com) then presented a whirlwind tour of machine learning tools and techniques and how they can be applied to audio; much of the advice was derived from the image processing field, with a heavy focus on deep learning and how it can be used. A tutorial on digital audio filter design was then presented by Vesa Välimäki of Aalto University, Helsinki. The field of digital filters is a significant one – it was described as one in which “almost everything in audio can be viewed as a filter” – and a clear and concise presentation of DSP from the fundamentals was given. Perception and virtual reality were then discussed in Catarina Mendonça’s presentation, which covered the cognitive factors relevant to VR audio.

Our own Josh Reiss opened the first day of the main conference with a keynote talk on disruptive innovation in audio production. This talk gave an overview of a series of work by the Audio Engineering Research Group (c4dm.qmul.ac.uk/audioengineering.html) that can and has influenced the field of audio production, from the development of intelligent audio production to questioning the utility of high-resolution audio.

The day continued with the academic track. Papers were presented on a range of topics; our selected highlights follow.

Thursday saw “Confessions of a Plug-in Junkie” by David Farmer, who rigorously presented his approach to how he uses and buys plugins. This gave a useful insight from a user perspective that is often not fully considered. For example, a one-week free demo is of little use: nobody will download it unless they have an exact use case, which in a busy work environment is somewhat rare.

A range of academic work was then presented, including my own paper on the objective evaluation of synthesised environmental sounds (described below).

Yvan Grabit from Steinberg opened Friday with a keynote talk on the VST standard audio plugin format, which led into another day of inspiring academic work, including:

  • Training CNNs to Apply Linear Audio Effects
  • Modal Modelling of Room Impulse Responses
  • Perceptual Latent Space Control of Timbre through CNNs

The DAFx conference this year contained some high-quality work in an excellent location. DAFx will be hosted by Birmingham City University, UK in 2019.

Objective Evaluation of Synthesised Environmental Sounds

Having recently attended DAFx for my fourth year, I presented my paper on Objective Evaluation of Synthesised Environmental Sounds.

The basic premise of the paper is that we can computationally measure how similar two sounds are using an objective metric. This objective metric can be evaluated using an iterative resynthesis approach, and a given similarity score can be evaluated through comparison to human perception.
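To make that idea a little more concrete, here is a minimal sketch of one possible objective similarity measure: a mean squared distance between log-magnitude spectrograms. This is an illustrative stand-in of my own, not the specific metric proposed in the paper.

```python
# Toy objective similarity measure between two sounds: mean squared distance
# between log-magnitude STFT frames. Illustrative only; not the paper's metric.
import numpy as np
from scipy import signal

def log_spectrogram(x, sr, n_fft=1024, hop=256):
    """Log-magnitude STFT of a mono signal."""
    _, _, stft = signal.stft(x, fs=sr, nperseg=n_fft, noverlap=n_fft - hop)
    return np.log(np.abs(stft) + 1e-8)

def spectral_distance(x, y, sr):
    """Lower values mean the two sounds are more similar."""
    sx, sy = log_spectrogram(x, sr), log_spectrogram(y, sr)
    n = min(sx.shape[1], sy.shape[1])   # align frame counts
    return float(np.mean((sx[:, :n] - sy[:, :n]) ** 2))

if __name__ == "__main__":
    sr = 44100
    t = np.arange(sr) / sr
    reference = np.sin(2 * np.pi * 440 * t)   # target sound
    close = np.sin(2 * np.pi * 445 * t)       # plausible resynthesis
    unrelated = np.random.randn(sr)           # very different sound
    print(spectral_distance(reference, close, sr),
          spectral_distance(reference, unrelated, sr))
```

In the iterative resynthesis setting described above, a measure like this would be computed repeatedly as the synthesis parameters are adjusted, and its scores can then be checked against human similarity judgements.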

I hope this made sense, but if not please get in touch and I would be happy to explain further. The paper will be available on the DAFx Website shortly.

Sound Synthesis – Are we there yet?

TL;DR: Yes.

At the beginning of my PhD, I began to read the sound effect synthesis literature, and I quickly discovered that there was little to no standardisation or consistency in the evaluation of sound effect synthesis models – particularly in relation to the sounds they produce. Surely one of the most important aspects of a synthesis system is whether it can artificially produce a convincing replacement for the sound it is intended to synthesise. We could have the most tractable and relatable sound model in the world, but if it does not sound anything like what it is intended to, will any sound designers or end users ever use it?

There are many different methods for measuring how effective a sound synthesis model is. Jaffe proposed evaluating synthesis techniques for music based on ten criteria. However, only two of the ten criteria actually consider any sounds made by the synthesiser.

This is crazy! How can anyone know which synthesis method can produce a convincingly realistic sound?

So we performed a formal evaluation study, in which a range of different synthesis techniques were compared in a range of different situations. Some synthesis techniques are indistinguishable from a recorded sample in a fixed-medium environment. In short: yes, we are there yet. There are sound synthesis methods that sound more realistic than high-quality recorded samples. But there is clearly so much more work to be done…

For more information, read the paper here

360 Degree Sound Experience – Warsnare at the Albany

Having spent the past week working at the Albany in Deptford, we produced a 360-degree surround sound experience for Warsnare, a Deptford-based DJ and producer (https://soundcloud.com/warsnare).

Twenty-seven Genelec speakers and four subwoofers across five different stages were used, with live performance spatialised and mixed in with pre-recorded spatial elements, to produce a fully immersive experience for a sold-out audience; demand was such that tickets to watch from the balcony were also released.

Propeller Sound Synthesis at Audio Mostly

Recently, at the Audio Mostly 2017 conference, my work with Rod Selfridge and Josh Reiss on propeller sound synthesis was published. I was both published at the conference and on the conference organising committee, as the webmaster and a member of the music team. More information is available here on the Intelligent Sound Engineering blog, and an example of the propeller synthesis is available on YouTube.