Keynotes and Special Session

Don’t miss our exciting special Danish Sound Cluster session with Niels H. Pontoppidan and Clément Laroche, as well as the keynotes from Hanna Järveläinen and Mitsuko Aramaki. Prepare to be inspired as these four remarkable speakers take the stage.

05/09 Special Session - Advances, Barriers, and Future Direction for Hearing Aid Effects

Niels H. Pontoppidan – Danish Sound Cluster

Over the last 20 years, advanced applications for hearing instruments that are only possible with machine learning (ML) have emerged. For many of them, however, the computational complexity long ruled out actual implementation. Nevertheless, in 2020 core signal processing based on ML principles came into use for enhancing speech in the presence of noise. It is interesting to look back at the interplay of applications, algorithms, connectivity, and hardware, and to speculate about the next core signal processing areas in hearing instruments that ML will enhance.
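For readers curious what ML in the core signal path can look like, a common pattern is mask-based enhancement: a network estimates per-frequency gains that attenuate noise-dominated bins. Below is a minimal sketch of that idea, assuming Python with NumPy and PyTorch; the tiny untrained network and all sizes are illustrative assumptions, not any hearing instrument vendor's actual algorithm.

```python
# Minimal sketch of mask-based speech enhancement (illustration only,
# not any hearing-instrument vendor's actual pipeline).
import numpy as np
import torch
import torch.nn as nn

N_FFT, HOP = 256, 128

def stft(x):
    # Frame the signal and take the FFT of each windowed frame.
    win = np.hanning(N_FFT)
    frames = [x[i:i + N_FFT] * win
              for i in range(0, len(x) - N_FFT, HOP)]
    return np.fft.rfft(np.array(frames), axis=1)

# Hypothetical tiny mask estimator: log-magnitude in, gain mask out.
mask_net = nn.Sequential(
    nn.Linear(N_FFT // 2 + 1, 64), nn.ReLU(),
    nn.Linear(64, N_FFT // 2 + 1), nn.Sigmoid(),  # gains in [0, 1]
)

def enhance_frames(noisy_spec):
    mag = np.abs(noisy_spec)
    feats = torch.tensor(np.log(mag + 1e-8), dtype=torch.float32)
    with torch.no_grad():
        mask = mask_net(feats).numpy()   # per-bin suppression gains
    return noisy_spec * mask             # attenuate noise-dominated bins

# Example: enhance one second of synthetic noisy audio at 16 kHz.
noisy = np.random.randn(16000) * 0.1
enhanced_spec = enhance_frames(stft(noisy))
```

In a real product the mask estimator is trained on paired noisy and clean speech and must run at very low latency, which is exactly where the complexity constraints mentioned above bite.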

05/09 Special Session - Leveraging Deep Learning for Enhanced Signal Processing in Telecommunication Devices: A Step towards Futuristic Audio

Clément Laroche – Jabra

With the rapid evolution of artificial intelligence and deep learning algorithms, it’s essential to discern their transformative impact on the audio performance of telecommunication devices, specifically headsets, speakerphones, and videobars. This presentation will start by delineating the challenges faced by conventional signal processing techniques in the current digital age, such as the inability to effectively filter ambient noise in varying environments or to adapt to different speech characteristics. We then explore how deep learning-based approaches can aid in overcoming these challenges by learning complex, non-linear relationships from vast amounts of audio data.

However, the computational demand and memory footprint of such advanced models often pose a challenge for their deployment on resource-constrained embedded devices. We will examine the limitations imposed by the computational and memory requirements of deep learning models, highlighting the importance of model optimization for their practical use in real-time telecommunication devices. The spotlight will be put on dynamic neural networks, a compelling concept that allows ‘early exiting’ from computations. This approach facilitates rapid decisions when the network encounters less complex inputs, thus conserving computational resources, a valuable attribute for real-time applications on embedded devices.

In addition to the technical aspects, we believe in the invaluable role of human listeners in validating our models. Hence, we will share results from a study conducted on Amazon’s Mechanical Turk platform, where audio quality ratings were crowdsourced from a diverse pool of listeners. The insights gathered from these human ratings provided a more nuanced understanding of perceived audio quality, underlining the importance of a human-centric approach in our technical advancements.
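To make the early-exit idea concrete, here is a minimal sketch, assuming PyTorch; the architecture, the confidence threshold, and all layer sizes are illustrative assumptions, not Jabra's deployed model. Each stage attaches a lightweight classifier head, and inference stops as soon as a head is confident enough.

```python
# Minimal early-exit ("dynamic") network sketch; illustrative only.
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    def __init__(self, dim=64, n_classes=2, n_stages=3, threshold=0.9):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU())
            for _ in range(n_stages))
        # One lightweight classifier head per stage.
        self.heads = nn.ModuleList(
            nn.Linear(dim, n_classes) for _ in range(n_stages))
        self.threshold = threshold

    def forward(self, x):
        for i, (stage, head) in enumerate(zip(self.stages, self.heads)):
            x = stage(x)
            probs = torch.softmax(head(x), dim=-1)
            conf, pred = probs.max(dim=-1)
            # Easy inputs exit here, skipping the remaining stages
            # and saving computation on the embedded device.
            if conf.item() >= self.threshold:
                return pred, i              # exited after stage i
        return pred, len(self.stages) - 1   # fell through to the last head

net = EarlyExitNet()
with torch.no_grad():
    label, exit_stage = net(torch.randn(64))
```

The design trade-off is that easy inputs never touch the later, more expensive stages, so the average-case cost drops even though the worst-case cost is unchanged.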

06/09 Keynote 1 - Sustainable Perceptual Evaluation for Digital Audio Applications: Motivation, Methods, and Best Practice

Hanna Järveläinen – Research Associate, Institute for Computer Music and Sound Technology (ICST)

Listener tests are a standard, partly even standardized, procedure in the development of new audio technology. This kind of testing is often carried out to measure the degradation of perceived audio quality. Experiments with participants are also performed in basic and applied psychoacoustics, in user experience studies with digital musical instruments and interfaces, and in designing applications for special groups.

Within the DAFx community, methods from the audio testing field are presently the most widely used. However, long-term development requires considering the end user both in passive perception and in active interaction, often in multisensory, ecological, or creative settings. The presentation will discuss state-of-the-art procedures and analysis methods that could contribute to digital audio research in this challenging environment.
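As a concrete example of the analysis side of such tests, per-condition ratings are commonly summarized with a mean and a bootstrap confidence interval; the sketch below is a generic illustration in Python with invented data, not a method or result from the talk.

```python
# Generic sketch: summarizing listener ratings for one condition with a
# bootstrap confidence interval (invented data, not study results).
import numpy as np

rng = np.random.default_rng(0)

def bootstrap_ci(ratings, n_boot=10000, alpha=0.05):
    """Mean rating with a 95% nonparametric bootstrap CI."""
    ratings = np.asarray(ratings, dtype=float)
    means = [rng.choice(ratings, size=len(ratings), replace=True).mean()
             for _ in range(n_boot)]
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return ratings.mean(), lo, hi

# Hypothetical 0-100 quality ratings for one processing condition.
condition_a = [78, 85, 90, 62, 74, 88, 81, 70, 93, 77]
mean, lo, hi = bootstrap_ci(condition_a)
print(f"mean quality {mean:.1f}, 95% CI [{lo:.1f}, {hi:.1f}]")
```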

Bio:
Hanna Järveläinen received the D.Sc. degree in electronics and communications engineering from Helsinki University of Technology (now Aalto University), and a B.Mus. degree from the Sibelius Academy (Uniarts Helsinki), Finland. In 2012, after focusing on music studies and then working as a musician for several years, she joined the Institute for Computer Music and Sound Technology at Zurich University of the Arts, Switzerland, where her main research field is auditory and multimodal perception and action.

07/09 Keynote 2 - Linking Sound, Morphology and Perception: Towards a Language of Sound

Mitsuko Aramaki – PRISM Laboratory, CNRS, Aix-Marseille University

The sciences of sound and music have grown considerably, notably thanks to the development of digital audio processing. Today, numerous synthesis tools, models, and methods for sound creation can generate sounds of impressive realism. However, the perceptual control of these synthesis processes remains an open challenge. Based on the ecological approach to perception, a synthesis control paradigm enabling the creation of sounds from evocations has been proposed, leading to the development of environmental sound synthesizers that can be controlled directly from perceptual attributes. In this talk, we will present this methodology through a series of perceptual studies that aim, from an interdisciplinary perspective, to better understand the relationship between sound morphology and human perception for synthesis purposes.
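To illustrate the flavor of perceptually controlled synthesis, the sketch below generates an impact sound as a sum of damped modes and exposes a single high-level "material" control that steers the damping listeners associate with wood versus metal. The mapping and mode ratios are illustrative assumptions, not the actual parameterization of the PRISM synthesizers.

```python
# Simplified sketch of perceptually controlled impact-sound synthesis:
# one high-level "material" control maps to modal damping (assumed
# mapping, for illustration only).
import numpy as np

SR = 16000  # sample rate in Hz

def impact(material, f0=440.0, dur=1.0):
    """material in [0, 1]: 0 ~ heavily damped ('wood'),
    1 ~ lightly damped, long ring ('metal')."""
    t = np.arange(int(SR * dur)) / SR
    # Perceptual control -> physical parameter: decay time per mode.
    decay = 0.02 + material * 0.8          # seconds, assumed mapping
    modes = [1.0, 2.76, 5.40, 8.93]        # ideal-bar frequency ratios
    x = sum(np.exp(-t / (decay / (k + 1))) * np.sin(2 * np.pi * f0 * r * t)
            for k, r in enumerate(modes))  # higher modes decay faster
    return x / np.max(np.abs(x))

wood = impact(material=0.1)
metal = impact(material=0.9)
```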

Bio:
Mitsuko Aramaki received her PhD from Aix-Marseille University (France) in 2003 for her work on the analysis and synthesis of impact sounds using physical and perceptual approaches. She is currently Director of Research at the French National Center for Scientific Research (CNRS). Since 2017, she has headed the “Perception Engineering” team at the PRISM laboratory (“Perception, Representations, Image, Sound, Music”). Her scientific career has led her to explore different fields of research, from acoustics to cognitive science. Her research mainly focuses on sound modeling, perceptual and cognitive aspects of timbre and, more recently, multimodal interactions in the context of virtual/augmented reality. She is a member of the steering committee of the CMMR “Computer Music Multidisciplinary Research” conferences.