Replicating Human Sound Localization with a Multi-Layer Perceptron

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

One of the key capabilities of the human sense of hearing is to determine the direction from which a sound is emanating, a task known as localization. This paper describes the derivation of a machine learning model that performs the same localization task: given an audio waveform arriving at the listener's eardrum, determine the direction of the audio source. Head-related transfer functions (HRTFs) from the ITA-HRTF database of 48 individuals are used to train and validate this model. A series of waveforms is generated from each HRTF, representing the sound pressure level at the listener's eardrums for various source directions. A feature vector is calculated for each waveform from acoustical properties motivated by prior literature on sound localization; these feature vectors are used to train multi-layer perceptrons (MLPs), a form of artificial neural network, to replicate the behavior of single individuals. Data from three individuals are used to optimize hyperparameters of both the feature extraction and MLP stages for model accuracy. These hyperparameters are then validated by training and analyzing models for all 48 individuals in the database. The errors produced by each model follow a log-normal distribution. The median model is capable of identifying, with 95% confidence, the sound source direction to within 20 degrees. This result is comparable to previously reported human capabilities and thus shows that an MLP can successfully replicate the human sense of sound localization.
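For illustration, the sketch below shows the general shape of the pipeline the abstract describes: binaural waveforms are produced by filtering a source signal through direction-dependent impulse responses, classic localization cues (interaural time and level differences) are extracted as a feature vector, and an MLP regresses the source direction. Everything here is a placeholder assumption, not the paper's actual method: the synthetic synth_hrir stands in for measured ITA-HRTF data, the two-cue feature vector for the paper's richer literature-motivated features, and the network size for its tuned hyperparameters.

```python
# Minimal sketch: binaural feature extraction + MLP direction regressor.
# Synthetic "HRIRs" (a pure interaural delay and gain per ear) stand in
# for the measured ITA-HRTF data so the example is self-contained.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
fs = 44_100  # sample rate in Hz

def synth_hrir(azimuth_deg, n_taps=64):
    """Toy HRIR pair: interaural delay and level scale with azimuth."""
    itd = 0.0007 * np.sin(np.radians(azimuth_deg))     # seconds, max ~0.7 ms
    ild = 1.0 + 0.5 * np.sin(np.radians(azimuth_deg))  # crude level ratio
    left, right = np.zeros(n_taps), np.zeros(n_taps)
    centre = n_taps // 2
    shift = int(round(itd * fs))
    left[centre - shift // 2] = 1.0
    right[centre + (shift - shift // 2)] = ild
    return left, right

def features(sig_l, sig_r):
    """Two classic localization cues: ITD (cross-correlation lag) and ILD (dB)."""
    xc = np.correlate(sig_l, sig_r, mode="full")
    itd = (np.argmax(xc) - (len(sig_l) - 1)) / fs
    ild = 10 * np.log10(np.sum(sig_l**2) / (np.sum(sig_r**2) + 1e-12))
    return np.array([itd * 1e3, ild])  # ms and dB, roughly unit scale

# Build a training set: noise bursts filtered through each direction's HRIR.
azimuths = np.linspace(-90, 90, 37)
X, y = [], []
for az in azimuths:
    hl, hr = synth_hrir(az)
    for _ in range(20):
        burst = rng.standard_normal(2048)
        X.append(features(np.convolve(burst, hl), np.convolve(burst, hr)))
        y.append(az)
X, y = np.array(X), np.array(y)

# Small MLP regressor from features to azimuth, echoing the paper's MLP stage.
mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
mlp.fit(X, y)
print("train MAE (deg):", np.mean(np.abs(mlp.predict(X) - y)))
```

In the paper's setting, a separate model is trained per individual and the resulting error distributions are analyzed across all 48 subjects; this sketch only demonstrates the feature-to-MLP structure, not the reported accuracy.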

Original language: English
Title of host publication: SMC/JIM/IFC 2022 - Proceedings of the 19th Sound and Music Computing Conference
Editors: Romain Michon, Laurent Pottier, Yann Orlarey
Publisher: Sound and Music Computing Network
Pages: 271-278
Number of pages: 8
ISBN (Electronic): 9782958412609
Publication status: Published - 2022
Event: 19th Sound and Music Computing Conference, SMC 2022 - Saint-Etienne, France
Duration: 5 Jun 2022 - 12 Jun 2022

Publication series

Name: Proceedings of the Sound and Music Computing Conferences

Conference

Conference: 19th Sound and Music Computing Conference, SMC 2022
Country/Territory: France
City: Saint-Etienne
Period: 5/06/22 - 12/06/22

Bibliographical note

Funding Information: This work was performed in the Center of Excellence (CoE) Research on AI- and Simulation-Based Engineering at Exascale (RAISE), receiving funding from the EU's Horizon 2020 Research and Innovation Framework Programme H2020-INFRAEDI-2019-1 under grant agreement no. 951733.

Funding Information: The Icelandic HPC Competence Center is funded by the EuroCC project, which has received funding from the European HPC Joint Undertaking (JU) under grant agreement no. 951732. The JU receives support from the EU's Horizon 2020 research and innovation programme.

Funding Information: The work received support from NordForsk's Nordic Sound and Music Computing Network (NordicSMC), project number 86892.

Publisher Copyright: © 2022 Eric Michael Sumner et al.
