2022 IEEE ICASSP Grand Challenge 

L3DAS22: Machine Learning for 3D Audio Signal Processing

Signal Processing Grand Challenge at IEEE ICASSP 2022

Scope of the Challenge

The L3DAS22 Challenge aims at encouraging and fostering research on machine learning for 3D audio signal processing.

3D audio has been attracting growing interest in the machine learning community in recent years. Its applications range widely, from virtual and real conferencing to autonomous driving, surveillance and many more. In these contexts, a fundamental step is to identify the nature of the events present in a soundscape and their spatial position, and possibly to remove unwanted noises that can interfere with the useful signal. To this end, the L3DAS22 Challenge presents two tasks: 3D Speech Enhancement and 3D Sound Event Localization and Detection, both relying on first-order Ambisonics recordings in reverberant office environments.

Each task involves 2 separate tracks, 1-mic and 2-mic recordings, containing sounds acquired by one first-order Ambisonics microphone and by an array of two such microphones, respectively. The use of two Ambisonics microphones represents one of the main novelties of the L3DAS22 Challenge. We expect higher accuracy and reconstruction quality when taking advantage of the dual spatial perspective of the two microphones. Moreover, we are very interested in identifying other possible advantages of this configuration over standard Ambisonics formats.

Schedule

▪ Nov 22, 2021 – Release of the Training and Development Sets, Code, Baseline Methods and Documentation
▪ Jan 5, 2022 (originally Dec 15, 2021) – Release of the Evaluation Test Set
▪ Jan 5, 2022 – Registration Closing
▪ Jan 10, 2022 (originally Dec 22, 2021) – Deadline for Submitting Results for Both Tasks
▪ Jan 20, 2022 (originally Jan 5, 2022) – Notification of Top Ranked Teams
▪ Jan 31, 2022 (originally Jan 20, 2022) – Deadline for Paper Submission (Top Ranked 5 Only)
▪ Feb 10, 2022 – Grand Challenge Paper Acceptance Notification
▪ Feb 16, 2022 – Camera-Ready Grand Challenge Papers Deadline
▪ May 7, 2022 – Virtual session at IEEE ICASSP 2022
▪ May 22, 2022 – Opening of the IEEE ICASSP 2022 and Winner Announcement

Tasks

The tasks we propose are:

 3D Speech Enhancement

The objective of this task is the enhancement of speech signals immersed in the spatial sound field of a reverberant office environment. Here the models are expected to extract the monophonic voice signal from the 3D mixture containing various background noises. The evaluation metric for this task is a combination of short-time objective intelligibility (STOI) and word error rate (WER).
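
The text above does not say how STOI and WER are combined, so the following is only a minimal sketch under one assumption: the two are averaged with equal weight after turning WER into an accuracy-like term, i.e. (STOI + (1 - WER)) / 2. The function names are hypothetical, the STOI value is assumed to be computed elsewhere, and the transcripts are assumed to come from a speech recognizer run on the clean and enhanced signals.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # Before processing reference word i, prev[j] is the edit distance
    # between ref[:i-1] and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def combined_score(stoi: float, wer: float) -> float:
    """Assumed combination: mean of STOI and (1 - WER), WER clipped to [0, 1]."""
    wer = min(max(wer, 0.0), 1.0)
    return (stoi + (1.0 - wer)) / 2.0
```

Under this assumption the score lies in [0, 1] whenever STOI does, and higher is better. A different weighting would change only `combined_score`.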

  3D Sound Event Localization and Detection

The aim of this task is to detect the temporal activity of a known set of sound event classes and to locate the events in space. The models must predict a list of the active sound events and their respective locations at regular intervals of 100 milliseconds. Performance on this task is evaluated with the location-sensitive detection error, which combines the localization and detection error metrics.
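
As an illustration of how localization and detection can be scored jointly, the sketch below counts a predicted event as correct only if a reference event of the same class is active in the same 100 ms frame and lies within a distance threshold. It reports an F-score from the resulting counts. The greedy matching, the 2-metre threshold, the F-score form and all function names are assumptions made for this example, not details taken from the challenge description.

```python
import math

DISTANCE_THRESHOLD = 2.0  # assumed tolerance in metres, not given in the text


def count_frame_matches(predicted, reference, threshold=DISTANCE_THRESHOLD):
    """Return (tp, fp, fn) for one 100 ms frame.

    Each event is a tuple (class_label, (x, y, z)).
    """
    unmatched = list(reference)
    tp = 0
    for label, position in predicted:
        # Find the closest still-unmatched reference event of the same class.
        best, best_dist = None, threshold
        for candidate in unmatched:
            ref_label, ref_position = candidate
            if ref_label != label:
                continue
            dist = math.dist(position, ref_position)
            if dist <= best_dist:
                best, best_dist = candidate, dist
        if best is not None:
            unmatched.remove(best)
            tp += 1
    fp = len(predicted) - tp
    fn = len(unmatched)
    return tp, fp, fn


def location_sensitive_f_score(predicted_frames, reference_frames):
    """Accumulate counts over all frames and return the F-score."""
    tp = fp = fn = 0
    for pred, ref in zip(predicted_frames, reference_frames):
        frame_tp, frame_fp, frame_fn = count_frame_matches(pred, ref)
        tp += frame_tp
        fp += frame_fp
        fn += frame_fn
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Here each element of `predicted_frames` and `reference_frames` is the list of events active in one frame. An event with the right class but a position outside the threshold counts as both a false positive and a false negative, which is what makes the detection score location-sensitive.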

Dataset

The L3DAS22 dataset contains multiple-source and multiple-perspective B-format Ambisonics audio recordings. We sampled the acoustic field of a large office room by placing two first-order Ambisonics microphones in the center of the room and moving a loudspeaker reproducing the analytic signal across 252 fixed spatial positions.
We aimed to create plausible and varied 3D scenarios that reflect real-life situations in which sound and disparate types of background noise coexist in the same 3D reverberant environment.

Prizes and Benefits for Challenge Winners

  • Prizes will be awarded to the challenge winners thanks to the support of Kuaishou Technology.
  • Top 5 ranked teams can submit a regular paper according to the ICASSP guidelines.

Additional Info

  • Registration
    Participants are required to register for the challenge by filling in the registration form.
  • Papers on the ISCA Archive
    The L3DAS22 Challenge has been endorsed by the International Speech Communication Association (ISCA), which allows all participants to publish a paper in the ISCA Archive with a regular DOI.
  • Interactive Demos on Replicate
    All participants can upload an interactive demo of their models on Replicate. After registration we will provide an invitation link to create a Replicate account. Guidelines on how to do this are available. You can also have a look at the L3DAS22 baselines interactive demo on Replicate, based on our cog predict script and configuration file.
  • Previous Challenge: L3DAS21
    This challenge is an enhanced version of the L3DAS21 Challenge. We improved many aspects of the dataset and the code, making it possible to run experiments more smoothly and with lower resource demands. For additional information, you can also refer to the L3DAS21 challenge website, the official GitHub repository and the official MLSP paper describing the L3DAS21 dataset.
