AIM PhD student wins best paper award at AES AIMLA 2025

We are delighted to share that Mary Pilataki, a PhD student at AIM, has received the Best Paper Award at the Audio Engineering Society International Conference on Artificial Intelligence and Machine Learning for Audio (AES AIMLA) 2025.

The awarded paper, “Extraction and Neural Synthesis of Low-Order Spectral Components for Head-Related Transfer Functions”, presents research carried out during her internship at PlayStation London in collaboration with Cal Armstrong and Chris Buchanan. The study explores how deep learning can separate sound colouration from spatial cues in Head-Related Transfer Functions (HRTFs), opening up new possibilities for creating more personalised and immersive 3D audio experiences.
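For readers curious what separating broad spectral colouration from finer detail can look like in practice, the snippet below is a minimal, generic illustration and not the method from the paper: it extracts a low-order, smoothly varying envelope from an HRTF magnitude spectrum via cepstral smoothing, using a random array as a stand-in for real HRTF data. The array sizes, cutoff and variable names are assumptions.

    import numpy as np

    def low_order_component(hrtf_mag, n_keep=8):
        """Return a smooth spectral envelope from a one-sided HRTF magnitude spectrum."""
        log_mag = np.log(np.maximum(hrtf_mag, 1e-12))     # work on the log-magnitude
        cepstrum = np.fft.irfft(log_mag)                  # real cepstrum of the response
        lifter = np.zeros_like(cepstrum)
        lifter[:n_keep] = 1.0                             # keep the low-quefrency terms
        lifter[-(n_keep - 1):] = 1.0                      # ...and their symmetric mirror
        smooth_log = np.fft.rfft(cepstrum * lifter).real  # back to a smooth log spectrum
        return np.exp(smooth_log)

    # Toy usage with a random stand-in for a measured HRTF magnitude response.
    hrtf_mag = np.abs(np.random.randn(256)) + 0.5         # 256 frequency bins (assumed)
    envelope = low_order_component(hrtf_mag)              # broad spectral colouration
    residual = hrtf_mag / np.maximum(envelope, 1e-12)     # finer, direction-dependent detail

The number of cepstral coefficients kept (n_keep) controls how much detail ends up in the low-order component; any value here is purely illustrative.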

The full paper is available through the AES E-Library.


AIM/C4DM team wins Query-by-Vocal Imitation Challenge

Congratulations to AIM members Aditya Bhattacharjee and Christos Plachouras, and C4DM member Sungkyun Chang, who secured first place in the Query-by-Vocal Imitation (QbVI) Challenge, held as part of the AES International Conference on Artificial Intelligence and Machine Learning for Audio (AES AIMLA 2025), which took place on September 8-10, 2025.

The winning entry addressed the task of retrieving relevant audio clips from a database using only a vocal imitation as a query. This is a particularly complex problem due to the variability in how people vocalise sounds and the acoustic diversity across sound categories. Successful approaches must bridge the gap between vocal and non-vocal audio while handling the unpredictability of human-generated imitations.

The team’s submission, titled “Effective Finetuning Methods for Query-by-Vocal Imitation”, advances the state of the art in QbVI by integrating a triplet-based regularisation objective with supervised contrastive learning. The method addresses the issue of limited data by sampling from an otherwise unused subset of the VocalSketch dataset, comprising practice recordings and human-rejected vocal imitations. While these recordings are not suitable as positive matches, they are useful for creating confounding examples during training: they enlarge the pool of negative examples drawn on by the added regularisation objective, as sketched below.
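As a rough illustration of how such an objective can be put together (this is not the team's implementation; the loss functions, weighting and embedding shapes here are assumptions), the sketch below combines a supervised contrastive loss over imitation/reference embedding pairs with a triplet term whose negatives are drawn from an extra pool, such as the rejected imitations described above:

    import torch
    import torch.nn.functional as F

    def sup_con_loss(embeddings, labels, temperature=0.07):
        """Supervised contrastive loss over an L2-normalised batch of embeddings."""
        z = F.normalize(embeddings, dim=1)
        sim = z @ z.t() / temperature                          # pairwise similarities
        self_mask = torch.eye(len(labels), dtype=torch.bool)
        pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
        sim = sim.masked_fill(self_mask, float("-inf"))        # exclude self-pairs
        log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        per_anchor = -log_prob.masked_fill(~pos_mask, 0.0).sum(1) / pos_mask.sum(1).clamp(min=1)
        return per_anchor[pos_mask.any(dim=1)].mean()

    def triplet_regulariser(anchors, positives, extra_negatives, margin=0.2):
        """Triplet term whose negatives come from an additional, otherwise unused pool."""
        loss_fn = torch.nn.TripletMarginLoss(margin=margin)
        return loss_fn(F.normalize(anchors, dim=1),
                       F.normalize(positives, dim=1),
                       F.normalize(extra_negatives, dim=1))

    # Toy usage: random tensors stand in for encoder outputs.
    imitations = torch.randn(8, 128)      # embeddings of vocal imitations
    references = torch.randn(8, 128)      # embeddings of the matching reference sounds
    rejected = torch.randn(8, 128)        # embeddings from a rejected-imitation pool
    labels = torch.arange(8)              # each imitation/reference pair shares a label

    total_loss = sup_con_loss(torch.cat([imitations, references]),
                              torch.cat([labels, labels])) \
                 + 0.5 * triplet_regulariser(imitations, references, rejected)

In practice, the relative weight of the triplet term and the way negatives are sampled would be tuned to the task; the values used here are arbitrary.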

The proposed method surpassed state-of-the-art approaches on both subjective and objective evaluation metrics, opening up scope for product innovations and software tools that let artists search large repositories of sound effects effectively.


AIM at ISMIR 2025

On 21-25 September 2025, several AIM researchers will participate in the 26th International Society for Music Information Retrieval Conference (ISMIR 2025). ISMIR is the leading conference in the field of music informatics and is currently the top-cited publication venue for Music & Musicology (source: Google Scholar). This year, ISMIR will take place onsite in Daejeon, Korea.

Similar to previous years, AIM will have a strong presence at ISMIR 2025.

In the Scientific Programme, the following papers are authored/co-authored by AIM members:

The following Tutorials will be co-presented by AIM PhD students Rodrigo Diaz and Julien Guinot:

  • Differentiable Physical Modeling Sound Synthesis: Theory, Musical Application, and Programming (Jin Woo Lee, Stefan Bilbao, Rodrigo Diaz)
  • Self-supervised Learning for Music – An Overview and New Horizons (Julien Guinot, Alain Riou, Yuexuan Kong, Marco Pasini, Gabriel Meseguer-Brocal, Stefan Lattner)

The following journal papers, published in TISMIR and co-authored by AIM members, will be presented at the conference:

As part of the MIREX public evaluations:

Finally, on the organisational side:

See you in Daejeon!