This page lists workshops and research outputs from AIM CDT students and staff, organised by year.

2024

  • Yu C-Y, Fazekas G (2024). GOLF: A Singing Voice Synthesiser with Glottal Flow Wavetables and LPC Filters. Transactions of the International Society for Music Information Retrieval, 7(1), 316–330. QMRO: 103449 · DOI: 10.5334/tismir.210
  • Shatri E, Palavala KR (2024). Synthesising Handwritten Music with GANs: A Comprehensive Evaluation of CycleWGAN, ProGAN, and DCGAN. IEEE BigData 2024. QMRO: 101835 · DOI: 10.1109/bigdata62323.2024.10825834
  • Dixon S, Guinot J, Yusuf F et al. (2024). DMRN+19: Digital Music Research Network One-day Workshop 2024. QMRO: 104537
  • Williams A (2024). Model Pipelines for AI Music Artwork Generation.
  • Williams A (2024). Applications and Perspectives of AI in the Music Industry.
  • Reed CN, Benito AL, Caspe F et al. (2024). Shifting Ambiguity, Collapsing Indeterminacy: Designing with Data as Baradian Apparatus. ACM Transactions on Computer-Human Interaction, 31(6), 1–41. QMRO: 103012 · DOI: 10.1145/3689043
  • Pilataki M, Mauch M, Dixon S (2024). Pitch-aware generative pretraining improves multi-pitch estimation with scarce data. ACM International Conference on Multimedia in Asia. QMRO: 101160 · DOI: 10.1145/3696409.3700202
  • Vasilakis I, Bittner R, Pauwels J (2024). Evaluation of pretrained language models on music understanding. NLP4MusA. QMRO: 100852
  • Vanka S, Hannink L, Rolland J-B et al. (2024). Diff-MSTC: A Mixing Style Transfer Prototype for Cubase. ISMIR 2024. QMRO: 101492
  • Edwards D, Riley J, Sarmento P et al. (2024). MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling. ISMIR 2024. QMRO: 97939
  • Guinot J, Fazekas G, Quinton E (2024). Proceedings of the 25th International Society for Music Information Retrieval Conference. QMRO: 101165
  • Vanka S, Steinmetz C, Rolland J-B et al. (2024). Diff-MST: Differentiable Mixing Style Transfer. ISMIR 2024. QMRO: 98161
  • Vasilakis I, Bittner R, Pauwels J (2024). I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition. ISMIR 2024. QMRO: 99324 · DOI: 10.5281/zenodo.14877474
  • Riley J, Guo Z, Edwards D et al. (2024). GAPS: A Large and Diverse Classical Guitar Dataset and Benchmark Transcription Model. ISMIR 2024. QMRO: 97938
  • Zhou Z, Wu Y, Wu Z et al. (2024). Can LLMs Reason in Music? An Evaluation of LLMs’ Capability of Music Understanding and Generation. ISMIR 2024. QMRO: 98625 · DOI: 10.48550/arxiv.2407.21531
  • Marinelli L, Lucht P (2024). A multimodal understanding of the role of sound and music in gendered toy marketing. PLOS ONE, 19(11). QMRO: 101429 · DOI: 10.1371/journal.pone.0311876
  • Steinmetz CJ, Singh S, Comunità M et al. (2024). ST-ITO: Controlling Audio Effects for Style Transfer with Inference-Time Optimization. QMRO: 98593 · DOI: 10.48550/arxiv.2410.21233
  • Williams A (2024). Using AI to Augment Creativity in Electronic Dance Music.
  • Bolt J, Pauwels J, Fazekas G (2024). Multi-Signal Informed Attention for Beat and Downbeat Detection. IEEE Internet of Sounds 2024. QMRO: 100025 · DOI: 10.1109/is262782.2024.10704128
  • Zheng S, Del Sette BM, Saitis C et al. (2024). Building Sketch-to-Sound Mapping with Unsupervised Feature Extraction and Interactive Machine Learning. NIME 2024. QMRO: 99264 · DOI: 10.5281/zenodo.13904959
  • Shier J, Saitis C, Robertson A et al. (2024). Real-time Timbre Remapping with Differentiable DSP. NIME 2024. QMRO: 101497
  • Yu C-Y, Mitcheltree C, Carson A et al. (2024). Differentiable All-pole Filters for Time-varying Audio Systems. DAFx 2024. QMRO: 97933
  • Yu C-Y, Fazekas G (2024). Differentiable Time-Varying Linear Prediction in the Context of End-to-End Analysis-by-Synthesis. Interspeech 2024. QMRO: 97836 · DOI: 10.21437/interspeech.2024-1187
  • Huang J, Benetos E (2024). Towards Building an End-to-End Multilingual Automatic Lyrics Transcription Model. EUSIPCO 2024. QMRO: 97337 · DOI: 10.23919/EUSIPCO63174.2024.10715045
  • Yuan R, Lin H, Wang Y et al. (2024). ChatMusician: Understanding and Generating Music Intrinsically with LLM. ACL 2024. QMRO: 97871 · DOI: 10.18653/v1/2024.findings-acl.373
  • Hamilton M, Clemente A, Hall E (2024). The Billboard Melodic Music Dataset (BiMMuDa). Transactions of the International Society for Music Information Retrieval, 7(1), 113–128. QMRO: 98991 · DOI: 10.5334/tismir.168
  • Weck B, Manco I, Benetos E et al. (2024). MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models. QMRO: 98705 · DOI: 10.48550/arxiv.2408.01337
  • Zhang H, Chowdhury S, Cancino-Chacón CE et al. (2024). DExter: Learning and Controlling Performance Expression with Diffusion Models. Applied Sciences, 14(15). QMRO: 101063 · DOI: 10.3390/app14156543
  • Morsi A, Zhang H, Maezawa A et al. (2024). Simulating Piano Performance Mistakes for Music Learning. Sound and Music Computing 2024. QMRO: 97746
  • Crocker R, Fazekas G (2024). Temporal Analysis of Emotion Perception in Film Music: Insights from the FME-24 Dataset. Sound and Music Computing 2024.
  • Riley J, Dixon S (2024). Reconstructing the Charlie Parker Omnibook using an audio-to-score automatic transcription pipeline. Sound and Music Computing 2024. QMRO: 97438
  • Ford C, Noel-Hirst A, Cardinale S et al. (2024). Reflection Across AI-based Music Composition. Creativity and Cognition 2024. QMRO: 97327 · DOI: 10.1145/3635636.3656185
  • Bryan-Kinns N, Ford C, Zheng S et al. (2024). Explainable AI for the Arts 2 (XAIxArts2). Creativity and Cognition 2024. DOI: 10.1145/3635636.3660763
  • Colton S, Bradshaw L, Banar B et al. (2024). Automatic Generation of Expressive Piano Miniatures. ICCC 2024. QMRO: 97861
  • Williams A, Tian H, Lattner S et al. (2024). Deep Learning-based Audio Representations for the Analysis and Visualisation of Electronic Dance Music DJ Mixes. AES International Symposium on AI and the Musician. QMRO: 104084
  • Graf M, Barthet M (2024). When XR Meets AI: Integrating Interactive Machine Learning with an XR Musical Instrument. AES International Symposium on AI and the Musician. QMRO: 97727
  • Jourdan T, Scurto H, Höök K (2024). First-person and second-person perspectives for ML in NIME. NIME 2024. QMRO: 97749
  • Li Y, Yuan R, Zhang G et al. (2024). MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training. ICLR 2024. QMRO: 95146
  • Luo Y-J, Ewert S, Dixon S (2024). Unsupervised Pitch-Timbre Disentanglement of Musical Instruments. ICASSP 2024. QMRO: 97925 · DOI: 10.1109/icassp48485.2024.10447564
  • Luo Y-J, Dixon S (2024). Posterior Variance-Parameterised Gaussian Dropout. ICASSP 2024. QMRO: 97926 · DOI: 10.1109/icassp48485.2024.10447835
  • Riley X, Edwards D, Dixon S (2024). High Resolution Guitar Transcription Via Domain Adaptation. ICASSP 2024. QMRO: 94701 · DOI: 10.1109/icassp48485.2024.10446182
  • Marinelli L, Saitis C (2024). Explainable Modeling of Gender-Targeting Practices in Toy Advertising Sound and Music. ICASSPW 2024. QMRO: 94522 · DOI: 10.1109/icasspw62465.2024.10669900
  • Pasini M, Grachten M (2024). Bass Accompaniment Generation via Latent Diffusion. ICASSP 2024. QMRO: 94512 · DOI: 10.48550/arXiv.2402.01412
  • Singh S, Steinmetz C, Benetos E et al. (2024). ATGNN: audio tagging graph neural network. IEEE Signal Processing Letters, 31, 825–829. QMRO: 93742 · DOI: 10.1109/LSP.2024.3352514
  • Hayes B, Shier J, Fazekas G et al. (2024). A review of differentiable digital signal processing for music and speech synthesis. Frontiers in Signal Processing, 3. QMRO: 93966 · DOI: 10.3389/frsip.2023.1284100
  • Bhandari K, Colton S (2024). Motifs, Phrases, and Beyond: The Modelling of Structure in Symbolic Music Generation. Lecture Notes in Computer Science, 14633, 33–51. QMRO: 94584 · DOI: 10.1007/978-3-031-56992-0_3

2023

  • Ford C, Cardinale S, Wiggins G (2023). Three Open Questions for the Design of AI for Music Composition. CHIME One Day Workshop 2023. QMRO: 94219
  • Yuan R, Ma Y, Li Y et al. (2023). MARBLE: Music Audio Representation Benchmark for Universal Evaluation. QMRO: 93083 · DOI: 10.48550/arxiv.2306.10548
  • Manco I, Weck B, Doh S et al. (2023). The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation. QMRO: 93119 · DOI: 10.48550/arxiv.2311.10057
  • Tang J, Wiggins G, Fazekas G (2023). Reconstructing Human Expressiveness in Piano Performances with a Transformer Network. CMMR 2023. QMRO: 89722
  • Graf M, Barthet M (2023). Combining Vision and EMG-Based Hand Tracking for Extended Reality Musical Instruments. CMMR 2023. QMRO: 91305
  • Zhang H, Karystinaios E, Dixon S et al. (2023). Symbolic Music Representations for Classification Tasks: A Systematic Evaluation. ISMIR 2023, 848–858. QMRO: 105611
  • Riley X, Dixon S, Riley J (2023). FiloBass: A Dataset and Corpus Based Study of Jazz Basslines. ISMIR 2023. QMRO: 91033
  • Marinelli L, Fazekas G (2023). Gender-Coded Sound: Analysing the Gendering of Music in Toy Commercials via Multi-Task Learning. ISMIR 2023. QMRO: 91180
  • Vahidi C, Mitcheltree C, Lostanlen V (2023). Kymatio: Deep Learning meets Wavelet Theory for Music Signal Processing. ISMIR 2023. QMRO: 94546
  • Martelloni A, McPherson AP, Barthet M (2023). Real-time Percussive Technique Recognition and Embedding Learning for the Acoustic Guitar. ISMIR 2023. QMRO: 89568
  • Yu C-Y, Fazekas G (2023). Singing Voice Synthesis Using Differentiable LPC and Glottal-Flow-Inspired Wavetables. ISMIR 2023. QMRO: 125531
  • Nolasco I, Singh S, Morfi V et al. (2023). Learning to detect an animal sound from five examples. Ecological Informatics, 77. QMRO: 90660 · DOI: 10.1016/j.ecoinf.2023.102258
  • Rice M, Steinmetz CJ, Fazekas G et al. (2023). General Purpose Audio Effect Removal. WASPAA 2023. QMRO: 91026 · DOI: 10.1109/waspaa58266.2023.10248157
  • Sarkar S, Thorpe L, Benetos E et al. (2023). Leveraging Synthetic Data for Improving Chamber Ensemble Separation. WASPAA 2023. QMRO: 89844 · DOI: 10.1109/waspaa58266.2023.10248118
  • Vahidi C, Singh S, Benetos E et al. (2023). Perceptual Musical Similarity Metric Learning with Graph Neural Networks. WASPAA 2023. QMRO: 90297 · DOI: 10.1109/waspaa58266.2023.10248151
  • Del Sette BM, Carnes D, Saitis C (2023). Sound of Care: Towards a Co-Operative AI Digital Pain Companion. CSCW 2023. QMRO: 92243 · DOI: 10.1145/3584931.3606971
  • Cardinale S, Cook M, Colton S (2023). AI-Driven Sonification of Automatically Designed Games. AIIDE 2023.
  • Williams A, Lattner S, Barthet M (2023). Sound-and-Image-informed Music Artwork Generation Using Text-to-Image Models. MuRS Workshop at RecSys 2023. QMRO: 91539
  • McIntosh TH, Woscholski O (2023). Affective Conditional Modifiers in Adaptive Video Game Music. Audio Mostly 2023. QMRO: 90808 · DOI: 10.1145/3616195.3616222
  • Caspe F, McPherson A, Sandler M (2023). FM Tone Transfer with Envelope Learning. Audio Mostly 2023. QMRO: 90812 · DOI: 10.1145/3616195.3616196
  • Bolt J, Fazekas G (2023). Supervised Contrastive Learning For Musical Onset Detection. Audio Mostly 2023. DOI: 10.1145/3616195.3616215
  • Wolstanholme L, Vahidi C, McPherson A (2023). Hearing from Within a Sound. AES International Conference on Spatial and Immersive Audio. QMRO: 90663
  • Ma Y, Yuan R, Li Y et al. (2023). On the Effectiveness of Speech Self-supervised Learning for Music. QMRO: 90410 · DOI: 10.48550/arxiv.2307.05161
  • Loth J, Mamou-Mani A (2023). Playing Style Affects Steel-String Acoustic Guitar Timbre. 3rd International Conference on Timbre. QMRO: 89318
  • Banar B, Bryan-Kinns N (2023). A Tool for Generating Controllable Variations of Musical Themes. AAAI 2023, 37(13), 16401–16403. QMRO: 89922 · DOI: 10.1609/aaai.v37i13.27059
  • Bryan-Kinns N, Ford C, Chamberlain A et al. (2023). Explainable AI for the Arts: XAIxArts. Creativity and Cognition 2023. DOI: 10.1145/3591196.3593517
  • Cardinale S, Colton S (2023). Neuro-Symbolic Composition of Music with Talking Points. ICCC 2023. QMRO: 102963
  • Riley X, Dixon S, Riley J (2023). CREPE NOTES: A New Method for Segmenting Pitch Contours into Discrete Notes. Sound and Music Computing 2023. QMRO: 89144
  • Yu C-Y, Yeh S-L, Fazekas G et al. (2023). Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution. ICASSP 2023. QMRO: 105032 · DOI: 10.1109/icassp49357.2023.10095103
  • Zhang H, Dixon S (2023). Disentangling the Horowitz Factor. ICASSP 2023. QMRO: 88841 · DOI: 10.1109/icassp49357.2023.10095009
  • Comunità M, Steinmetz CJ, Phan H et al. (2023). Modelling Black-Box Audio Effects with Time-Varying Feature Modulation. ICASSP 2023. QMRO: 90398 · DOI: 10.1109/icassp49357.2023.10097173
  • Diaz R, Hayes B, Saitis C et al. (2023). Rigid-Body Sound Synthesis with Differentiable Modal Resonators. ICASSP 2023. QMRO: 88329 · DOI: 10.1109/icassp49357.2023.10095139
  • Hayes B, Saitis C (2023). Sinusoidal Frequency Estimation by Gradient Descent. ICASSP 2023. QMRO: 94885 · DOI: 10.1109/icassp49357.2023.10095188
  • Diaz Fernandez R, Saitis C, Sandler M (2023). Interactive Neural Resonators. NIME 2023. QMRO: 97841 · DOI: 10.5281/zenodo.11189296
  • Pelinski Ramos T, Diaz Fernandez R, Benito Temprano AL et al. (2023). Pipeline for recording datasets and running neural networks on the Bela embedded hardware platform. NIME 2023. QMRO: 88693 · DOI: 10.5281/zenodo.11189141
  • Mitcheltree C, Steinmetz CJ, Comunità M et al. (2023). Modulation Extraction for LFO-driven Audio Effects. DAFx 2023. QMRO: 88476
  • Vanka S, Safi M, Rolland J-B et al. (2023). Adoption of AI Technology in the Music Mixing Workflow. AES Europe Convention 2023. QMRO: 86042
  • Ford C, Bryan-Kinns N (2023). Towards a Reflection in Creative Experience Questionnaire. CHI 2023. QMRO: 84900 · DOI: 10.1145/3544548.3581077
  • Steinmetz C, Hawley S (2023). Leveraging Neural Representations for Audio Manipulation. 154th AES Convention. QMRO: 91027
  • Vahidi C, Han H, Wang C et al. (2023). Mesostructures: Beyond Spectrogram Loss in Differentiable Time-Frequency Analysis. QMRO: 88712 · DOI: 10.48550/arxiv.2301.10183
  • Edwards D, Dixon S, Benetos E (2023). PiJAMA: Piano Jazz with Automatic MIDI Annotations. Transactions of the International Society for Music Information Retrieval, 6(1), 89–102. QMRO: 91025 · DOI: 10.5334/tismir.162

2021

  • Grechin S, Banar B, Hayes B et al. (2021). DMRN+16: Digital Music Research Network One-day Workshop 2021. QMRO: 76887
  • Steinmetz C, Reiss J (2021). Steerable discovery of neural audio effects. NeurIPS ML for Creativity and Design Workshop. QMRO: 80282
  • Bryan-Kinns N, Banar B, Ford C et al. (2021). Exploring XAI for the Arts: Explaining Latent Space in Generative Music. NeurIPS 2021 XAI Workshop. QMRO: 77565
  • Liu L, Morfi V, Benetos E (2021). ACPAS: A Dataset of Aligned Classical Piano Audio and Scores. ISMIR 2021 Late-Breaking Demo. QMRO: 79136
  • Comunità M, Stowell D, Reiss JD (2021). Guitar Effects Recognition and Parameter Estimation With Convolutional Neural Networks. Journal of the Audio Engineering Society, 69(7/8), 594–604. QMRO: 75968 · DOI: 10.17743/jaes.2021.0019
  • Foster D, Dixon S (2021). Filosax: A Dataset of Annotated Jazz Saxophone Recordings. ISMIR 2021. QMRO: 75794
  • Sarmento P, Carr CJ, Zukowski Z et al. (2021). DadaGP: A Dataset of Tokenized GuitarPro Songs for Sequence Models. ISMIR 2021. QMRO: 75792
  • Hayes B, Saitis C (2021). Neural Waveshaping Synthesis. ISMIR 2021. QMRO: 73126
  • Steinmetz CJ, Reiss JD (2021). WaveBeat: End-to-end beat and downbeat tracking in the time domain. 151st AES Convention. QMRO: 74638
  • Benito Temprano AL, McPherson A (2021). A TMR Angle Sensor for Gesture Acquisition and Disambiguation on the Electric Guitar. Audio Mostly 2021. QMRO: 74325 · DOI: 10.1145/3478384.3478427
  • Sarkar S, Benetos E, Sandler M (2021). Vocal Harmony Separation Using Time-Domain Neural Networks. Interspeech 2021. QMRO: 72826 · DOI: 10.21437/interspeech.2021-1531
  • Shatri E (2021). DoReMi: First glance at a universal OMR dataset. 3rd International Workshop on Reading Music Systems. DOI: 10.48550/arXiv.2212.00378
  • Vahidi C, Saitis C (2021). A Modulation Front-End for Music Audio Tagging. IJCNN 2021. QMRO: 72389 · DOI: 10.1109/ijcnn52387.2021.9533547
  • Graf M, Opara HC, Barthet M (2021). An Audio-Driven System for Real-Time Music Visualisation. 150th AES Convention. QMRO: 72837
  • Banar B, Colton S (2021). Generating Music with Extreme Passages using GPT-2. Evo* 2021. QMRO: 73210
  • Ford C, Bryan-Kinns N, Nash C (2021). Creativity in Children’s Digital Music Composition. NIME 2021. QMRO: 72722 · DOI: 10.21428/92fbeb44.e83deee9
  • Liu L, Morfi V, Benetos E (2021). Joint Multi-Pitch Detection and Score Transcription for Polyphonic Piano Music. ICASSP 2021. QMRO: 70432 · DOI: 10.1109/icassp39728.2021.9413601
  • Singh S, Bear H, Benetos E (2021). Prototypical Networks for Domain Adaptation in Acoustic Scene Classification. ICASSP 2021. QMRO: 70431 · DOI: 10.1109/ICASSP39728.2021.9414876
  • Zhang Y, Xia G, Levy M et al. (2021). COSMIC: A Conversational Interface for Human-AI Music Co-Creation. NIME 2021. QMRO: 73471 · DOI: 10.21428/92fbeb44.110a7a32
  • Manco I, Benetos E, Fazekas G (2021). MusCaps: Generating Captions for Music Audio. QMRO: 72068 · DOI: 10.48550/arxiv.2104.11984
  • Steinmetz CJ, Reiss JD (2021). Pyloudnorm: A simple yet flexible loudness meter in Python. 150th AES Convention. QMRO: 80278
  • Liu L, Benetos E (2021). From Audio to Music Notation. Handbook of Artificial Intelligence for Music. Springer Nature. QMRO: 73211 · DOI: 10.1007/978-3-030-72116-9_24

2020

  • Liu L, Morfi G-V, Benetos E (2020). Joint Piano-roll and Score Transcription for Polyphonic Piano Music. DMRN+15. QMRO: 70433
  • Steinmetz CJ, Reiss JD (2020). Randomized Overdrive Neural Networks. 4th Workshop on Machine Learning for Creativity and Design at NeurIPS 2020. QMRO: 80279
  • Sarmento P, Holmqvist O (2020). Musical Smart City: Perspectives on Ubiquitous Sonification. Ubiquitous Music Workshop. QMRO: 65398
  • Zacharakis A, Hayes B, Saitis C (2020). Evidence for timbre space robustness to an uncontrolled online stimulus presentation. Timbre 2020, 129–132. QMRO: 69763
  • Hayes B, Saitis C (2020). There’s more to timbre than musical instruments: semantic dimensions of FM sounds. Timbre 2020, 69–72. QMRO: 69762
  • Vahidi C, Fazekas G, Saitis C (2020). Timbre Space Representation of a Subtractive Synthesizer. Timbre 2020. QMRO: 67583
  • Martelloni A, McPherson A, Barthet M (2020). Percussive Fingerstyle Guitar through the Lens of NIME: an Interview Study. NIME 2020. QMRO: 66940
  • Turchet L, Fazekas G, Lagrange M et al. (2020). The Internet of Audio Things: State of the Art, Vision, and Challenges. IEEE Internet of Things Journal, 7(10), 10233–10249. QMRO: 68012 · DOI: 10.1109/jiot.2020.2997047
  • Shatri E, Fazekas G (2020). Optical Music Recognition: State of the Art and Major Challenges. International Conference on Technologies for Music Notation and Representation. QMRO: 67582
  • Ycart A, Liu L, Benetos E et al. (2020). Investigating the Perceptual Validity of Evaluation Metrics for Automatic Piano Music Transcription. Transactions of the International Society for Music Information Retrieval, 3(1), 68–81. QMRO: 65069 · DOI: 10.5334/tismir.57