| Issue | Acta Acust., Volume 7, 2023 |
|---|---|
| Article Number | 62 |
| Number of page(s) | 14 |
| Section | Musical Acoustics |
| DOI | https://doi.org/10.1051/aacus/2023050 |
| Published online | 30 November 2023 |
Scientific Article
Classification of the perceptual impression of source-level blending between violins in a joint performance
1 Erich Thienhaus Institute, Detmold University of Music, Detmold, Germany
2 Singapore University of Technology and Design, Singapore
* Corresponding author: malte.kob@hfm-detmold.de
Received: 17 May 2023
Accepted: 18 September 2023
Quantifying the auditory perception of blending between sound sources is a relevant topic in music perception, but it remains poorly explored owing to its complex, multidimensional nature. Previous studies explained source-level blending in musically constrained sound samples, but comprehensive modelling of blending perception with musically realistic samples was beyond their scope. Combining methods from Music Information Retrieval (MIR) and Machine Learning (ML), this investigation classifies sound samples from real musical scenarios, spanning different musical excerpts, according to their overall source-level blending impression.
Monophonically rendered samples of two violins playing in unison, extracted from in-situ close-microphone recordings of an ensemble performance, were perceptually evaluated and labelled as blended or non-blended by a group of expert listeners. Mel Frequency Cepstral Coefficients (MFCCs) were extracted, and a classification model was developed using linear and non-linear feature transformations adapted from dimensionality reduction strategies, namely Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and t-distributed Stochastic Neighbour Embedding (t-SNE), paired with the Euclidean distance as a metric for the similarity of the transformed feature clusters. LDA-transformed raw MFCCs, trained and validated with both a separate train-test split and Leave-One-Out Cross-Validation (LOOCV), classified the samples into blended and non-blended classes with accuracies of 87.5% and 87.1%, respectively. The proposed classification model, which incorporates "ecological", score-independent sound samples without requiring access to individual source recordings, thereby advances the holistic modelling of blending.
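For readers who want to prototype a comparable pipeline, a minimal sketch follows, assuming librosa and scikit-learn. The MFCC settings, the nearest-centroid decision rule, and the synthetic stand-in data are illustrative assumptions, not the authors' implementation; the study's actual recordings and labels are not public.

```python
# Sketch of the pipeline described in the abstract: MFCC features ->
# LDA transformation -> Euclidean-distance decision between class
# clusters, evaluated with Leave-One-Out Cross-Validation (LOOCV).
import numpy as np
import librosa
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import LeaveOneOut, cross_val_score

def mfcc_features(path, n_mfcc=13):
    """Load one mono sample and summarise its MFCCs by their time average.
    n_mfcc=13 is a common default, not the paper's reported setting."""
    y, sr = librosa.load(path, sr=None, mono=True)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

# Synthetic stand-in data so the sketch runs without the recordings:
# 40 random 13-dimensional "MFCC" vectors with binary expert labels
# (1 = blended, 0 = non-blended). Replace with mfcc_features(...) per file.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 13))
y = rng.integers(0, 2, size=40)

# LDA projects the MFCCs onto the axis that best separates the two
# classes; NearestCentroid with its default Euclidean metric then assigns
# each sample to the nearer class centroid in the transformed space.
clf = make_pipeline(LinearDiscriminantAnalysis(), NearestCentroid())

# LOOCV: train on all samples but one, test on the held-out sample,
# and average the per-fold accuracy.
scores = cross_val_score(clf, X, y, cv=LeaveOneOut())
print(f"LOOCV accuracy: {scores.mean():.1%}")
```

The nearest-centroid step stands in for the abstract's Euclidean-distance comparison of transformed feature clusters; the paper's exact decision rule may differ.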
Key words: Musical blending / MIR / MFCC / Dimensionality reduction / LDA / Music perception
© The Author(s), Published by EDP Sciences, 2023
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.