| Issue | Acta Acust., Volume 9, 2025 |
|---|---|
| Article Number | 17 |
| Number of page(s) | 14 |
| Section | Auditory Quality of Systems |
| DOI | https://doi.org/10.1051/aacus/2024090 |
| Published online | 03 March 2025 |
Audio Article
Loudness matching of complex tones simulating sounds from electric trucks
Engineering Acoustics, Luleå University of Technology, 97187 Luleå, Sweden
* Corresponding author: birgitta.nyman@associated.ltu.se;
birgitta.nyman@scania.com
Received: 24 May 2024
Accepted: 16 December 2024
With electric powertrains quickly advancing in the heavy vehicle sector, there is an increasing interest in the industry to find a general practice for evaluating tonal sounds. The challenge to set requirements is complex. Tonal sounds span from extremely annoying to pleasant. Established methods for prediction of tonal magnitude typically estimate individual tonal components without considering interrelations between the tones. In this study, the loudness perception of continuous complex tones with increasing number of harmonics as well as non-harmonic tone components, is assessed using pink noise as reference. Frequencies studied cover 350–11 000 Hz. These frequencies typically occur in electrified trucks, hitting the most sensitive area of the human hearing. The results show a statistically significant positive linear relationship between perceived loudness and increasing number of harmonics, even with decreasing level of amplitude (−6 dB/oct). Significant differences are seen between harmonic and non-harmonic tonal signals, when the second partial is detuned. Increasing the number of tonal components increases the perceived loudness linearly. Non-harmonic complex tonal sounds are assessed less loud than the corresponding harmonic sounds. In case of complex tonal sounds, models of loudness estimation need to take the number of tone components and their frequency ratios into account.
Key words: Tonal content / Perception of complex tones / Loudness estimation / Electric vehicles
© The Author(s), Published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abbreviations
AFC: alternative forced choice
MOTC: perceived magnitude of tonal content
PSE: point of subjective equality
1 Introduction
With electric powertrains quickly advancing on the market, there is an increasing interest in finding a general practice for evaluation and target setting of the novel sounds that these vehicles produce. A typical feature is tonal sounds. So far, many studies have been conducted on passenger cars, but fewer on heavy vehicles. Truck sounds include several tone-generating components in addition to the electric powertrain. Fans and pumps belong to this category, and they often generate more or less continuous sounds with prominent complex tones. Together with other sounds produced simultaneously, they present a complex symphony, where the components are not necessarily harmonically related to each other (see Figure 1). The aim of this study is to understand the perception of loudness of complex tones, including high-frequency content, and how it is affected by harmonicity, in order to eventually set requirements for components generating these kinds of sounds and ensure good sound quality both outside and inside the vehicle.
Figure 1 Frequency analysis of external noise recording of an electric truck at close range, driving at 20 km/h. The sound contains several strong tonal components (Audio_file_1.mp3).
1.1 Established methods
Psychoacoustic models evaluating the audibility of tonal sounds are often based on the concept of critical bandwidth [1]. Grossly simplified, there are two different approaches: either each single tone component (peak) is evaluated one by one [2–4], or the total tonal content is evaluated with a holistic approach, where the identified tones are weighted together [5, 6].
These methods can be divided into three different categories:
- (a) The tone vs. background noise in the same critical band [2–4].
- (b) The level of the critical band containing the tone compared to the levels of the two adjacent critical bands [2, 7].
- (c) The evaluation of the total tonal content compared to the background level [5, 6, 8].
In summary, established methods do not consider the unison of different tone components that occurs when listening to a complex tone; the frequency relation between the tone components is not taken into account. However, human hearing evaluates tone components simultaneously and is affected by their interrelation [6, 9].
1.2 Characterization of the perception of tonal sounds
A “tone” often refers to a peak in the frequency domain and its energy within a critical bandwidth [10], with the frequency of the peak as its center frequency. The term “tone” is also used to describe complex tones, which consist of several components with a common fundamental frequency deciding the pitch [9]. Periodic complex tones are generally perceived as a single unit characterized by pitch and timbre, not as separate components [11]. This holds true unless one component is much stronger than the others, widely separated from them, or separately modulated [9, 12]. Human hearing analyses complex tones in two steps: frequency determination of the components and “pitch pattern recognition”, which means finding the best-fitting fundamental of the determined components [13]. The model for “pitch pattern recognition” suggests that we learn from an early age that pure harmonic combinations belong together. Terhardt [14] argues that it is a by-product of speech perception. According to Terhardt [13], we store harmonic templates for different harmonic spectra in our memory. Another model for this concept is based on neural firing patterns or periodicity of the waveform. Frequency is represented in the activity across neurons with different frequency characteristics in the cochlea. Harmonics are synchronized and, therefore, will follow the same timing. Non-harmonic components will be asynchronous and thereby segregated from the rest [12, 15].
Several different terms are used to describe the perception of tonal sounds, among them tonalness, tonality, pitch strength, and magnitude of tonal content. What makes this area even more complicated is that different authors use the terms in different ways, causing conceptual confusion. According to Hots et al. [16], “tonality is defined as perception of tonal components in noise”. This is a conventional description, although some investigations focusing on the perception of tonal sounds differentiate between the sensation of tonality strength [16, 17] and the sensation of loudness of the tonal content [16, 18]. According to Hansen and Weber [18], magnitude of the tonal content is related to the sensation of perceived loudness or intensity of the tonal portion presented in noise. The same term is used by Doleschal et al. [17] to describe how tonal a sound is perceived to be. A hypothesis is that the sensation of tonality and the sensation of loudness of the tonal content correlate [16–19]. Tonalness [5] or pitch strength [20] is defined as the sensation of tonal strength of the whole sound, where a pure tone has a very strong pitch strength and noise a faint or no pitch strength.
1.3 Estimating magnitude of tonal content
It has long been known that the loudness perception of complex tones depends on the frequency spacing between the components [21, 22]. Comparing a single tone at 1000 Hz to a four-tone complex (equally intense components) centered around 1000 Hz and having the same SPL as the single tone shows that when the tone complex spans more than the critical bandwidth centered at 1000 Hz, the perceived loudness increases even though the sound pressure level remains the same [21]. Zwicker et al. also showed that “uniform spacing produces greater loudness than nonuniform spacing.” In practice, uniform spacing results in harmonically related components and nonuniform spacing in non-harmonically related components. Scharf [22] showed that tone complexes with a spectral width larger than the critical bandwidth and a flat spectrum are perceived as louder than tone complexes with a non-flat spectrum. The results of Zwicker et al. [21] and Scharf [22] showed that loudness perception of complex tones depends on harmonicity, the frequency spacing between the tone components, and the shape of the spectral envelope. However, more research is needed to better understand how different parameters, for example the fundamental frequency, the number of tone components, amplitudes, and harmonicity, affect loudness perception of complex tones.
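The critical-band argument can be illustrated with a minimal sketch. It uses the analytic critical-bandwidth approximation of Zwicker and Terhardt, which is an assumption introduced here and is not stated in the studies cited above. The component frequencies are hypothetical examples, not the stimuli of [21] or [22].

```python
import math

def critical_bandwidth_hz(fc_hz: float) -> float:
    """Approximate critical bandwidth in Hz at centre frequency fc_hz
    (Zwicker and Terhardt analytic approximation; assumed here)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (fc_hz / 1000.0) ** 2) ** 0.69

def exceeds_critical_band(freqs_hz, fc_hz: float) -> bool:
    """True if the span of the components is wider than the critical band at fc_hz."""
    return (max(freqs_hz) - min(freqs_hz)) > critical_bandwidth_hz(fc_hz)

# Hypothetical four-tone complexes centred around 1000 Hz.
narrow_complex = [970.0, 990.0, 1010.0, 1030.0]   # 60 Hz span
wide_complex = [700.0, 900.0, 1100.0, 1300.0]     # 600 Hz span

print(round(critical_bandwidth_hz(1000.0), 1))
print(exceeds_critical_band(narrow_complex, 1000.0))
print(exceeds_critical_band(wide_complex, 1000.0))
```

Under this approximation, only the wide complex spans more than one critical band, which is the condition under which [21] reports increased loudness at constant SPL.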
More recent studies on the subject of tonal sounds include quantification of the perception of dynamic tonal sounds. By synthesizing different driving conditions, Doleschal et al. [17] created sounds resembling real recordings from electric vehicles by changing level, background noise, and frequency. Dynamic two-component complex tonal sounds were presented as linear sweeps and the absolute magnitude estimation of tonality (rating: not tonal – extremely tonal) was measured. The authors call this “Perceived magnitude of tonal content” (MOTC). The results show that the presence of two harmonics (or in this case, two successive engine orders) increases the sensation of MOTC. A follow-up study by Doleschal et al. [23] included rating of pleasantness in addition to tonal content perception. Adding subharmonics and a low background noise level increased the pleasantness of the sound. The most tonal sound was also rated as the most pleasant, which contradicts the general assumption that highly tonal sounds by default evoke annoyance. Adding an overtone to the subharmonic cases, as well as increasing the background level, lowered the pleasantness. The condition with an unbroken sequence of five added subharmonics, with the level of harmonics decreasing by 6 dB per octave, was rated as most pleasant. One explanation is that this condition also resulted in the lowest sense of pitch.
Vormann et al. [24] showed that the perceived magnitude of the tonal content, presented in uniform exciting noise (UEN), increases with increasing number of harmonics using the frequency 700 Hz as the pure tone reference as well as fundamental frequency of the complex tone. For every doubling of the number of equal level tone components, the level of the components was decreased by 3 dB to keep the overall level constant. Hansen & Weber [18] compared a single-component tone at 700 Hz to a two-tone complex with frequency components at 650 Hz and 750 Hz (all presented in white background noise), asking the test participants to: (1) choose the test stimulus having “the higher magnitude of tonal content” and (2) “choose the test stimulus containing the louder tonal part”. The participants declared it was easier to answer number 2, estimating the loudness, but there was no significant difference in the outcome.
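The 3 dB reduction per doubling described for Vormann et al. [24] follows from power summation of the components. The sketch below assumes that the components add incoherently (as powers); the 60 dB target level is a hypothetical value chosen only for illustration.

```python
import math

def component_level_db(target_db: float, n_components: int) -> float:
    """Level each of n equal-level components needs so that their power sum equals target_db."""
    return target_db - 10.0 * math.log10(n_components)

def power_sum_db(levels_db) -> float:
    """Overall level of components summed as powers (incoherent summation assumed)."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))

target = 60.0  # hypothetical overall level in dB
for n in (1, 2, 4):
    per_component = component_level_db(target, n)
    print(n, round(per_component, 2), round(power_sum_db([per_component] * n), 2))
```

Each doubling lowers the per-component level by 10·log10(2), about 3.01 dB, while the overall level stays at the target.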
Hots et al. [16] measured the subjective estimates of masked threshold, tonality, and loudness of tonal content of complex harmonic tones masked by UEN [10]. They used an alternative forced choice (AFC) procedure, first asking which of the sounds had the “higher loudness of tonal content” and then asking which stimulus had the “higher tonality”. The comparison was made between a reference sound and either a pure tone (175 Hz, 350 Hz, 700 Hz, or 1400 Hz) or a harmonic complex tone (350 Hz and 700 Hz or 175 Hz, 350 Hz, 700 Hz, and 1400 Hz), all presented with the same signal-to-noise ratio. The reference was a pure tone (700 Hz) presented at a fixed level over the broadband noise. The results of normal-hearing participants were compared with those of hearing-impaired listeners. The absolute threshold differed between these two groups, but the tonality and loudness estimates did not. Hots et al. showed that both magnitude of tonality and loudness of tonal content depend on the level above masking threshold and increase with increasing number of harmonics.
In a test examining loudness perception, where the masking of a single tone at 986 Hz was released using modulations of the background noise or binaural masking level difference, Verhey & Heise [19] found that tonalness and loudness of tonal content are linked. They claimed that the magnitude of tonal content can be measured by using loudness models and presented the hypothesis that perception of loudness depends on the partial loudness of the tonal content, which is supported by Hansen & Weber [18].
The impact of non-harmonicity on tonal loudness perception is less investigated. To investigate the impact on tonality, Vormann et al. [25] compared harmonic to non-harmonic tone complexes in a pilot study. They compared complex two-tones presented in UEN, based on the fundamental frequency 700 Hz, using a 2-AFC procedure, but did not see any significant difference.
The question of perception and detectability of complex tones has been studied extensively. Despite this, there are no general conclusions on how to evaluate these kinds of sounds. Previous studies on the subject mainly focus on matching magnitude between two tonal sounds presented in noise, often in the vocal range [16, 24–26].
The aim of the present study is to further elaborate on this topic within the context of electrified propulsion. The main objective is to measure the sensation of loudness for sounds containing tonal and broadband noise generated by electrified trucks, where the frequency content includes the most sensitive area of human hearing.
This experiment investigates the perceived loudness of stationary harmonic as well as non-harmonic complex tones to:
- (1) Study how the number of components of a stationary complex tone affects perceived loudness of the tonal content, using different fundamental frequencies and a pink noise reference.
- (2) Investigate if harmonicity has an impact on loudness perception by detuning one component.
- (3) Compare two different matching methods (A and B) where the adjustable and fixed stimuli are switched.
The first hypothesis is that the subjective estimation of the loudness of a sound with complex tone content is made holistically. Therefore, the perceived loudness will increase with added harmonics to a greater extent than the spectral summation predicts. The second hypothesis is that if the complex tone is harmonic (the components are multiples of the fundamental), the perceived loudness will change if one component is detuned (creating a non-harmonic case).
2 Experiment
2.1 Method
Using the method of adjustment, a listening test of pairwise loudness matching of two different sounds was conducted. The subjects were instructed in writing on a computer screen as the test proceeded, see Figure 2. In addition, the test leader orally informed the subjects before the test about the task, the number of stimuli, and the expected completion time. The subjects were asked to match the perceived loudness of a reference sound to test sounds with a fixed level. The written instruction read “In this test you will compare tonal sounds to a broad band noise. The task is to adjust the volume of the test sound so that you perceive equal loudness compared to the reference sound”. Written instructions presented on a screen informed the participant which sound they listened to, test or reference, and which to adjust.
Figure 2 Instruction window presented to the participants on the screen during the test.
The sound level was adjusted with the scroll wheel of a computer mouse, changing the overall amplitude by 1 dB for each scroll step, see Figure 3. The subjects could switch between the test and reference sound as frequently as desired before finalizing the set equal loudness level. In total, 40 stimuli were assessed in a semi-randomized order. After the test was finished, a short interview was held about the strategy used for decision-making, hearing problems, and general feedback on the test. Information about age, gender, hearing status and listening experience was self-reported in the beginning on a digital form. Before taking the test, the participants had a training session to get used to the test setup. In total, the test took about 45 min.
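The adjustment step can be sketched as a simple gain calculation. The 1 dB step is taken from the text above, and the −10 dB starting offset from the stimulus description in Section 2.2. The function names and the sign convention for scroll direction are hypothetical and do not describe the authors' MATLAB implementation.

```python
def adjusted_gain_db(scroll_steps, start_db: float = -10.0, step_db: float = 1.0) -> float:
    """Gain in dB after a sequence of scroll steps (+1 = one step up, -1 = one step down)."""
    return start_db + step_db * sum(scroll_steps)

def db_to_amplitude(gain_db: float) -> float:
    """Linear amplitude factor corresponding to a gain in dB."""
    return 10.0 ** (gain_db / 20.0)

# Hypothetical session: twelve steps up followed by two steps down.
steps = [1] * 12 + [-1] * 2
gain = adjusted_gain_db(steps)
print(gain, db_to_amplitude(gain))
```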
Figure 3 Schematic description of the test process.
2.2 Experimental design and stimuli
The experiment was divided into three parts (Test 1–3).
Test 1 was a full 2⁴ factorial design consisting of harmonic sound samples. All complex tone signals were mixed with a stationary background noise and compared to a level-adjustable noise (matching method A), see Figure 4. The test stimuli comprised four different samples with one to four harmonic partials. The reference signal consisted of pink noise. The complex tonal stimuli were presented at a constant sound pressure level (SPL). The level of the pink noise was adjusted to be perceived as equally loud as the tone stimulus (method A). Four groups of tone stimuli with different fundamental frequencies were presented for each test, altogether 16 stimuli, see Table 1.
Figure 4 A schematic figure of the test set-up in Tests 1 and 2 (method A). To simplify the schematic representation, the noise background for the tonal signal is not shown.
Table 1 Test setup. All test signals have added stationary background noise.
Tests 2 and 3 consisted of a full 2³ factorial design. Test 2 used the same matching method as Test 1. Test 3 used the same sound samples as Test 2, but with a flipped matching method (method B), see Figure 5. The subjects were now asked to adjust the level of the complex tone stimulus instead, to match the perceived loudness of the pink noise presented at a constant SPL. The background noise mixed with the complex tone was kept at the same constant level, i.e., only the SPL of the tone components was adjusted.
Figure 5 A schematic figure of Test 3, method B. The dashed (red) component in the figure indicates that the component is shifted upwards in frequency, creating a non-harmonic complex tone. To simplify the schematic representation, the stationary noise background is not shown for the tonal signal.
The background noise added to all the tonal samples consisted of an arbitrary magnitude filtered white noise. Frequency points and amplitudes were chosen to generate the profile of the frequency spectrum for a typical background noise of an electric truck driving at constant speed. The overall level of the background noise was adjusted to give a comfortable listening experience without masking the complex tone, see Figure 6.
Figure 6 The figure presents the relation of the sound stimuli at the start of the different listening tests. Left: method A – The initial level of the fixed stationary complex four-partial tonal sound with 350 Hz fundamental frequency mixed with background noise (solid), and the adjustable pink noise (dashed). Right: method B – The initial level of the adjustable complex four-partial tonal sound with 350 Hz fundamental frequency mixed with the constant background noise (solid), and the stationary pink noise (dashed).
Stimuli with four different fundamental frequencies were created (350 Hz, 950 Hz, 1750 Hz and 2750 Hz). The frequencies were chosen to resemble typical continuous complex tones generated by common components in electric trucks, such as fans, pumps, and electric machines. The presentation order of the fundamental frequencies was randomized. The stimuli in each frequency group were presented in a counterbalanced order based on a Latin Square design.
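One common way to build such a counterbalanced order is a cyclic Latin square, sketched below. The text does not specify which Latin square was used, so the cyclic construction and the function name are assumptions for illustration only.

```python
def cyclic_latin_square(conditions):
    """Return an n x n cyclic Latin square: each condition appears once per row and column."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]

# Hypothetical use: the four partial counts within one frequency group of Test 1.
square = cyclic_latin_square([1, 2, 3, 4])
for participant_index, order in enumerate(square):
    print(participant_index, order)
```

Successive participants would be assigned successive rows. A cyclic square balances the position of each condition but not which condition precedes which, so a study that is sensitive to carry-over effects might prefer a balanced (Williams) design.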
The stimuli were generated as looped continuous sounds, see Figure 3. The tone components were presented with amplitudes falling at −6 dB/octave. This slope has been used in previous studies [17, 23] as representative of harmonic levels in electric machine sounds. The adjustable sound was initially presented at −10 dB overall level relative to the reference level (amplification 0 dB) for that stimulus. Audio files for each signal are found in Appendix A.
The harmonic stimuli consisted of a fundamental frequency (f0) and 0–3 harmonics (f1–f3). The non-harmonic stimuli were based on the harmonic four-partial stimulus where one component was shifted upward in frequency (f1 or f2). The frequency shift was chosen as small as possible without causing roughness or other modulation-based phenomena (f1 = 2.3 * f0 and f2 = 3.4 * f0).
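The partial structure described above can be written out as a minimal sketch. Several details are assumptions not stated in the text: the shifted component keeps the level of its nominal harmonic position, all partials start at zero phase, and the sample rate and duration are arbitrary. The function names are hypothetical and this is not the authors' MATLAB code.

```python
import math

HARMONIC = (1.0, 2.0, 3.0, 4.0)        # f0 and three harmonics
SECOND_SHIFTED = (1.0, 2.3, 3.0, 4.0)  # f1 = 2.3 * f0
THIRD_SHIFTED = (1.0, 2.0, 3.4, 4.0)   # f2 = 3.4 * f0

def partial_table(f0_hz, ratios, slope_db_per_octave=-6.0):
    """Return (frequency_hz, level_db, amplitude) per partial, levels relative to f0.

    Assumption: the level follows the slope at the nominal harmonic number k,
    so a shifted partial keeps the level of the harmonic it replaces.
    """
    table = []
    for k, ratio in enumerate(ratios, start=1):
        level_db = slope_db_per_octave * math.log2(k)
        table.append((ratio * f0_hz, level_db, 10.0 ** (level_db / 20.0)))
    return table

def synthesize(table, duration_s=0.01, fs_hz=44100):
    """Sum of sine partials, all starting at zero phase (assumed)."""
    n_samples = int(round(duration_s * fs_hz))
    return [
        sum(amp * math.sin(2.0 * math.pi * freq * i / fs_hz) for freq, _, amp in table)
        for i in range(n_samples)
    ]

for freq, level, _ in partial_table(2750.0, HARMONIC):
    print(freq, round(level, 2))
```

With the highest fundamental of 2750 Hz, the fourth partial lies at 11 000 Hz, the upper end of the frequency range stated in the abstract.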
The tone-to-noise ratio for the different tonal components in the fixed tonal sound stimuli was analyzed with HEAD Acoustics ArtemiS SUITE v15.0 according to ECMA 418-1 [2] and is presented in Tables 2 and 3.
Table 2 Tone-to-noise ratio for the fixed four-partial stimuli for each tone component in Test 1.
Table 3 Tone-to-noise ratio for the shifted components in the fixed four-partial stimuli in Test 2.
All procedures were performed in compliance with relevant laws and institutional guidelines. Informed consent was obtained from all participants before the test. Until the test period was completed the participants could withdraw their participation with no requirement for justification.
2.3 Participants
Twenty-four participants (5 female and 19 male) with self-reported normal hearing participated in the test. The average age of the participants was 39.5 (σ = 10.2) years. All subjects were employed at Scania R&D apart from two PhD students from Luleå University of Technology. Eleven participants (46%) answered that listening was a part of their profession.
2.4 Apparatus
The stimuli were generated via MATLAB, run on a computer with an external soundcard (Behringer U-control UCA222), and presented diotically via headphones (Sennheiser HD 650). To calibrate the sound files, the output signal levels were measured with the headphones placed on an artificial head (HEAD acoustics HML 1.0/HMS III electric) and analyzed with the software HEAD ArtemiS SUITE v14.1.
The majority of the participants (22) were seated in a well-isolated and acoustically treated room (the control room for the listening studio at Scania in Södertälje). The background noise level was below 26 dBA. Two students were tested in a matching recording studio at Luleå University of Technology with a background noise level below 25 dBA, using the same apparatus.
2.5 Statistics
A confidence level of 95% was applied throughout the study.
2.5.1 Test 1
A mixed repeated-measures factorial design with two within-subject factors, fundamental frequency and number of partials, was used. Fundamental frequency had four levels (350 Hz, 950 Hz, 1750 Hz, and 2750 Hz). Number of partials had four levels (1, 2, 3, and 4 partials). Two between-subject factors were used: age, with the levels <40 years (14 participants) and ≥40 years (10 participants), and listening experience, with the levels “not experienced” and “experienced”. The dependent variable was the deviation from the reference level at the point of subjective equality (PSE) between tonal sound and pink noise.
2.5.2 Tests 2 and 3
Tests 2 and 3 had the same design and variables as Test 1 except that the factor number of components was changed to harmonicity. Harmonicity had three levels; Level 1: harmonic, Level 2: second harmonic increased frequency, and Level 3: third harmonic increased frequency. A simple contrast analysis was used for harmonicity with Level 1 as a reference.
3 Results
3.1 Test 1
Box plots of level difference at PSE of loudness are found in Figure 7.
Figure 7 Box plots of PSE comparing adjustable noise to fixed tonal sound. The stronger the tonal sound is perceived, the higher the deviation is set. The bottom and top edges of the boxes indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points, outliers excluded. Rings indicate outliers. The cross indicates the average value, and the line indicates the median.
For the main effect of fundamental frequency, the Greenhouse–Geisser estimate of the departure from sphericity was ε = 0.70. This main effect was significant, F(2.09, 41.74)=39.2, p < 0.001. Contrasts revealed a significant linear relationship between perceived equal loudness and increasing fundamental frequency, F(1, 20)=80.91, p < 0.001, r = 0.90.
For the main effect of the number of components, the Greenhouse–Geisser estimate of the departure from sphericity was ε = 0.60. This main effect was significant, F(1.76, 35.23)=45.65, p < 0.001. Contrasts revealed a linear increase in perceived equal loudness with increasing number of components, F(1, 20)=92.05, p < 0.001, r = 0.91. The average within-subject difference between 1–2 partials was 2.1 dB, 2–3 partials 3.9 dB, and 3–4 partials 5.6 dB.
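The r values reported for the two linear contrasts above are numerically consistent with the common conversion r = sqrt(F / (F + df_error)) for single-degree-of-freedom contrasts. The text does not state how r was computed, so this formula is an assumption, offered only as a way to check the reported figures.

```python
import math

def contrast_effect_size_r(f_value: float, df_error: float) -> float:
    """Effect size r for a contrast with one numerator degree of freedom."""
    return math.sqrt(f_value / (f_value + df_error))

# F values and error degrees of freedom as reported for Test 1.
print(round(contrast_effect_size_r(80.91, 20), 2))  # fundamental frequency contrast
print(round(contrast_effect_size_r(92.05, 20), 2))  # number of components contrast
```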
For the interaction between fundamental frequency and number of components, the Greenhouse–Geisser estimate of the departure from sphericity was ε = 0.56. This interaction effect was not significant F(4.99, 99.89)=2.26, p = 0.05.
The interaction effect between fundamental frequency and age was significant F(2.09, 41.74)=3.75, p = 0.03. The interaction effect for the number of components and age was not significant F(1.76, 35.23)=3.22, p = 0.06. Nor was any significant effect seen for interaction between listening experience and fundamental frequency F(2.09, 41.74)=2.09, p = 0.61 or number of components F(1.76, 35.23)=0.47, p = 0.61.
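The fractional degrees of freedom reported above arise from the Greenhouse–Geisser correction, which multiplies both degrees of freedom by ε. The sketch below shows that relation. The uncorrected values (3 and 60 for a four-level factor, 9 and 180 for the interaction) are not stated in the text; they are inferred here from the four factor levels and the contrast error term F(1, 20), and should be read as an assumption.

```python
def gg_corrected_df(epsilon: float, df_effect: float, df_error: float):
    """Greenhouse-Geisser corrected degrees of freedom: both are scaled by epsilon."""
    return epsilon * df_effect, epsilon * df_error

def implied_epsilon(corrected_df_error: float, df_error: float) -> float:
    """Epsilon implied by a reported corrected error df and an assumed uncorrected df."""
    return corrected_df_error / df_error

# Main effect of fundamental frequency: reported F(2.09, 41.74), epsilon = 0.70.
eps_frequency = implied_epsilon(41.74, 60)
# Interaction: reported F(4.99, 99.89), epsilon = 0.56.
eps_interaction = implied_epsilon(99.89, 180)
print(round(eps_frequency, 2), round(eps_interaction, 2))
print(tuple(round(df, 2) for df in gg_corrected_df(eps_frequency, 3, 60)))
```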
Table 4 presents levels for the different stimuli in Test 1. The A-weighted sound pressure levels, Loudness, and Sharpness are calculated from the derived and calibrated output signals. ΔL is the average difference in level from the reference level, see Figure 7.
Table 4 Sound stimuli levels for Test 1. _x indicates number of components.
3.2 Test 2
Box plots of level difference at PSE of loudness are found in Figure 8.
Figure 8 Box plots of the PSE for Test 2 comparing noise to fixed tonal sound. The higher the deviation is set, the stronger the tonal stimulus is perceived. 4:1 = four harmonics, 4:2 = non-harmonic sample – 2nd partial shifted, 4:3 = non-harmonic sample – 3rd partial shifted.
For the main effect of fundamental frequency, the Greenhouse–Geisser estimate of the departure from sphericity was ε = 0.73. This main effect was significant F(2.19, 43.76)=22.28, p < 0.001. Contrasts revealed significant linear increase in perceived equal loudness with increasing fundamental frequency F(1, 20)=45.46, p < 0.001, r = 0.83.
For the main effect of harmonicity, Mauchly’s test of sphericity showed that the assumption of sphericity was met, p = 0.72. This main effect was not significant, F(2, 40)=0.30, p = 0.75.
Table 5 presents levels and analysis results for the different stimuli in Test 2. The A-weighted sound pressure levels, Loudness, and Sharpness are calculated from the derived and calibrated signals. ΔL is the average level difference indicated by the participants compared to the reference value corresponding to an amplification of 0 dB of the sound stimuli, see Figure 8.
Figure 9 Box plots of PSE for Test 3 comparing tonal sounds to a fixed pink noise. The background noise is kept at a constant level. The lower the deviation is set, the stronger the tonal sound is perceived. 4:1 = four harmonics, 4:2 = non-harmonic sample – 2nd partial shifted, 4:3 = non-harmonic sample – 3rd partial shifted.
3.3 Test 3
Box plots of level difference at PSE of loudness are found in Figure 9.
For the main effect of fundamental frequency, Mauchly’s test of sphericity showed that the assumption of sphericity was met, p = 0.16. This main effect was significant F(3)=21.34, p < 0.001. Contrasts revealed significant linearity in perceived equal loudness with increasing fundamental frequency F(1, 20)=35.88, p < 0.001, r = 0.80.
For the main effect of harmonicity, Mauchly’s test of sphericity showed that the assumption of sphericity was met, p = 0.24. This effect was significant F(2, 60)=3.76, p = 0.032. Simple contrasts, with the harmonic condition as reference (Level 1), showed a significant difference between Level 2 (second harmonic shifted) and Level 1, F(1, 20)=11.89, p = 0.003, r = 0.61, but not between Level 3 (third harmonic shifted) and Level 1. The average within-subject difference was 0.9 dB between Level 1 and Level 2.
For the interaction between fundamental frequency and harmonicity, the Greenhouse–Geisser estimate of the departure from sphericity was ε = 0.69. This interaction effect was not significant F(4.16, 83.20)=1.26, p = 0.292, r = 0.10. No statistical results indicate impact of age or listener experience on either harmonicity F(2, 120)< 1 or fundamental frequency F(3, 60)< 1.
Table 6 presents levels and analysis results for the different stimuli in Test 3. The A-weighted sound pressure levels, Loudness, and Sharpness are calculated from the derived and calibrated signals. ΔL is the average level difference indicated by the participants compared to the reference values corresponding to an amplification of 0 dB of the sound stimuli, see Figure 9.
Table 6 Sound stimuli levels for Test 3.
4 Discussion
The level of a complex tone with a high number of harmonics is perceived as stronger than a measurement in dBA would predict, even though the components are given a continuously decreasing level (−6 dB per octave). This implies that the dBA metric is insufficient for estimating the perception of loudness of complex tonal sounds. This is also mentioned by Doleschal et al. [17].
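To see what a dBA measurement alone would predict when harmonics are added, the overall A-weighted level of the tone components can be estimated from the standard analytic A-weighting curve. This is a minimal sketch under stated assumptions: the 60 dB fundamental level is hypothetical, the fundamental is held fixed while harmonics are added at −6 dB/octave, components are summed as powers, and the background noise is ignored. It does not reproduce the calibrated levels in Tables 4 to 6.

```python
import math

def a_weighting_db(f_hz: float) -> float:
    """A-weighting in dB at frequency f_hz (standard analytic form, 0 dB at 1 kHz)."""
    f2 = f_hz ** 2
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.00

def a_weighted_level_db(f0_hz, n_partials, fundamental_level_db=60.0, slope_db_per_octave=-6.0):
    """Overall A-weighted level of a harmonic complex (tone components only)."""
    total_power = 0.0
    for k in range(1, n_partials + 1):
        level = fundamental_level_db + slope_db_per_octave * math.log2(k) + a_weighting_db(k * f0_hz)
        total_power += 10.0 ** (level / 10.0)
    return 10.0 * math.log10(total_power)

for f0 in (350.0, 950.0, 1750.0, 2750.0):
    gain = a_weighted_level_db(f0, 4) - a_weighted_level_db(f0, 1)
    print(f0, round(gain, 1))
```

The printed values give the dBA increase from one to four partials under these assumptions, which can be set against the perceived level differences reported in Section 3.1.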
Hots et al. [16], comparing the loudness of a single 700 Hz tone to complex tone stimuli, found that the perceived level rose by approximately 3 dB when the number of tone components increased from one to two. Increasing the number of tone components from one to four increased the level by 3–5 dB. The studies performed by Vormann et al. [25, 27], also using the fundamental at 700 Hz as reference and comparing a single tone to complex harmonic tonal sounds with the same signal-to-noise ratio for each harmonic, measured a 5 dB increase between one and two harmonics and around 8 dB going from two to four harmonics. The present study measured a level difference of 2–3 dB when adding one harmonic to the fundamental, which is comparable to [16] but lower than [27]. The perceived level difference between one and four components was 4–8 dB, depending on fundamental frequency. This aligns more with [27]. Even though the methods used in the studies differ, and comparisons should be made with caution, they all point in the same direction. A rough estimate gives a general increase in loudness level of 2 dB per added harmonic, even if the harmonics drop in relative peak level with increasing frequency. Future studies need to explore this further.
In contrast to Vormann et al. [24], who did not find any statistical difference when comparing harmonic to non-harmonic tonal stimuli, harmonicity had an impact on loudness assessment in Test 3 (method B). This happened when the second harmonic was shifted. This was not seen in Test 2 (method A), which presented the same tonal sounds but used a different matching method. From other studies, we know that the order of the harmonic is important. Vormann et al. [24] found that adding the second harmonic had the largest impact on the difference in perceived tonality. The explanation might be masking effects, indicating that the level above masking is important for estimating the loudness of a complex tone [26], while the perceived loudness of a complex tone depends hypothetically on “the partial loudness of the tonal part” [18]. The difference in perceived loudness change might also be explained by the decrease in the relative level between the tone components. This means that the effect would be lower for adding a third harmonic than the second. However, this also depends on the frequency of the tone component [28]. Future studies are required to get a more detailed understanding of the impact of harmonicity on loudness, to see how levels and intervals of different harmonics affect loudness perception.
The variance was smaller for method B compared to method A. One reason might be the difference in the difficulty of the test task, considering that most participants explicitly found Test 3 easier to perform. The initial loudness difference between the test signal and the reference signal was perceived as smaller. Another aspect might be that the adjustable part only was the complex tones and not the background noise, making it easier to separate the tones from the background noise.
When asked how many tone components they had perceived, the majority of participants answered one for most stimuli, but at least two towards the end of the test. This coincided with the non-harmonic samples and was therefore expected, considering our ability to fuse harmonic components and separate those that are not [12]. However, some participants had not noticed this at all. Instead, they focused on comparing and estimating the loudness of the sound.
Questioned about the strategy for the decision of loudness estimation, seven participants (29%) answered that they had used their sense of annoyance or discomfort. Other participants answered that they had to concentrate hard to not consider annoyance and only focus on perceived loudness. Other studies have investigated the impact of annoyance and concluded that this is an important factor to consider, especially when the fundamental frequency is over 1 kHz [29, 30].
The majority found the pairwise comparison between a tonal sound and noise hard to do, “like comparing apples to pears”. The difference in character between the test and reference sounds is large, one having a clear tonal sound and the other having no tonal sound. The reason for choosing this non-tonal sound as reference was to avoid the effect of making added tone components tangible. If the comparison had been between two different tonal stimuli, the risk would have been that participants had started focusing on specific tone components or timbre differences instead of focusing on the difference in loudness. This phenomenon is mentioned by Hansen and Weber [18]. “Within-modality matches”, meaning that the participants match their sensation of magnitude between stimuli with different characters, is mentioned as a successful method by Gescheider [31]. Measuring equal loudness estimation for the same non-tonal sound repeatedly results in an indirect comparison between tonal test sounds.
The stationary background noise that was added to the tone was irritating for some participants and was also a cause of confusion because the instructions given did not mention anything about background noise. The masking effect on the upper harmonics of the complex tone with fundamental frequency 350 Hz is greater than the rest because the level difference between the background level and the tone components is smaller. The tonal samples including these harmonics also show the smallest change in perceived level difference.
The results show no statistical difference between people listening as a part of their daily work and those who do not. The impact of gender has not been analyzed because the number of females was too small to give a sufficiently large statistical basis.
The method of adjustment is prone to bias, as mentioned by Buus et al. [32]. Alternating which stimulus is fixed and which is adjusted is recommended to avoid bias effects, since there is a tendency to perceive the second of two identical tonal stimuli as louder. In the present study the participants could alternate freely between the test and reference signal, which meant that they could check their subjective estimation several times. This should remove the bias effect of a “stronger second”, since there would no longer be a first or second stimulus after switching enough times. Besides, the comparison is made between sounds of different character.
5 Conclusion
This experiment investigated the perceived loudness of stationary harmonic as well as non-harmonic complex tones. Consistent with previous studies, the estimated loudness of a complex tone increased with increasing number of harmonics. When measuring PSE using pink noise as reference, the results show a linear average increase of 2 dB per added component, even when the individual tones decrease in amplitude at −6 dB per octave. Moving the second harmonic out of tune decreased the perceived loudness. Moving the third harmonic out of tune did not affect loudness perception. The results support the hypothesis that the subjective estimation of loudness of complex tone content is made holistically, and that the perceived loudness increases to a greater extent than spectral summation shows, both in dB and in loudness (sone).
Changing the measurement method, by making the tonal sound rather than the reference noise adjustable, did have an impact on the results. It affected both the actual outcome and the variance of the results, which decreased when the tonal content was adjustable and the reference noise was held fixed.
Annoyance is a factor to consider especially for higher frequencies. Age affected the results related to sensation level for different fundamental frequencies. Age did not affect the estimated loudness of the tone when comparing different numbers of components or when comparing harmonic cases to non-harmonic cases. Listening experience did not affect the estimation of loudness in any case.
Tonality metrics estimating the prominence of a tonal sound by evaluating components separately are not sufficient when trying to evaluate the impact of sounds with complex tonal content that are harmonic or close to harmonic. Our findings demonstrate that the human brain generally estimates loudness of tone complexes holistically. Previous studies have come to similar conclusions. Therefore, users of established acoustic parameters describing tonality must be aware that there is a risk of oversimplification by describing the sound in this way. The mutual relation between tone components must be considered to give a reliable prediction of perceived loudness of sounds with tonal content. The industry requires standardized methods that include these facts, to facilitate setting up applicable requirements for sound quality.
Acknowledgments
The authors would like to acknowledge Scania CV AB for fully funding this project. We would also like to thank the many colleagues at Scania R&D and Luleå University of Technology, who have supported this project by enthusiastically participating in the tests as well as in discussions.
Conflicts of interest
The authors declare the following financial interests/personal relationships that may be considered potential competing interests: Birgitta Nyman is currently enrolled in the PhD program at Scania CV AB affiliated to Luleå University of Technology (Engineering Acoustics). This project is financed by Scania.
Data availability statement
The sound files associated with this article are available on Zenodo, under the reference [33].
Author contribution statement
Birgitta Nyman: Conceptualization, Methodology, Software, Validation, Formal Analysis, Investigation, Data curation, Writing – original draft, Visualization, Project administration. Arne Nykänen: Conceptualization, Methodology, Software, Validation, Writing – review & editing, Supervision.
References
- E. Zwicker, E. Terhardt: Analytical expressions for critical-band rate and critical bandwidth as a function of frequency. The Journal of the Acoustical Society of America 68 (1980) 1523–1525.
- ECMA: ECMA-418 Psychoacoustic metrics for ITT equipment, 2022. [Online]. Available: https://www.ecma-international.org/publications-and-standards/standards/ecma-418/.
- DIN 45681:2005-03 Acoustics – Determination of tonal components of noise and determination, 2005.
- ISO/PAS 20065:2016(E) Acoustics – Objective method for assessing the audibility of tones in noise – Engineering method.
- W. Aures: Berechnungsverfahren für den sensorischen Wohlklang beliebiger Schallsignale. Acta Acustica United with Acustica 59 (1985) 130–141.
- E. Terhardt, G. Stoll, M. Seewann: Algorithm for extraction of pitch and pitch salience from complex tonal signals. The Journal of the Acoustical Society of America 71 (1982) 679–688.
- ISO 1996-2:2017 Acoustics – Description, measurement and assessment of environmental noise – Part 2: Determination of sound pressure level, 2017.
- ECMA: ECMA TR/108 Proposal of new parameters, T-TNR and T-PR for total evaluation of multiple tones, 2019. [Online]. Available: https://www.ecma-international.org/publications-and-standards/technical-reports/ecma-tr-108/.
- C.J. Plack: The Sense of Hearing. Routledge, 2018.
- E. Zwicker, H. Fastl: Psychoacoustics: Facts and Models. Vol. 22. Springer Science & Business Media, 2013.
- R. Parncutt: Harmony: A Psychoacoustical Approach, M.R. Schroeder, Ed., Springer-Verlag, 1989.
- A.S. Bregman: Auditory Scene Analysis: The Perceptual Organization of Sound. MIT Press, 1994.
- B.C.J. Moore: An Introduction to the Psychology of Hearing. Brill, 2012.
- E. Terhardt: Psychoacoustic evaluation of musical sounds. Perception & Psychophysics 23 (1978) 483–492.
- W.M. Hartmann: Signals, Sound, and Sensation. Springer Science & Business Media, 2004.
- J. Hots, S. Ashraf Vaghefi, J.L. Verhey: The effect of sensorineural hearing loss on suprathreshold perception of tonal components in noise. JASA Express Letters 2 (2022) 084401.
- F. Doleschal, H. Rottengruber, J.L. Verhey: Influence parameters on the perceived magnitude of tonal content of electric vehicle interior sounds. Applied Acoustics 181 (2021) 108155.
- H. Hansen, R. Weber: Partial loudness as a measure of the magnitude of tonal content. Acoustical Science and Technology 32 (2011) 111–114.
- J.L. Verhey, S.J. Heise: Suprathreshold perception of tonal components in noise under conditions of masking release. Acta Acustica United with Acustica 98 (2012) 451–460.
- H. Fastl, G. Stoll: Scaling of pitch strength. Hearing Research 1, 4 (1979) 293–301.
- E. Zwicker, G. Flottorp, S.S. Stevens: Critical band width in loudness summation. The Journal of the Acoustical Society of America 29, 5 (1957) 548–557.
- B. Scharf: Loudness summation and spectrum shape. The Journal of the Acoustical Society of America 33 (1961) 838–839.
- F. Doleschal, J.L. Verhey: Pleasantness and magnitude of tonal content of electric vehicle interior sounds containing subharmonics. Applied Acoustics 185 (2022) 108442.
- M. Vormann, J.L. Verhey, V. Mellert, A. Schick: Ein adaptives Verfahren zur Bestimmung der subjektiven Tonhaltigkeit. Fortschritte der Akustik 26 (2000) 304–305.
- M. Vormann, J.L. Verhey: Factors influencing the subjective rating of tonal components in noise. The Journal of the Acoustical Society of America 105 (1999) 1297–1298.
- H. Hansen, J.L. Verhey, R. Weber: The magnitude of tonal content: a review. Acta Acustica United with Acustica 97 (2011) 355–363.
- M. Vormann, J.L. Verhey, V. Mellert, A. Schick: Subjective rating of tonal components in noise with an adaptive procedure, in: Contributions to Psychological Acoustics: Results of the Eighth Symposium on Psychological Acoustics, 2000.
- ISO 226:2003 Normal equal-loudness-level contours, 2003.
- R. Sottek, J. Becker: Tonal annoyance vs. tonal loudness and tonality, in: INTER-NOISE and NOISE-CON Congress and Conference Proceedings, 2019.
- G. Pietila, W. Seldon, T. Roggenkamp, T. Bohn: Tonal annoyance metric development for automotive electric vehicles. SAE Technical Paper 2019-01-1467, 2019.
- G.A. Gescheider: Psychophysical scaling. Annual Review of Psychology 39 (1988) 169–200.
- S. Buus, M. Florentine, T. Poulsen: Temporal integration of loudness, loudness discrimination, and the form of the loudness function. The Journal of the Acoustical Society of America 101 (1997) 669–680.
- B. Nyman: Supplementary material for: Loudness matching of complex tones simulating sounds from electric trucks. In Acta Acustica. Zenodo, 2025. https://doi.org/10.5281/zenodo.14637572
Appendix A
The sound files used in this experiment, together with the audio file presented under Figure 1, are listed below.
A.1 Audio files
Audio file #1: Passage of an electric truck, driving at 20 km/h.
Audio file #2: Reference sound, pink noise (Audio_file_2.mp3).
Audio file #3: One-partial sound mixed with background, 350 Hz (Audio_file_3.mp3).
Audio file #4: Two-partial sound mixed with background, 350 Hz (Audio_file_4.mp3).
Audio file #5: Three-partial sound mixed with background, 350 Hz (Audio_file_5.mp3).
Audio file #6: Four-partial sound mixed with background, 350 Hz (Audio_file_6.mp3).
Audio file #7: One-partial sound mixed with background, 950 Hz (Audio_file_7.mp3).
Audio file #8: Two-partial sound mixed with background, 950 Hz (Audio_file_8.mp3).
Audio file #9: Three-partial sound mixed with background, 950 Hz (Audio_file_9.mp3).
Audio file #10: Four-partial sound mixed with background, 950 Hz (Audio_file_10.mp3).
Audio file #11: One-partial sound mixed with background, 1750 Hz (Audio_file_11.mp3).
Audio file #12: Two-partial sound mixed with background, 1750 Hz (Audio_file_12.mp3).
Audio file #13: Three-partial sound mixed with background, 1750 Hz (Audio_file_13.mp3).
Audio file #14: Four-partial sound mixed with background, 1750 Hz (Audio_file_14.mp3).
Audio file #15: One-partial sound mixed with background, 2750 Hz (Audio_file_15.mp3).
Audio file #16: Two-partial sound mixed with background, 2750 Hz (Audio_file_16.mp3).
Audio file #17: Three-partial sound mixed with background, 2750 Hz (Audio_file_17.mp3).
Audio file #18: Four-partial sound mixed with background, 2750 Hz (Audio_file_18.mp3).
Audio file #19: Four-partial sound mixed with background, 350 Hz, 2nd partial shifted (Audio_file_19.mp3).
Audio file #20: Four-partial sound mixed with background, 350 Hz, 3rd partial shifted (Audio_file_20.mp3).
Audio file #21: Four-partial sound mixed with background, 950 Hz, 2nd partial shifted (Audio_file_21.mp3).
Audio file #22: Four-partial sound mixed with background, 950 Hz, 3rd partial shifted (Audio_file_22.mp3).
Audio file #23: Four-partial sound mixed with background, 1750 Hz, 2nd partial shifted (Audio_file_23.mp3).
Audio file #24: Four-partial sound mixed with background, 1750 Hz, 3rd partial shifted (Audio_file_24.mp3).
Audio file #25: Four-partial sound mixed with background, 2750 Hz, 2nd partial shifted (Audio_file_25.mp3).
Audio file #26: Four-partial sound mixed with background, 2750 Hz, 3rd partial shifted (Audio_file_26.mp3).
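The tonal stimuli listed above can be summarized by their partial frequencies and relative levels. The Python sketch below is only an illustration. It assumes partials at integer multiples of the fundamental with a roll-off of 6 dB per octave, and that a shifted partial keeps the level of its nominal harmonic. The `detune_ratio` parameter and its example value are hypothetical placeholders, since the actual frequency shift is not stated in this list, and the function name is not taken from the study.

```python
import math

def partials(f0_hz, n_partials=4, shifted_partial=None, detune_ratio=1.0,
             slope_db_per_octave=-6.0):
    """Return (frequency in Hz, level in dB re fundamental) for each partial."""
    result = []
    for n in range(1, n_partials + 1):
        freq = n * f0_hz
        # Assumption: the level follows the nominal harmonic number.
        level = slope_db_per_octave * math.log2(n)
        if n == shifted_partial:
            # Hypothetical upward shift; the real shift is not given here.
            freq *= detune_ratio
        result.append((freq, level))
    return result

# Fundamentals used for the stimuli in the list above.
for f0 in (350, 950, 1750, 2750):
    print(f0, [round(f) for f, _ in partials(f0)])

# Example of a non-harmonic variant with a placeholder detuning of the 2nd partial.
print(partials(350, shifted_partial=2, detune_ratio=1.05))
```

With the lowest fundamental at 350 Hz and the fourth partial of the 2750 Hz fundamental at 11 000 Hz, the partials span the 350–11 000 Hz range given in the abstract.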
Cite this article as: Nyman B.E. & Nykänen A. 2025. Loudness matching of complex tones simulating sounds from electric trucks. Acta Acustica, 9, 17. https://doi.org/10.1051/aacus/2024090.
All Tables
Tone-to-noise ratio for the fixed four partial stimuli for each tone component in Test 1.
Tone-to-noise ratio for the shifted components in the fixed four partial stimuli in Test 2.
All Figures
Figure 1 Frequency analysis of external noise recording of an electric truck at close range, driving at 20 km/h. The sound contains several strong tonal components (Audio_file_1.mp3).
Figure 2 The instruction window presented to the participants on the screen when performing the test.
Figure 3 Schematic description of the test process.
Figure 4 A schematic figure of the test set-up in Tests 1 and 2 (method A). To simplify the schematic representation, the noise background for the tonal signal is not shown.
Figure 5 A schematic figure of Test 3 (method B). The dashed (red) component in the figure indicates that the component is shifted upwards in frequency, creating a non-harmonic complex tone. To simplify the schematic representation, the stationary noise background is not shown for the tonal signal.
Figure 6 The relation of the sound stimuli at the start of the different listening tests. Left: method A – The initial level of the fixed stationary complex four-partial tonal sound with 350 Hz fundamental frequency mixed with background noise (solid), and the adjustable pink noise (dashed). Right: method B – The initial level of the adjustable complex four-partial tonal sound with 350 Hz fundamental frequency mixed with the constant background noise (solid), and the stationary pink noise (dashed).
Figure 7 Box plots of PSE comparing adjustable noise to fixed tonal sound. The stronger the tonal sound is perceived, the higher the deviation is set. The bottom and top edges of the boxes indicate the 25th and 75th percentiles, respectively. The whiskers extend to the most extreme data points, outliers excluded. Rings indicate outliers. The cross indicates the average value, and the line indicates the median.
Figure 8 Box plots of the PSE for Test 2 comparing noise to fixed tonal sound. The higher the deviation is set, the stronger the tonal stimulus is perceived. 4:1 = four harmonics, 4:2 = non-harmonic sample – 2nd partial shifted, 4:3 = non-harmonic sample – 3rd partial shifted.
Figure 9 Box plots of PSE for Test 3 comparing tonal sounds to a fixed pink noise. The background noise is kept at a constant level. The lower the deviation is set, the stronger the tonal sound is perceived. 4:1 = four harmonics, 4:2 = non-harmonic sample – 2nd partial shifted, 4:3 = non-harmonic sample – 3rd partial shifted.