Open Access
Scientific Article
Issue: Acta Acust., Volume 5, 2021
Article Number: 49
Number of page(s): 13
Section: Musical Acoustics
DOI: https://doi.org/10.1051/aacus/2021045
Published online: 19 November 2021

© J. Jaatinen et al., Published by EDP Sciences, 2021

Licence: Creative Commons. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

The auditory sensation evoked by a musical tone can, according to Fletcher [1], be divided into the characteristics of pitch, loudness, and timbre. These aspects are interrelated: pitch is primarily related to frequency, but spectral differences and intensity may also influence the perceived pitch. However, the relationship between perceived pitch, harmonic component levels, and intensity in complex tones has remained mostly unresolved, particularly in the lowest and highest musical pitch ranges. The present study aims to clarify the subjective accuracy of pitch perception and the influence of harmonic components on perceived pitch in the low-frequency range. This is done with a listening experiment combined with an analysis of the spectral differences between the evaluated tones.

Pitch is a perceptual attribute that allows the ordering of sounds on a frequency-related scale from low to high. The fundamental frequency is the corresponding physical term, defined as the inverse of the period of the signal. The perceived pitch is highly dependent on the listener and can be quantified only by a listening test, where the pitch of the evaluated tone is compared with the pitch of another tone serving as a reference.

In typical low-cost electronic tuning machines [2], the pitch detection algorithm estimates the period of a quasiperiodic signal and converts the length of the period to the fundamental frequency. In such devices, the relative amplitudes of the partials of a harmonic complex tone should not influence the pitch estimated from the period.
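The period-based principle can be illustrated with a minimal autocorrelation sketch. It is not the algorithm of any particular tuner in [2]; the function names, the search range, and the test tones are assumptions chosen for illustration. Two harmonic complex tones with the same period but different partial amplitudes are analyzed, and both yield the same estimate.

```python
import math

def harmonic_tone(f0, amplitudes, fs, n_samples):
    """Sum of sine partials at integer multiples of f0."""
    return [
        sum(a * math.sin(2 * math.pi * (k + 1) * f0 * n / fs)
            for k, a in enumerate(amplitudes))
        for n in range(n_samples)
    ]

def estimate_f0(signal, fs, f_min=60.0, f_max=150.0):
    """Estimate f0 as fs / lag, where lag maximizes the normalized autocorrelation."""
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        terms = len(signal) - lag
        value = sum(signal[n] * signal[n + lag] for n in range(terms)) / terms
        if value > best_value:
            best_lag, best_value = lag, value
    return fs / best_lag

fs = 8000                                             # assumed sample rate
sawtooth_like = [1.0 / k for k in range(1, 11)]       # 1/f envelope
weak_fundamental = [0.2, 1.0, 0.8, 0.6, 0.4, 0.3]     # hypothetical envelope
for amps in (sawtooth_like, weak_fundamental):
    tone = harmonic_tone(100.0, amps, fs, 1600)
    print(estimate_f0(tone, fs))
```

The search range is deliberately limited to less than one octave so that only one multiple of the period falls inside it; practical detectors need additional logic to avoid octave errors.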

The pitch perception of a complex tone is intricate and not yet fully explained. Schouten et al. [3] defined the perceived pitch of a complex tone as a joint perception of several individual harmonics (see also virtual pitch by Terhardt [4, 5]). In a complex tone, all its harmonics, that is, individual sinusoids, are present simultaneously and are separately coded as neural patterns. Although each harmonic component is coded individually, the aggregate of harmonics is perceived as one tone with a single pitch [6].

In a previous experiment [7], we observed that the perceived pitches of the low-register orchestra instruments at C1 (32.5 Hz) differed notably from each other, although the fundamental frequency and harmonic overtone frequencies were identical. In particular, the contrabass clarinet was perceived as much as one semitone lower than all other bass instruments. Although a few studies report the influence of the harmonic spectrum envelope on perceived pitch [8–11], no comprehensive explanation has been published to the authors’ knowledge. Latent connections between spectral patterns and perceived pitch are difficult to distinguish without a dimension reduction technique. We approached this phenomenon with principal component analysis (PCA) [12], by which such patterns with certain tendencies can be isolated.

2 Background

Pitch perception is fundamentally influenced by the physical properties of the evaluated tone. In the context of the current study, variation of the relative levels of harmonic overtones [13] is a central influencing factor. As one underlying effect, some harmonics might be masked by other stronger harmonics, possibly altering the perceived pitch. The result may be considered as a synthesis of different neurophysiological and psychoacoustical mechanisms.

The following sections briefly review previously proposed mechanisms for the influence of the spectral envelope on the perceived pitch.

2.1 Pitch of individual sinusoids

A harmonic spectrum consists of individual sinusoids which can be perceived differently depending on their intensity and frequency. Stevens [14] observed that the perceived pitch of high sinusoidal tones (>2500 Hz) increased with intensity (up to 13.5% at 12 kHz), whereas the perception of low sinusoids (<2500 Hz) showed an opposite effect of decreased pitch (up to 6% at 150 Hz). He suggested that the resonant characteristics of the ear acted as a divider. Snow [15] validated this phenomenon in the low register (<1000 Hz). Morgan and Galambos [16] observed high interindividual differences in the direction and magnitude of pitch shifts when the intensity was changed. They suggested that irregularities in auditory sensitivity between individual listeners may explain even contradictory observations with the same stimuli. Cohen [17] in part replicated Stevens’ experiment with more participants with a similar outcome. Although the directions of the pitch shifts were congruent, the magnitudes of the shifts were smaller (<2%).

In summary, the pitch of individual sinusoidal tones may shift downwards or upwards depending on intensity, frequency, and listener. This may influence the perceived pitch of a complex tone if the downwards- or upwards-shifted individual harmonics are dominant. This effect has been considered in advanced pitch detection algorithms (e.g., Terhardt’s model [13]). Short reviews of previous studies on how individual harmonics (or groups of harmonics) may influence the perceived pitch are given in the following sections.

2.2 Terhardt’s algorithm for simulating pitch perception

Moving from a pure tone to multiple sinusoids, which together constitute a complex tone, introduces a range of overlapping aspects.

Terhardt et al. [5] and later Terhardt et al. [13] presented an algorithm for simulating pitch perception in a human listener, including a number of auditory processes. The final estimation of pitch is dependent on spectral pitch (analytic listening to individual harmonics) and virtual pitch (holistic listening to one evoked pitch). In this model, there are two separate pitch modes, the spectral pitch pattern (SP) and the virtual pitch pattern (VP). Both patterns include several pitch values and weights that determine the relative prominence of the individual pitches. SP includes spectral analyses, extraction of tonal components, masking of other harmonics, pitch shifts of individual harmonics, and weighting of spectral dominance. VP is derived from SP with an algorithm for subharmonic coincidence. An implementation by Cabrera in the AARAE package [18], based on Terhardt’s original code, was employed in the current study.

2.3 Spectral centroid

The spectral centroid describes the balance point in frequency for the spectral energy. It is a robust indicator of the perceived “brightness” of a complex tone [19]. In simplified experiments, where the spectral centroid is moved from low to middle and high harmonics, its influence on the change in perceived pitch and timbre can be evaluated. Singh and Hirsh [8] explored how changes in spectral locus and fundamental frequency affect perceived pitch. They found that changes in fundamental frequency were the primary cue, but changes in the spectral centroid also had some influence on the pitch perception. “Brighter” or “sharper” tones, corresponding to a higher spectral centroid, were perceived as higher in pitch. However, in our opinion, listeners may not be able to distinguish high pitch from bright timbre with the applied listening test method. All in all, this effect may also be explained by the listeners’ attention to individual harmonics, i.e., analytic listening.
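As a minimal sketch of the "balance point" idea, the centroid of a harmonic spectrum can be computed as a weighted mean of the harmonic frequencies. The weighting by linear amplitude (rather than power or level) and the two example envelopes are assumptions made here for illustration; the text does not prescribe a particular weighting.

```python
def spectral_centroid(f0, amplitudes):
    """Amplitude-weighted mean frequency (Hz) of a harmonic spectrum."""
    freqs = [f0 * (k + 1) for k in range(len(amplitudes))]
    total = sum(amplitudes)
    return sum(f * a for f, a in zip(freqs, amplitudes)) / total

f0 = 55.25                                       # A1 when A4 = 442 Hz
dark = [1.0 / k ** 2 for k in range(1, 11)]      # steeply falling envelope
bright = [1.0 / k for k in range(1, 11)]         # 1/f envelope
print(spectral_centroid(f0, dark), spectral_centroid(f0, bright))
```

Both tones share the same fundamental frequency, yet the tone with the shallower envelope has the higher centroid, which is the situation in which a "brighter" tone might be judged higher in pitch.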

2.4 Dominance region

The spectral region which is most important to the pitch perception is called the dominance region. Ritsma [20] found that the six lowest harmonics dominate the pitch perception as long as their amplitudes exceed the hearing threshold level (fundamental frequencies were 100, 200, and 400 Hz). Moore et al. [21] evaluated the existence of a dominance region by mistuning individual harmonics. They concluded that the dominating harmonics were within the six lowest partials, at least for fundamental frequencies of 100, 200, and 400 Hz. In a later study, Jackson and Moore [22] suggested that in the case of complex tones with low fundamental frequency, the low harmonics in general mask the information of the higher harmonics.

Dai [23] argued that in the case of complex tones with a low fundamental frequency (<800 Hz), harmonics close to a fixed frequency at about 600 Hz are the most important or dominating in the pitch perception, irrespective of their rank order. This result contrasts with the findings above where the dominating harmonics are claimed to depend on the fundamental frequency (fixed harmonic order). Moreover, Terhardt et al. [5] proposed that there is a spectral dominance region that is symmetrical in log frequency and centered on 700 Hz, i.e., in absolute frequency. Partials nearer the center of that curve are more likely to be dominant.

2.5 Influence of spectral envelope change on perceived pitch in a musical context

There are only a few studies available on the influence of changes in the envelope of a harmonic spectrum on the perceived pitch in a musical context. Russo and Thompson [9] investigated musical intervals and their perceived size when the slope of the envelope of a harmonic spectrum was manipulated. The results indicated that changes in the spectrum envelope did not directly affect the perceived pitch, but the size of the intervals was experienced as expanded or contracted.

Vurma and Ross [10] had a more musical approach to the topic. In their first experiment, professional classical singers matched the pitches of their voices to synthesized oboe and piano tones. Since the participants sang in different registers from bass to soprano, it was clear that the participants did not all match their voices to a common reference tone. However, the vocal sounds were on average 7–13 cents lower than the instrumental sounds.

In the second experiment, participants rated in a forced-choice task whether the instrumental sounds were lower, equal, or higher with respect to the vocal sound of a single performer. As a result, an approximately 20 cents lower vocal sound was perceived to be in tune with an instrumental sound. This, in contrast, indicates that the pitch of the vocal sound was perceived as higher than the corresponding instrumental sound with the same fundamental frequency. As an explanation, they suggested differences in the energy distributions of the power spectra. In a second paper, Vurma et al. [11] had more participants (13 musicians and 13 nonmusicians) and different instruments (tenor opera singer, viola, and trumpet). The experimental design applied pairwise comparison of successively presented tones. The tones described by the authors as having a brighter overall timbre (trumpet and tenor voice) were perceived about 15–20 cents higher than the viola, which in turn had a darker timbre. However, no deeper analysis of a possible influence of differences in the spectral envelope was presented.

2.6 Psychoacoustical vs. neurophysiological methods

The short overview above of some psychoacoustic studies on pitch perception indicates that several underlying neurophysiological processes influence the perceived pitch of a complex tone. The processes include, among others, pitch bending of an individual harmonic, masking of harmonics, and irregular auditory sensitivity in listeners. In combination, they may cause interindividual deviations between listeners’ perceptions.

In addition to psychoacoustical methods, different neurophysiological techniques for measuring pitch-related neural representations at different locations in the auditory pathway are available all the way from the auditory periphery to the auditory cortex [24–28]. However, current neuroimaging techniques cannot clarify how the spectral properties of complex stimulus tones influence the perception of pitch. Hence, we must continue to approach the multifaceted phenomenon of pitch perception of complex tones with psychoacoustic methods.

2.7 Aim and motivation for this study

This study investigates how the perceived pitch in the low-frequency range depends on the spectrum envelope, dynamic level, and listener, using stimuli with complex tones which represent different orchestral instruments.

The focus is on how perceived differences in pitch can be attributed to differences in the spectrum envelope. The study was motivated by a phenomenon observed in an earlier experiment [7], where the pitches of contrabass clarinet tones were perceived to be lower in pitch in comparison with other bass instruments. The presented approach, combining a listening experiment using stimulus tones from orchestral instruments with a principal component analysis on the spectral differences between the evaluated stimuli, has not been reported earlier.

3 Methods

3.1 Participants

The participants (N = 31) of the listening experiments were professional orchestra musicians, aged from 31 to 61 years (mean 46.8, SD 8.1; 11 females, 20 males). Most of them are instrument section principals and employed by top-tier Finnish symphony orchestras (Helsinki Philharmonic Orchestra, Finnish Radio Symphony Orchestra, Tapiola Sinfonietta, Turku Philharmonic Orchestra, Finnish National Opera Orchestra). Five subjects reported absolute pitch. All orchestra instruments had at least one representative: 11 strings, 9 woodwinds, 8 brass, 1 harp, 1 piano, and 1 percussion.

Although none of the participants self-reported a severe hearing loss, some of them are likely to have a mild hearing loss due to their profession and/or age. Usually, severe hearing loss does not occur at lower frequencies, at least not in people who are working as professional musicians in a symphony orchestra. In our experiment, all stimuli had a low fundamental frequency (≤110.5 Hz ±100 cents). According to Moore et al. [21], in low-register tones, the six lowest harmonics are the most important for pitch perception. For that reason, we suppose that a possible hearing loss in the range of higher harmonics (>2 kHz) does not significantly influence the perceived pitch in our experiment.

3.2 Stimuli

The listening experiment included stimuli based on processed samples of four instruments: double bass (db), contrabassoon (cbsn), contrabass clarinet (cbcl), and bass tuba (tb). Most samples were extracted from the Vienna Symphony Library (VSL GmbH, Vienna, Austria). The contrabass clarinet and the additional instrument samples in the lowest extrema were captured by the authors with professional musicians. The experiments included a total of 108 individual samples: four instruments at nine musical pitches, and three dynamic levels.

Since the A musical pitch (note) is a common reference tone (A4), we chose A0–A2 for the experiment’s musical pitch range despite the fact that no orchestra instrument can play below B♭0. The A0 tones were derived from B♭0 tones. B♭0 was in the normal playable range of all instruments except the double bass. A five-string double bass was tuned down from B0 to B♭0 in the recording session.

The attack portion of an instrument tone in the lowest register can be unstable and ambiguous (e.g., tuba). The time until the tuning of a tone reaches a stable state can therefore be long, which might strongly influence the results. Therefore, all attacks were removed by editing the recordings and using the stably tuned part in steady-state wavetable synthesis. A short section from the sustain phase of an instrument signal was isolated and oversampled at a 384 kHz sample rate and 32 bits for extracting a single period in Wavelab 9.5 software (Steinberg GmbH, Hamburg, Germany). The high sample rate enabled precise period isolation to avoid artificial discontinuities. After the isolation of an individual period, the spectrum of the sample was analyzed to verify that it was not distorted by a discontinuity. Each of the single periods derived from the sustain part was carefully chosen by ear so that it sounded as natural as possible. For that purpose, single periods were repeated to obtain a suitable duration for evaluation. Finally, the selected periods were imported to Matlab 2018a (Mathworks Inc, MA, USA) for further processing.

The final stimulus signals were generated by repeating the single wave period by a known integer multiple and applying resampling to attain an accurately tuned complex harmonic tone with the desired length. Fundamental frequency calculation and the stimulus synthesis procedure were identical to our previous study, and the entire process has been explained thoroughly in [7], including frequency resolution considerations. Since all instrument samples were extracted from real instruments, the phases of the harmonics were closely representative of the actual instruments.
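Looping a single period at a chosen fundamental frequency can be sketched with a wavetable oscillator. This is a stand-in technique (phase-stepped table lookup with linear interpolation), not the repeat-and-resample procedure of the study, which is documented in [7]. The function name, table size, and the sine used in place of a recorded instrument period are all assumptions.

```python
import math

def wavetable_tone(period, f0, fs, duration):
    """Loop a single-period table at f0 using linear interpolation."""
    size = len(period)
    step = f0 * size / fs              # table positions advanced per output sample
    out = []
    for n in range(int(round(duration * fs))):
        pos = (n * step) % size
        i = int(pos) % size
        frac = pos - int(pos)
        nxt = period[(i + 1) % size]   # wrap around at the end of the period
        out.append((1.0 - frac) * period[i] + frac * nxt)
    return out

# Hypothetical single period: a pure sine stands in for a recorded instrument cycle.
table = [math.sin(2 * math.pi * i / 2048) for i in range(2048)]
tone = wavetable_tone(table, f0=110.5, fs=44100, duration=1.0)
print(len(tone))
```

Because the table holds exactly one period, the output is strictly periodic at the requested frequency, without jitter or shimmer, which matches the steady-state character of the stimuli described above.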

The subjects reported the instruments to be well recognizable and distinguishable by timbre, even though they did not have an attack, nor did they include small period-to-period fluctuations in frequency and amplitude (jitter and/or shimmer).

In order to facilitate a repeatable comparison anchor for possible future studies, we included a purely artificial reference instrument with a sawtooth-like waveform in our experimental design. This type of tone was motivated by two factors. First, many acoustical instruments have spectral envelopes relatively close to a 1/f spectral envelope corresponding to a sawtooth waveform. Second, due to the shifts in the perceived pitch of pure tones at different SPLs [14–17], a certain number of harmonics was found necessary to include in the reference complex tone for producing a more stable perception of the pitch [29].

Various orders of harmonics were evaluated aurally by the authors to arrive at a pleasant and adequately natural timbre of the sawtooth. The implemented sawtooth waveform was synthesized in the frequency domain as a zero-phase 1/f amplitude spectrum with 10 harmonics. The choice was a compromise between the stability of perceived pitch and timbre pleasantness. Three supplemental harmonics with a linear amplitude fade out to zero were added above the first ten partials. This addition was due to the observation that an abruptly truncated harmonic series produces a “ghost” tone in perceived pitch, which was not present in the Fourier analysis [30]. It should be noted that the reference sawtooth-like tones only serve as additional data, supplementing the central comparisons between instrument tones. All stimuli were unfiltered.
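The construction of the sawtooth-like reference can be sketched as additive synthesis. The exact shape of the fade-out is not specified above, so the fade factors used here (0.75, 0.5, 0.25 applied on top of the 1/k amplitudes) are an assumption, as are the function names and the use of sine phase for "zero phase".

```python
import math

def sawtooth_like_amplitudes(n_full=10, n_fade=3):
    """1/k amplitudes for the first n_full harmonics, then a linear fade toward zero."""
    amps = [1.0 / k for k in range(1, n_full + 1)]
    for j in range(1, n_fade + 1):
        k = n_full + j
        fade = 1.0 - j / (n_fade + 1)      # assumed fade: 0.75, 0.5, 0.25
        amps.append(fade / k)
    return amps

def synthesize(f0, amplitudes, fs=44100, duration=1.0):
    """Additive synthesis with every partial starting at zero phase (sine)."""
    n_samples = int(round(duration * fs))
    return [
        sum(a * math.sin(2 * math.pi * (k + 1) * f0 * n / fs)
            for k, a in enumerate(amplitudes))
        for n in range(n_samples)
    ]

amps = sawtooth_like_amplitudes()
reference = synthesize(110.5, amps, duration=0.1)
print(len(amps), len(reference))
```

The three faded partials avoid an abrupt spectral edge after the tenth harmonic, which is the stated motivation for adding them.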

In Zenodo [31], a complete set of example stimuli is provided in monaural audio files with the following sequence: fixed tone; adjusted tone −6 cents; fixed; adjusted 0 cents; fixed; adjusted +6 cents. Moreover, a complete set of figures including sound pressure level spectra, harmonic peaks, and respective spectral differences for the first 10 harmonics is provided [32].

3.3 Experiment design

The task of a participant in the listening experiment was to adjust pairs of two successive tones to perceived unison. Each presented pair consisted of two different instrument tones, where the first one was a fixed tone as a reference and the second one a user-adjustable tone with a tuning range of ±100 cents. Tuning adjustment steps were 3 cents (for adjustments between 0 and ±12 cents), 4 cents (±12–28 cents), 5 cents (±28–48 cents), 6 cents (±48–72 cents), and 7 cents (±72–100 cents).
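The step schedule can be written as a small lookup. How the boundaries (12, 28, 48, and 72 cents) are assigned and the choice to start stepping from 0 cents are assumptions for illustration; in the experiment the adjustable tone started from an arbitrary offset near the reference.

```python
def step_size(offset_cents):
    """Step (in cents) used when moving outward from the given offset."""
    magnitude = abs(offset_cents)
    if magnitude < 12:
        return 3
    if magnitude < 28:
        return 4
    if magnitude < 48:
        return 5
    if magnitude < 72:
        return 6
    return 7

def reachable_offsets(limit=100):
    """Offsets reached by stepping upward from 0 cents until the limit."""
    offsets = [0]
    while offsets[-1] < limit:
        offsets.append(offsets[-1] + step_size(offsets[-1]))
    return offsets

print(reachable_offsets())
```

Under these assumptions the five ranges tile the interval exactly: 21 positions from 0 to 100 cents, with finer resolution close to unison where small differences matter most.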

Not all 12 possible permutations of the four instruments as the reference tone and adjustable tone in a stimulus pair were included in the test. Three instruments were used for the reference tone and all four instruments for the following adjustable tone according to the following scheme, giving a total of six stimulus combinations of instruments (reference – adjustable): contrabassoon – contrabass clarinet/tuba, double bass – contrabassoon/tuba, and contrabass clarinet – double bass/tuba. For pairs including the sawtooth spectrum, this tone was always the reference, yielding four combinations.

Nine musical pitches were included (A0, C1, E♭1, F♯1, A1, C2, E♭2, F♯2, and A2). These notes correspond to a fundamental frequency range of 27.6–110.5 Hz. In all presented pairs, the first reference tone was tuned to equal-tempered fundamental frequency derived from A4 (442 Hz), and the second tone started arbitrarily from a fundamental frequency between ±15 cents from the reference.
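The equal-tempered frequencies follow from the A4 reference by powers of the twelfth root of two. The sketch below computes them for the nine test pitches; the dictionary layout and function name are illustrative choices, and the semitone offsets are simply counted downward from A4.

```python
A4_HZ = 442.0

# Semitone offsets from A4 for the nine test pitches (minor-third spacing).
PITCHES = {
    "A0": -48, "C1": -45, "Eb1": -42, "F#1": -39, "A1": -36,
    "C2": -33, "Eb2": -30, "F#2": -27, "A2": -24,
}

def equal_tempered_hz(semitones_from_a4, cents=0.0):
    """Frequency of an equal-tempered note, optionally detuned by some cents."""
    return A4_HZ * 2.0 ** (semitones_from_a4 / 12.0 + cents / 1200.0)

for name, offset in PITCHES.items():
    print(name, round(equal_tempered_hz(offset), 2))
```

The optional `cents` argument covers the starting offsets of the adjustable tone, for example `equal_tempered_hz(-36, cents=15.0)` for a tone starting 15 cents above A1.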

The total number of test pairs was 270 for each subject. All test pairs were different; no repeats were included. This means that every subject evaluated each individual test pair only once. The pairs were presented in random order and each tone was 1 s long. We considered that for the lowest tones it is necessary to have a long enough duration for perceiving a stable pitch. With a shorter duration, there may not be enough periods for accurate perception in the lowest register. Krumbholz et al. [33] used 800 ms and 1000 ms stimulus duration. In addition, Rogala et al. [34] showed that longer tones with a duration of 500–1000 ms are preferable in the lowest register for stable pitch perception.
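The trial count can be checked by enumerating the design: six instrument pairs plus four sawtooth pairs, each at nine pitches and three dynamic levels. The abbreviation `saw`, the function name, and the seed are illustrative assumptions; the sketch also assumes that both tones of a pair share the same pitch and dynamic level.

```python
import itertools
import random

INSTRUMENT_PAIRS = [  # (reference, adjustable)
    ("cbsn", "cbcl"), ("cbsn", "tb"),
    ("db", "cbsn"), ("db", "tb"),
    ("cbcl", "db"), ("cbcl", "tb"),
]
SAWTOOTH_PAIRS = [("saw", inst) for inst in ("db", "cbsn", "cbcl", "tb")]
PITCHES = ["A0", "C1", "Eb1", "F#1", "A1", "C2", "Eb2", "F#2", "A2"]
LEVELS = ["pp", "mf", "ff"]

def build_trials(seed=None):
    """Return every (pair, pitch, level) combination once, in random order."""
    pairs = INSTRUMENT_PAIRS + SAWTOOTH_PAIRS
    trials = list(itertools.product(pairs, PITCHES, LEVELS))
    random.Random(seed).shuffle(trials)
    return trials

trials = build_trials(seed=1)
print(len(trials))
```

The enumeration gives 162 instrument-only trials and 108 sawtooth trials, 270 in total, each occurring exactly once per subject.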

In contrast to many studies on pitch discrimination, there was no silent gap between the two tones in a stimulus pair. A gap of 500 ms, for example, in Ref. [35], has been traditionally motivated by the circumvention of so-called streaming effects with uninterrupted audio stimuli. However, the present experiment was intended to represent the evaluation of musical intervals as would be experienced by a musician playing a monophonic instrument.

The test pair was repeated continuously until the “next” or “stop” button was pressed. The average duration of the experiment was 1.5 h including regular pauses.

The amplitudes of the tones were equalized to C-weighted sound pressure levels of 62, 68, and 74 dB, respectively, representing three nominal dynamic levels (pp, mf, ff). Equalizing did not change the relative amplitudes between the harmonics in the stimulus tone. Feasible ranges for sound pressure levels were piloted by the authors. The range of presented sound pressure levels with different dynamic levels was reduced from the range that is possible for most orchestra instruments: 62 dB can be regarded as only moderately soft for pianissimo, as many instruments can reach considerably lower SPL. Furthermore, most instruments produce substantially more than 74 dB in fortissimo. The lower bound was chosen so that all tones could still be heard and tuned with relative ease. Correspondingly, the upper bound was limited in order to avoid presenting unpleasantly loud sound pressure levels, considering the duration of the experiment.
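The statement that equalizing leaves the relative harmonic amplitudes untouched can be illustrated with a short sketch: the C-weighted level of a harmonic tone is computed from per-harmonic pressures, and one common gain is applied to reach the target. The C-weighting curve uses the standard analytic form; the harmonic pressures are hypothetical, and the actual calibration in the study relied on the measurement fixture, not on a calculation of this kind.

```python
import math

P_REF = 20e-6  # reference sound pressure in pascals

def c_weighting_db(freq_hz):
    """C-weighting gain in dB (standard analytic form, about 0 dB at 1 kHz)."""
    f2 = freq_hz ** 2
    ratio = (12194.0 ** 2 * f2) / ((f2 + 20.6 ** 2) * (f2 + 12194.0 ** 2))
    return 20.0 * math.log10(ratio) + 0.06

def c_weighted_level_db(f0, rms_pressures):
    """C-weighted SPL of a harmonic tone from per-harmonic RMS pressures (Pa)."""
    power = 0.0
    for k, p in enumerate(rms_pressures, start=1):
        weight = 10.0 ** (c_weighting_db(k * f0) / 10.0)
        power += weight * (p / P_REF) ** 2
    return 10.0 * math.log10(power)

def equalize(f0, rms_pressures, target_db):
    """Scale all harmonics by one common gain so the tone reaches target_db."""
    gain_db = target_db - c_weighted_level_db(f0, rms_pressures)
    gain = 10.0 ** (gain_db / 20.0)
    return [p * gain for p in rms_pressures]

# Hypothetical harmonic pressures, not measured values.
harmonics = [0.02 / k for k in range(1, 11)]
scaled = equalize(55.25, harmonics, target_db=68.0)
print(round(c_weighted_level_db(55.25, scaled), 6))
```

Since every harmonic is multiplied by the same factor, ratios such as `scaled[1] / scaled[0]` equal the corresponding ratios in the unscaled spectrum.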

In this context, it should be noted that the stimulus tones presented at different dynamic levels were not just linearly scaled versions of one and the same spectrum. Besides the variation in sound pressure level between the three dynamic levels, the spectra of the stimulus tones also included the dynamic changes in the spectrum envelope (i.e., disproportionate amplification of higher overtones) exhibited by each instrument type.

SPL calibration values were measured with a fixture where the headphones rested against a small panel, and a calibrated sound level meter was attached flush through the fixture at the location of the ear canal entrance.

The listening test system was implemented with Max MSP 8.0 software (Cycling ‘74 Inc., CA, USA) and sound reproduction was performed with headphones (AKG K550, AKG Acoustics GmbH, Vienna, Austria). A Zoom H6 (Zoom Corporation, Tokyo, Japan) portable recorder was used as an external sound card and a digitally controlled headphone amplifier.

3.4 Statistical analysis and virtual pitch estimation

The collected data were first subjected to regression analysis with the pitch and instrument pairs as primary independent variables. Spectral centroids, as well as individual magnitudes of the 10 first harmonics, were calculated for each tone for subsequent analyses.

Statistical analysis was conducted by computing Pearson’s correlation coefficient between the calculated distances between the spectral centroids of the compared instrument tones and the perceptually adjusted tuning values of the same tones.

The term “tuning value” was introduced to indicate the difference in fundamental frequency (in cents) between the adjustable tone and the reference tone (in that order) when the two tones are perceived to have the same pitch. A positive tuning value means that the adjustable tone has a higher fundamental frequency than the reference when the two tones are perceived to be tuned in unison. Turned the other way round, a positive tuning value means that the reference tone is perceived to be higher in pitch than the adjustable tone when the two tones have the same fundamental frequency.
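The definition translates directly into the usual cents formula, 1200 times the base-2 logarithm of the frequency ratio. The function name and the example frequencies below are illustrative.

```python
import math

def tuning_value(f_adjustable, f_reference):
    """Adjustable minus reference, in cents, at the point of perceived unison."""
    return 1200.0 * math.log2(f_adjustable / f_reference)

f_ref = 55.25                               # A1 with A4 = 442 Hz
f_adj = f_ref * 2.0 ** (20.0 / 1200.0)      # hypothetical setting, 20 cents sharp
print(round(tuning_value(f_adj, f_ref), 6))
```

A positive result here means the listener raised the adjustable tone above the reference to hear a unison, that is, at equal fundamental frequency the reference would have sounded higher.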

The data were explored by correlation analysis and visual inspection for relationships between the dependent variable (tuning value) and the independent variables described above. Initial observations directed our interest toward the trend in the overall deviation of tuning values as a function of the fundamental frequency. Subsequently, we applied principal component analysis (PCA) for investigating potential latent connections between the spectral features and the perceived pitch.

Terhardt’s virtual pitch was estimated with the Matlab implementation (AARAE9 toolbox) of the original algorithm [18]. However, the virtual pitch estimation was observed to produce values deviating by more than one semitone from the nominal fundamental frequency for tones below A1. The same behavior was detected with the actual stimuli as well as with pure-tone test signals. Hence, the virtual pitch estimates that deviated from the nominal pitch by more than 100 cents were omitted.

3.5 Principal component analysis

In principal component analysis, the original multidimensional data are transformed to a new space with the aim of describing the data by a few salient data dimensions. The new dimensions are denoted as principal components and are orthogonal to each other. The first principal component explains the maximum amount of variance in the original data, and the following components explain successively decreasing amounts of the total variance.

PCA was conducted with the FactoMineR package in R environment [36]. The analysis was based on comparing the spectral differences between the presented instrument pairs. The magnitude differences (in dB scale) between the 10 first harmonics of the compared instrument tones were used as variables. The mean tuning values (see Sect. 3.4) of the same tones were used as supplementary variables. That is, the tuning values did not influence the PCA; they were only projected on the resulting dimensions. This approach facilitates correlation analysis between tuning values and the principal components which characterize the differences between compared spectra. Furthermore, instrument pairs and dynamic levels were included in the PCA as additional grouping factors.
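The role of a supplementary variable can be sketched in a few lines: the principal component is computed from the spectral differences alone, and the tuning values are only correlated with the resulting scores afterwards. This is a pure-Python illustration using power iteration on the covariance matrix, not the FactoMineR routine used in the study; whether variables were standardized is not stated above, so the unscaled covariance is an assumption, and all numbers are made up.

```python
import math

def center(rows):
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def scores(centered, component):
    """Project each centered observation onto the component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up level differences (dB) for three harmonics in five stimulus pairs.
level_diffs = [[-6.0, -4.0, 1.0], [-3.0, -2.5, 0.0], [0.0, 0.5, -1.0],
               [3.0, 2.0, 0.5], [6.0, 4.0, -0.5]]
tuning_values = [-12.0, -5.0, 1.0, 4.0, 11.0]   # made-up supplementary variable

centered = center(level_diffs)
pc1 = first_component(covariance(centered))
print(pearson(scores(centered, pc1), tuning_values))
```

The tuning values never enter `covariance` or `first_component`, which mirrors the statement that they did not influence the PCA and were only projected on the resulting dimensions.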

Ten harmonics were included in the PCA based on the following two principles. First, the 10 harmonics enabled a consistent comparison between all instrument pairs, including the synthesized sawtooth tone, which has only 10 harmonics that strictly follow the 1/f spectrum. Second, according to earlier studies by Ritsma [20] and Moore et al. [21], the lowest harmonics are the most significant for the pitch perception of tones with a low fundamental frequency.

The PCA was conducted in two separate runs. In the first run, stimulus pairs including the sawtooth were omitted from the PCA, as the synthesized spectral envelope was identical in all sawtooth tones. Otherwise, the difference between the instrument spectrum and the constant envelope of the sawtooth spectrum would dominate as the first PCA component and therefore complicate the interpretation of the instrument pairs. In the second run, the PCA included only pairs with the sawtooth and an instrument.

Furthermore, we sharpened the dataset for the PCA by performing a t-test with the criterion p < 0.05. The small-p-value dataset included only stimulus pairs where the variability of the tuning values was relatively small, and the differences between the means of the tuning values for the reference and adjusted tone, respectively, deviated from zero with statistical significance. The sharpened dataset included 91 out of 162 pairs with instruments only, and 46 out of 108 pairs that included the sawtooth waveform.
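One way to read this selection step is as a one-sample t-test of the tuning values of each stimulus pair against zero. That reading, the two-sided test, and the use of a fixed critical value for 31 listeners (30 degrees of freedom) are assumptions; the example data are hypothetical.

```python
import math
import statistics

T_CRIT_DF30 = 2.042  # two-sided 5% critical value of Student's t, 30 degrees of freedom

def t_statistic(values):
    """One-sample t statistic for the null hypothesis that the mean is zero."""
    n = len(values)
    return statistics.mean(values) / (statistics.stdev(values) / math.sqrt(n))

def keep_pair(values, t_crit=T_CRIT_DF30):
    """True when the mean tuning value differs from zero at p < 0.05."""
    return abs(t_statistic(values)) > t_crit

# Hypothetical tuning values (cents) from 31 listeners for two stimulus pairs.
consistent = [8.0 + (i % 5) for i in range(31)]
scattered = [(-1) ** i * (5.0 + i) for i in range(31)]
print(keep_pair(consistent), keep_pair(scattered))
```

A pair passes only when the listeners agree closely enough on the direction of the offset; widely scattered settings around zero are filtered out, which is the sense in which the dataset is sharpened.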

4 Results

The collected data were aggregated and inspected by the basic categories of instrument pair, dynamic level, and pitch.

Figure 1 shows the grand average of the subjects’ preferred tuning values, including all instrument combinations and dynamic levels, combined in a single boxplot. The order of presentation of the instruments in the stimulus pairs (reference tone – adjusted tone) is not taken into account. For example, the contrabass clarinet and the contrabassoon appear both as a reference and adjusted tone in different stimulus pairs.

Figure 1. Boxplot visualization of the grand averages of the tuning values across all instrument pairs and dynamic levels (i.e., sawtooth omitted). The tuning value indicates the difference in fundamental frequency (in cents) when the adjustable tone and the reference tone are perceived to have the same pitch. Horizontal axis indicates musical pitches.

The figure illustrates that the variability of the tuning values is substantially larger for the lowest tones and decreases towards the higher ones. At the lowest musical pitch (A0), the interquartile range (25th–75th percentile) spans an interval of about 70 cents, reducing to 20 cents at the highest musical pitch (A2). Outliers reaching ±100 cents are present from A0 up to C2. The median values lie just below the zero line for all musical pitches, the only exception being A0, which reaches zero.

When looking at tuning values for the individual instrument pairs at different dynamic levels in Figure 2 it is hard to identify any general trend other than the increase in variability towards the lowest tones observed in Figure 1. In Figure 2, the order between the reference instrument tone and the adjusted tone is preserved. It is seen that the tuning value medians often vary between positive and negative values within one instrument pair. This suggests that the pitches of the tones of a particular instrument are heard seemingly randomly higher or lower than the compared instrument.

Figure 2. Boxplot visualization of the tuning values for perceived unisons as a function of pitch, categorized by instrument combination and dynamic level. The tuning value indicates the difference in fundamental frequency (in cents) when the adjustable tone and the reference tone are perceived to have the same pitch. Following a typical convention, boxes span the interval from the 25th to the 75th percentile of the data, with the median denoted by horizontal lines. The whiskers extend to 1.5 times the interquartile range, beyond which the individual data points are displayed as solid dots. The crosses indicate Terhardt’s virtual pitch values for the tone pairs where the calculated virtual pitch was within ±100 cents from the nominal fundamental frequency. Tone pairs in column A: contrabass clarinet – double bass; B: contrabass clarinet – tuba; C: contrabassoon – contrabass clarinet; D: contrabassoon – tuba; E: double bass – contrabassoon; F: double bass – tuba.

Furthermore, the interquartile ranges (boxes) typically cross the zero line, which represents tuning to equal fundamental frequencies (0 cents). The results indicate that individual subjects perceived the pitch of the low-register tones with a relatively wide variation. Visual investigation of the instrument pairs suggests that a slightly more constant tendency could be found in the pairs with the contrabass clarinet and the tuba at pp level (Fig. 2B). The contrabass clarinet requires tuning upwards to match a unison with the tuba.

The pitch of the sawtooth tones (Fig. 3) was perceived higher than the pitch of the instrument in most cases. As for the stimulus pairs with instrument tones in Figure 2, the variability is high and of the same order of magnitude, increasing towards the lower musical pitches.

Figure 3

Tuning values for all combinations of the sawtooth tone and instruments. Musical pitches (x-axis) are identical to Figure 2. Due to image size and readability, only the A pitches (notes) have been labeled.

Whereas the mean tuning values across instruments and musical pitches appear random (Figs. 2 and 3), the degree of uncertainty in the tuning follows a more predictable curve. This effect is visualized in Figure 4 showing the mean absolute deviation in tuning values across all cases (instruments and sawtooth) and subjects for each of the nine musical pitches. The choice of mean absolute deviation (MAD) for describing the variability in tuning data, instead of the more commonly used standard deviation, was made in order to enable easier comparisons with other studies on pitch perception.
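As a minimal sketch of the variability measure named above, the mean absolute deviation can be computed as the average distance of the tuning values from their mean. Taking the deviation around the mean (rather than, for example, the median) is an assumption here, and the tuning values in the example are hypothetical.

```python
from statistics import fmean


def mean_absolute_deviation(values):
    """Mean absolute deviation around the arithmetic mean (assumed center)."""
    values = list(values)
    if not values:
        raise ValueError("at least one value is required")
    center = fmean(values)
    return fmean([abs(v - center) for v in values])


# Hypothetical tuning values in cents for one musical pitch.
tuning_values_cents = [-30.0, -10.0, 0.0, 10.0, 30.0]
print(mean_absolute_deviation(tuning_values_cents))  # 16.0
```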

Figure 4

Grand average of the mean absolute deviation in tuning values across all subjects (black line) as a function of musical pitch. The estimated 95% confidence interval using the Loess method is visualized with the shaded region. The mean absolute deviation curves of individual subjects are shown with light gray lines.

The data point for each musical pitch (thick line) reflects the spread of 930 tuning values (31 subjects × 10 stimulus pairs × 3 dynamic levels). Here, too, a higher variability in tuning for the lower musical pitches is evident. The mean absolute deviation increases from about 16 cents at A2 to 41 cents at A0. The calculated 95% confidence interval is narrow, about ±3 cents.

The mean absolute deviations for each of the 31 subjects are included as separate curves, each data point representing 30 tuning values. The most reliable subjects showed mean absolute deviations between 7 and 27 cents, whereas the most uncertain subjects performed seemingly randomly with mean absolute deviations between 25 and 55 cents (excluding outliers). Since each stimulus pair was evaluated only once by each listener (no repeats), a formal analysis of the intrasubject variability was not possible (see Sect. 5 for discussion).

To examine whether subjects who play a low-register instrument tune or evaluate more accurately, their performance was compared with that of the other subjects over all tones. Welch’s two-sample t-test did not show a statistically significant difference between the subject groups (t(24) = −0.29, p = 0.77).
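The test statistic of Welch’s two-sample t-test can be sketched as follows. This is an illustration using only the Python standard library: it returns the t statistic and the Welch–Satterthwaite degrees of freedom, but deliberately omits the p-value, which requires the cumulative t distribution. The function name and the group data are hypothetical and are not the data of this study.

```python
from statistics import fmean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard errors of the two sample means.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (fmean(sample_a) - fmean(sample_b)) / (va + vb) ** 0.5
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Hypothetical per-subject mean absolute deviations (cents) for two groups.
bass_players = [20.0, 22.0, 24.0, 26.0, 28.0]
other_players = [21.0, 24.0, 27.0]
t_stat, dof = welch_t(bass_players, other_players)
print(round(t_stat, 3), round(dof, 2))
```

Unlike Student’s t-test, this form does not pool the variances, which is why the degrees of freedom are generally not an integer.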

4.1 Spectral centroid

The correlation between the spectral centroid and mean tuning values was estimated for every combination of the instruments in the stimulus pair and for each of the nine musical pitches. Pearson’s correlation coefficients ranged from ρ = −0.09 (double bass and tuba) to ρ = 0.18 (double bass and contrabassoon). This outcome suggests that the spectral centroid is not a consistent or indicative measure for explaining the perceptual pitch difference between two complex tones.
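For orientation, a spectral centroid of a harmonic tone can be sketched as the amplitude-weighted mean of the harmonic frequencies. Weighting by linear amplitude (rather than power) and restricting the sum to the harmonic peaks are assumptions of this sketch and may differ from the procedure used in the study. The function name is hypothetical.

```python
def spectral_centroid_hz(f0_hz, harmonic_amplitudes):
    """Amplitude-weighted mean frequency of harmonics 1..N of a tone.

    harmonic_amplitudes[k] is the linear amplitude of harmonic k + 1.
    """
    total = sum(harmonic_amplitudes)
    if total <= 0:
        raise ValueError("amplitudes must sum to a positive value")
    weighted = sum((k + 1) * f0_hz * a
                   for k, a in enumerate(harmonic_amplitudes))
    return weighted / total


# Ideal sawtooth with 10 harmonics (amplitude 1/k) at a fundamental of 55 Hz.
sawtooth = [1.0 / k for k in range(1, 11)]
print(round(spectral_centroid_hz(55.0, sawtooth), 1))
```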

4.2 Virtual pitch

The calculated virtual pitches showed no systematic correlation with the observed mean tuning values (see Fig. 2, virtual pitches marked by crosses). However, in a few cases (e.g., contrabassoon-contrabass clarinet pair in mf, see Fig. 2C), virtual pitch values follow the mean tuning values moderately well. Such effects were investigated statistically by calculating the correlations between mean tuning and virtual pitch for stimulus pairs. The highest correlation for the instrument pairs occurred between contrabass clarinet and tuba (ρ = 0.17). With the sawtooth tone included, the combination of sawtooth and contrabassoon produced ρ = 0.24. In all other cases, the correlation coefficients were lower. In short, no significant correlation between the mean tuning values and the virtual pitches could be observed.
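The correlations reported in Sections 4.1 and 4.2 can be illustrated with a minimal sketch of Pearson’s correlation coefficient, the measure named in Section 4.1. The function name and the example values are hypothetical and serve only to show the calculation.

```python
from statistics import fmean


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient of two sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / (sxx * syy) ** 0.5


# Hypothetical predictor values (e.g. model pitch shifts, in cents) and
# hypothetical mean tuning values (cents) for the same stimulus pairs.
predicted = [-4.0, 1.0, 3.0, 6.0, 9.0]
mean_tuning = [5.0, -8.0, 12.0, -3.0, 7.0]
print(round(pearson_r(predicted, mean_tuning), 2))
```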

4.3 Principal component analysis

The correlation analyses above did not reveal any consistent relationship between the mean tuning and overall spectral properties, represented by the spectral centroid and virtual pitch. A more detailed analysis focused on differences in the harmonic structure of the compared tones as a possible cause of the large variability in the perceived pitches. “Difference spectra” were calculated and subjected to PCA, which decomposes the spectral differences into fewer common salient features. The difference spectra were calculated by subtracting the levels of the harmonics of the adjusted tone from the levels of the harmonics of the fixed reference tone on a decibel scale. An example of a difference spectrum is provided in Figure 5. The reference tone is the sawtooth waveform showing a 1/f magnitude spectrum. The adjusted instrument is the contrabass clarinet, which, similar to the clarinet, is characterized by attenuated second and fourth harmonics. The subtraction of the magnitude spectra results in the difference spectrum, shown in Figure 5B. Difference spectra spanning the first 10 harmonics were calculated for all included pairs of stimulus tones.

Figure 5

Example of the calculation of the difference spectrum for a pair of tones in a stimulus pair. A: spectra of a sawtooth tone and a contrabass clarinet with musical pitch C2 at pp dynamic. B: difference spectrum.
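The calculation of a difference spectrum can be sketched as below: the harmonic levels of the adjusted tone are subtracted from those of the reference tone on a decibel scale, following the description in the text. The ideal 1/f sawtooth levels follow from the text as well. The clarinet-like levels are hypothetical placeholder values and not measured data, and the function names are assumptions.

```python
import math


def sawtooth_levels_db(n_harmonics):
    """Harmonic levels of an ideal sawtooth (amplitude 1/k), in dB
    relative to the fundamental."""
    return [20.0 * math.log10(1.0 / k) for k in range(1, n_harmonics + 1)]


def difference_spectrum_db(reference_db, adjusted_db):
    """Reference level minus adjusted level, harmonic by harmonic."""
    if len(reference_db) != len(adjusted_db):
        raise ValueError("both spectra need the same number of harmonics")
    return [r - a for r, a in zip(reference_db, adjusted_db)]


# Hypothetical clarinet-like levels (dB re fundamental) with weak even
# harmonics; illustrative values only.
clarinet_like_db = [0.0, -30.0, -8.0, -35.0, -14.0,
                    -28.0, -20.0, -30.0, -26.0, -34.0]
reference_db = sawtooth_levels_db(10)
for k, d in enumerate(difference_spectrum_db(reference_db, clarinet_like_db), 1):
    print(f"harmonic {k}: {d:+.1f} dB")
```

Positive values mark harmonics that are stronger in the reference tone than in the adjusted tone.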

As mentioned, the PCA was based on a sharpened dataset consisting of stimulus pairs whose mean tuning deviated from zero with statistical significance (see Sect. 3.5). The PCA solutions for the difference spectra were obtained separately for tone pairs without and with the sawtooth tone (see Fig. 6). The first component (PC1) was identified as characterizing an emphasis of the lowest harmonics and very weak even-order harmonics. The next component (PC2) illustrates a combination of pronounced 2nd and 4th harmonics as well as reduced spectral balance at further harmonics. Notably, PCA gives this component an opposing polarity in the analyses with and without the sawtooth tone. PC3 describes the absence of the fundamental and harmonics 5–7, as well as the second harmonic in the analysis without the sawtooth tone. Components with a strong contrast between odd and even harmonics can be associated with a contrabass clarinet-type tone, which often features attenuated even harmonics.

Figure 6

PCA components of differences between the magnitudes of the first 10 spectral peaks, eigenvalues and explained variances. Top: tone pairs with instruments; Bottom: tone pairs including sawtooth waveform. Sharpened dataset (statistically significant tuning deviation at p < 0.05).

Although the eigenvalues of PC4 and PC5 do not exceed unity, these remaining components are shown here in a residual sense.

For instrument tone pairs (Fig. 6, top row), the first two principal components together explain nearly half of the total variance of the spectral differences (28% and 21% respectively). The corresponding values for the sawtooth–instrument pairs were 24% and 23%, respectively.
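The relation between eigenvalues and explained variance can be illustrated with a small, dependency-free sketch. It assumes that the PCA is performed on standardized variables (a correlation-matrix PCA), which is consistent with comparing eigenvalues to unity but is not stated explicitly here. In that case the explained variance of a component is its eigenvalue divided by the number of variables. The eigenvalues are obtained by power iteration with deflation, a technique chosen only for brevity, and the data matrix is hypothetical.

```python
from statistics import fmean, stdev


def correlation_matrix(rows):
    """Correlation matrix of the columns of rows (observations x variables).

    Assumes that no column is constant.
    """
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    z = [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]
    n, p = len(z), len(cols)
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpairs(matrix, n_components, n_iter=500):
    """Largest eigenpairs of a symmetric matrix via power iteration
    with deflation."""
    p = len(matrix)
    a = [row[:] for row in matrix]
    pairs = []
    for _ in range(n_components):
        start = [1.0 / (i + 1) for i in range(p)]  # arbitrary start vector
        norm = sum(x * x for x in start) ** 0.5
        v = [x / norm for x in start]
        for _ in range(n_iter):
            w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = sum(x * x for x in w) ** 0.5
            if norm < 1e-12:  # remaining matrix is numerically zero
                break
            v = [x / norm for x in w]
        eigenvalue = sum(v[i] * a[i][j] * v[j]
                         for i in range(p) for j in range(p))
        pairs.append((eigenvalue, v))
        # Remove the found component before searching for the next one.
        a = [[a[i][j] - eigenvalue * v[i] * v[j] for j in range(p)]
             for i in range(p)]
    return pairs


# Hypothetical data: 4 observations of 3 variables.
data = [[1.0, 2.0, 1.0], [2.0, 4.0, -1.0], [3.0, 6.0, -1.0], [4.0, 8.0, 1.0]]
n_vars = len(data[0])
for eigenvalue, _ in leading_eigenpairs(correlation_matrix(data), 2):
    print(round(eigenvalue, 3), round(eigenvalue / n_vars, 3))
```

In practice a tested library routine would be preferable to hand-written power iteration, which converges slowly when eigenvalues are close to each other.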

The biplots in Figure 7 show the differences between instrument pairs in the PC1–PC2 dimensions. For improved visual clarity, the clouds of data points for individual instrument pairs are shown with confidence ellipses. The arrows represent the 5 harmonic difference components (HDC) individually. Their lengths and directions indicate their contributions to the corresponding PC dimensions.

Figure 7

PCA biplots of the first two principal components with the sharpened dataset (statistically significant tuning deviation, p < 0.05). A: tone pairs with instruments, B: tone pairs including sawtooth waveform. The clusters of data points are indicated with confidence ellipses. Pairs with contrabass clarinet are highlighted with thick line type. For correlations between HDC components (arrows) and the PC axes see Section 3.5.

Together, the confidence ellipses, their centers, and the directions of HDC arrows illustrate the variability of spectral differences between instrument pairs. The location of the center of the confidence ellipse describes the average composition of the spectral difference in terms of PC1 and PC2 associated with a certain instrument pair. The angle of the ellipse major axis characterizes the strongest variability of the spectral differences within the respective pair. The smallest confidence ellipses are found for instrument pairs where the spectral difference remains constant regardless of the fundamental frequency and dynamic level.

The key findings can be interpreted as follows: In Figure 7A, the pair contrabassoon-contrabass clarinet shows a strong average deviation from the origin towards positive PC1 and negative PC2 values. The relations to the HDC arrows for harmonics 1, 2, and 4 suggest that the reference instrument tone of the pair (cbsn) contains substantially more even harmonics but its fundamental is weaker. For this instrument pair, the overall variation in spectral differences across tones and dynamic levels includes both PC1 and PC2.

A seemingly contradictory result can be observed for tone pairs including the contrabass clarinet (bolded ellipses in Fig. 7A). Since the contrabass clarinet is included both as a reference tone (with double bass and tuba) and an adjustable tone (with contrabassoon), subtraction of the adjustable tone spectrum from the reference tone spectrum results in a partially opposing spectral difference for contrabass clarinet pairs (opposite signs of the harmonic magnitude differences). Hence, the slopes and directions from the origin of the corresponding confidence ellipses vary between pairs including the contrabass clarinet as a reference and adjustable tone, respectively.

Within the instrument group, the centers of all three ellipses including the contrabass clarinet deviate from the origin more than those of the other instrument pairs, which in turn are notably concentric (Fig. 7A). This strongly suggests that the absence of even harmonics in the clarinet-type spectrum is the most differentiating single spectral feature.

The influence of the lowest even-order HDC, i.e., second and fourth harmonics, is prominent in both biplots in Figure 7 (without and with sawtooth tone pairs). This is particularly apparent in the pair with the sawtooth tone as a reference and the contrabass clarinet as an adjustable tone in Figure 7B. Here, HDC2 and HDC4 both point in the positive PC2 direction, suggesting that these harmonics were substantially stronger in the sawtooth tone than in the contrabass clarinet. The positive sign of HDC2 and HDC4 is reflected in the mean tuning values for sawtooth and contrabass clarinet in Figure 3 which all lie above or on the zero line. This indicates that the weak second and fourth harmonics in the contrabass clarinet contributed to a lowering of the perceived pitch compared to the sawtooth. As a consequence, the fundamental frequency of the contrabass clarinet has to be adjusted to a higher value to obtain a unison with the sawtooth.

For comparison, the analyses above were repeated on the entire unsharpened tuning dataset. That is, all tone pairs were included regardless of the p-value of the mean tuning values. The PCA result remained generally similar to the sharpened dataset results, and the explained variances did not differ by more than 1%.

Music dynamics had in general only a marginal effect on the perceived pitch differences according to the PCA, although the tuning values of different instrument combinations in Figures 2 and 3 showed some variations between dynamic levels.

Following the PCA, the salient harmonic difference components between the tone pairs were compared with the mean tuning values (see Sect. 3.5). On the sharpened dataset, which includes only tone pairs with statistically significant mean tuning values, PC2 showed the highest correlation with the mean tuning value for stimulus pairs including the sawtooth as reference (ρ = 0.31). For stimulus pairs with two instruments, PC1 had the highest correlation with the mean tuning value (ρ = 0.24). The same PCs showed the strongest correlations with mean tuning across the full dataset, but the correlation coefficients were lower: ρ = 0.14 for sawtooth pairs (PC2) and ρ = 0.22 for instrument pairs (PC1).

Referring to the shape of the difference spectra in Figure 6, this result suggests that the absence of even harmonics has the most substantial influence on the lowering of the perceived pitch of low-register complex tones. Thus, the adjustable instrument tone with missing even harmonics (contrabass clarinet) was tuned higher to match the perceived pitch of a tone with a more uniform harmonic spectrum (see Figs. 2B, 2C and 3).

5 Discussion

A striking result of the listening test is the overall large spread in the tuning values and the large variations between subjects. Altogether, the results reflect a large uncertainty in the perception of pitch at low frequencies.

Most of the participants reported that tuning of stimulus pairs at higher fundamental frequencies (>50 Hz) was straightforward and easy. Despite this apparently easy task, a remarkably large spread in tuning adjustments was evident in the results. For the lowest musical pitches, i.e., A0 (27.6 Hz) and C1 (32.5 Hz), the variability was high because it was difficult even to identify which tone was presented, which indicates that the stimuli gave very weak cues to the perception of melodic pitch. This observation is consistent with the results of a study by Krumbholz et al. [33], where the lower limit for temporal processing of pitch was reported to be about 30 Hz.

The high variability between and within subjects does not, however, conceal the general trend in the data, showing larger uncertainty in tuning towards lower musical pitches. This is evident from Figure 4. The grand average of the mean absolute deviation (black line), which is based on 930 observations (30 × 31) for each of the nine musical pitches, shows a narrow 95% confidence interval of only about ±3 cents.

A formal analysis of the intrasubject variability in tuning was not possible, as each stimulus pair was evaluated only once by each listener (no repeats). It is clear, however, that within the group of 31 professional musicians who participated in the study, there are large individual differences in pitch perception and tuning accuracy. As seen in Figure 4, a few participants (about 5) had quite random and wide mean deviations. A majority of participants (about 15) were placed close to the average and a smaller group (about 10) performed considerably better. The best subjects lie consistently about 10 cents below the grand average line.

It is interesting to compare our results with previous studies on pitch discrimination, in particular measurements of the just noticeable difference (JND) in pitch, usually reported as the corresponding difference in fundamental frequency. The great majority of studies of JNDs have been made using pure tone stimuli. The results are therefore not directly comparable with our experiments, in which pairs of stimulus tones obtained from samples of musical instruments were compared. Published data in the low-frequency register are sparse, but a common result from previous studies is that JNDs worsened significantly towards the lowest musical pitches, in agreement with our results. The grand average curve of the mean absolute deviation in Figure 4 replicates the JND curves of previous studies, but it is shifted down about 30–60 cents compared to pure tone stimuli [37]. Using harmonic tones apparently improves the pitch discrimination.

In some previous experiments, pairs of complex harmonic tones differing in a few spectral properties were compared, which is still far from our stimuli based on tones from musical instruments. In a recent study by Mehta and Oxenham [35], 12-harmonic complex tones were used as stimuli, which resembles our 13-harmonic sawtooth stimulus. They used stimuli where the three lowest harmonics were absent, which definitely worsened the JND. The aim was to study the influence of listening to individual harmonics rather than to the “overall” pitch defined by the periodicity of the tone. The deviation from our results in Figure 4 was large, especially in the lowest register, where they reported JNDs up to 130 cents higher at 30 Hz (about A0). Their experiment is also relevant for studying the pitch perception of bass instruments with weak or missing fundamentals in the lowest register, for example, the double bass.

In our study, the participants were instructed to listen to the tones holistically without paying attention to individual harmonics. However, a few listeners may occasionally have been hypersensitive to some frequencies, which may have caused an unwanted emphasis on some harmonics and, in turn, shifted the listener into a “wrong” (spectral) listening mode. This effect could result in intermittent outliers in otherwise more consistent data. However, according to Mehta and Oxenham [35], this listening option is not available for the lowest pitches as all harmonics are spectrally unresolved in the auditory periphery.

The contrabass clarinet was reported to be the most challenging stimulus. This opinion was supported by the PCA results which indicate that the contrabass clarinet evoked a pitch perception that was different from the other instruments. It has a deviating spectrum contour, showing alternating strong and weak harmonics and a strong fundamental frequency (Fig. 5). The PCA results showed that the strength of the second and fourth harmonics influenced the perceived pitch, as reflected in the mean tuning values. The contrabass clarinet required tuning upwards to match a unison with most other instruments and the sawtooth. That means that the pitch of the contrabass clarinet was perceived to be lower than other instruments when adjusted to the same fundamental frequency. The reason could be that the weak or almost lacking second and fourth harmonics (octave and double octave above the fundamental) may make the pitch perception more difficult.

Although a direct comparison of the mean values and variability in tuning values for different instrument combinations in Figures 2 and 3 did not indicate any particular difficulties in the pitch perception of the contrabass clarinet, the PCA suggests that the harmonic structure with weak low even harmonics had a major influence on the pitch perception and tuning values for stimulus pairs including this instrument. Therefore, we could conclude that due to the spectral difference, the contrabass clarinet-like sound was perceived to have a lower pitch in the lowest register.

According to our earlier psychoacoustic study on the octave enlargement phenomenon [7], the general stretching curve is almost horizontal below A2 (110 Hz). However, in that study, the clarinet curve differed from other instruments and was more like a “J”-shape on its side, where the lowest register bends upwards. This effect may be explained by the findings in this study. If the pitch of the clarinet is perceived lower than would be expected from the periodicity of the tone, the pitch has to be adjusted upwards to achieve a perceptually correct octave interval.

Apparently, the influence of the relative strength of the harmonics on the perceived pitch is an important factor to consider. In this connection, it should be noted that since the sound pressure levels were equalized within pairs in our experiment, the prominence of the higher harmonics was relatively emphasized for instruments having weak or missing fundamental, like the double bass in the lowest octave.

Furthermore, due to the low sensitivity of the human ear in the low-frequency region [38], a weak fundamental may even have been completely inaudible for the lowest tones. Altogether, this may have influenced the perceived pitch.

Regarding the subjects’ ability to hear differences between presented spectra, the conventional audiogram does not reveal much about real sensitivity to individual harmonics. Between the sparse measurement points of the audiogram, narrow frequency bands may be differently sensitive, which can cause amplification of some harmonics and attenuation of others. This, in part, may also affect the perceived pitch and explain the relatively large intersubject difference, as suggested by Morgan and Galambos [16]. If an audiogram with narrow frequency bands (e.g., one-semitone steps instead of octaves or fifths) were collected from participants, it might help similar studies to find correlations between hearing sensitivity, perceived pitch, and spectral envelope.

We did not find any signs that the players of bass instruments would have performed better on the listening test. Probably, the accuracy of pitch perception in the low register has its limits which do not depend significantly on the listener’s training or background.

In the context of electronic tuning machines, it may be relevant to consider their usefulness, especially in the lowest register even though they are technically precise. This is an important question, particularly in the case of clarinet instruments, whose perceived pitch seems to be lower compared to other instruments.

6 Conclusions

The conducted listening experiment showed that the perceived pitch of low-register complex tones (derived from musical instrument samples) exhibits large variability and is highly dependent on the listener, spectrum, and dynamic level. Using 31 professional musicians as participants, the spread (mean absolute deviation) in the tuning of a melodic interval to unison, using complex tones with different spectra, increased continuously from 16 to 41 cents in the low-frequency range from 110 Hz (A2) to 27.6 Hz (A0). The result suggests that the participants were not able to determine an unambiguous reference tone over a considerable part of this frequency range. That is, towards the lowest register the uncertainty in the pitch judgments increased. However, it is debatable whether that is due to a reduced judgment ability (central) or inherent uncertainty in the information in the auditory nerve (peripheral). From a musical perspective, the result would imply that a melodic line in the bass register may be perceived as undefined, in particular in the lowest octave A0–A1.

The incongruity between the perceived pitch of the reference tones (using four different spectra) and Terhardt’s model of the virtual pitch was substantial.

A prominent result achieved by PCA was that the musicians perceived the pitch of tones derived from samples of the contrabass clarinet to be somewhat lower than the pitch of other bass instruments. The plausible reason for this slight pitch shift is that the second and fourth harmonics are attenuated in the contrabass clarinet spectrum.

Conflict of interest

The authors declare no conflict of interest.

Data availability statement

The research data (see Sect. 3.2) associated with this article are available in Zenodo, under the references [31] and [32].

Acknowledgments

This research was supported by the Academy of Finland (Project No. 289300), Niilo Helander Foundation, and Alfred Kordelin Foundation. Open access funded by Helsinki University Library.

References

  1. H. Fletcher: Loudness, pitch, and the timbre of musical tones and their relation to the intensity, the frequency, and the overtone structure. The Journal of the Acoustical Society of America 6, 2 (1934) 59–69.
  2. H.B. Wallace: Musical instrument tuner. U.S. Patent 7 285 710, Oct 23, 2007.
  3. J. Schouten, R. Ritsma, B. Lopes Cardozo: Pitch of residue. The Journal of the Acoustical Society of America 34, 8 (1962) 1418–1424.
  4. E. Terhardt: Pitch, consonance, and harmony. The Journal of the Acoustical Society of America 55, 5 (1974) 1061–1069.
  5. E. Terhardt, G. Stoll, M. Seewann: Pitch of complex signals according to virtual-pitch theory: Tests, examples, and predictions. The Journal of the Acoustical Society of America 71, 3 (1982) 671–678.
  6. I. Nelken: Processing of complex sounds in the auditory system. Current Opinion in Neurobiology 18, 4 (2008) 413–417.
  7. J. Jaatinen, J. Pätynen, K. Alho: Octave stretching phenomenon with complex tones of orchestral instruments. The Journal of the Acoustical Society of America 146, 5 (2019) 3203–3214.
  8. P.G. Singh, I.J. Hirsh: Influence of spectral locus and F0 changes on the pitch and timbre of complex tones. The Journal of the Acoustical Society of America 92, 5 (1992) 2650–2661.
  9. F. Russo, W. Thompson: An interval size illusion: The influence of timbre on the perceived size of melodic intervals. Perception and Psychophysics 67, 4 (2005) 559–568.
  10. A. Vurma, J. Ross: Timbre-induced pitch deviations of musical sounds. Journal of Interdisciplinary Music Studies 1, 1 (2007) 33–50.
  11. A. Vurma, M. Raju, A. Kuuda: Does timbre affect pitch?: Estimations by musicians and non-musicians. Psychology of Music 39, 3 (2011) 291–306.
  12. S. Wold, K. Esbensen, P. Geladi: Principal component analysis. Chemometrics and Intelligent Laboratory Systems 2, 1–3 (1987) 37–52.
  13. E. Terhardt, G. Stoll, M. Seewann: Algorithm for extraction of pitch and pitch salience from complex tonal signals. The Journal of the Acoustical Society of America 71, 3 (1982) 679–688.
  14. S.S. Stevens: The relation of pitch to intensity. The Journal of the Acoustical Society of America 6, 3 (1935) 10.
  15. W.B. Snow: Change of pitch with loudness at low frequencies. The Journal of the Acoustical Society of America 8, 1 (1936) 14–19.
  16. C.T. Morgan, R. Galambos: A reinvestigation of the relation between pitch and intensity. The Journal of the Acoustical Society of America 15, 1 (1943) 77.
  17. A. Cohen: Further investigation of the effects of intensity upon the pitch of pure tones. The Journal of the Acoustical Society of America 33, 10 (1961) 1363–1376.
  18. D. Cabrera: AARAE 9, a Matlab-based measurement, processing, and analysis environment for audio and acoustic system responses. 2017. https://github.com/densilcabrera/aarae. Online: Accessed 23-Oct-2020.
  19. J.M. Grey, J.W. Gordon: Perceptual effects of spectral modifications on musical timbres. The Journal of the Acoustical Society of America 63, 5 (1978) 1493.
  20. R.J. Ritsma: Frequencies dominant in the perception of the pitch of complex sounds. The Journal of the Acoustical Society of America 42, 1 (1967) 191–198.
  21. B.C.J. Moore, B.R. Glasberg, R.W. Peters: Relative dominance of individual partials in determining the pitch of complex tones. The Journal of the Acoustical Society of America 77, 5 (1985) 1853–1860.
  22. H.M. Jackson, B.C.J. Moore: The dominant region for the pitch of complex tones with low fundamental frequencies. The Journal of the Acoustical Society of America 134, 2 (2013) 1193–1204.
  23. H. Dai: On the relative influence of individual harmonics on pitch judgment. The Journal of the Acoustical Society of America 107, 2 (2000) 953–959.
  24. G.M. Bidelman: Subcortical sources dominate the neuroelectric auditory frequency-following response to speech. NeuroImage 175, May (2018) 56–69.
  25. R. Batra, K. Shigeyuki, V.L. Maher: The frequency-following tones to continuous tones in humans. Hearing Research 21 (1986) 167–177.
  26. T. Lu, X. Wang: Temporal discharge patterns evoked by rapid sequences of wide- and narrowband clicks in the primary auditory cortex of cat. Journal of Neurophysiology 84, 1 (2000) 236–246.
  27. C.J. Plack, D. Barker, D.A. Hall: Pitch coding and pitch processing in the human brain. Hearing Research 307 (2014) 53–64.
  28. V. De Angelis, F. De Martino, M. Moerel, R. Santoro, L. Hausfeld, E. Formisano: Cortical processing of pitch: Model-based encoding and decoding of auditory fMRI responses to real-life sounds. NeuroImage 180, March (2018) 291–300.
  29. A. Gerson, J.L. Goldstein: Evidence for a general template in central optimal processing for pitch of complex tones. The Journal of the Acoustical Society of America 63, 2 (1978) 498–510.
  30. A. Kohlrausch, A. Houtsma: Pitch related to spectral edges of broadband signals. Philosophical Transactions of the Royal Society 336, 1278 (1992) 375–382.
  31. J. Jaatinen, J. Pätynen, T. Lokki: Uncertainty in tuning evaluation with low-register complex tones of orchestra instruments. “Demonstration signals for low-note experiment”. Dataset (Version v1.0) [Data set]. Zenodo, 2021. https://doi.org/10.5281/zenodo.4697590.
  32. J. Jaatinen, J. Pätynen, T. Lokki: Uncertainty in tuning evaluation with low-register complex tones of orchestra instruments. “Demonstration figures for low-note experiment”. Dataset (Version v1.0) [Data set]. Zenodo, 2021. https://doi.org/10.5281/zenodo.4697596.
  33. K. Krumbholz, R.D. Patterson, D. Pressnitzer: The lower limit of pitch as determined by rate discrimination. The Journal of the Acoustical Society of America 108, 3 (2000) 1170–1180.
  34. T. Rogala, A. Miśkiewicz, P. Rogowski: Identification of harmonic musical intervals: The effect of pitch register and tone duration. Archives of Acoustics Dec., 4 (2017) 591–600.
  35. A.H. Mehta, A.J. Oxenham: Effect of lowest harmonic rank on fundamental-frequency difference limens varies with fundamental frequency. The Journal of the Acoustical Society of America 147, 4 (2020) 2314–2322.
  36. S. Lê, J. Josse, F. Husson: FactoMineR: An R package for multivariate analysis. Journal of Statistical Software 25, 1 (2008) 1–18.
  37. A. Miśkiewicz: Scientific legacy of professor Andrzej Rakowski in current studies of pitch discrimination in music. Vibrations in physical systems 30, 2019113 (2019) 1–8.
  38. ISO: 226: 2003: Acoustics-Normal equal-loudness-level contours. International Organization for Standardization 63, 2003.

Cite this article as: Jaatinen J., Pätynen J. & Lokki T. 2021. Uncertainty in tuning evaluation with low-register complex tones of orchestra instruments. Acta Acustica, 5, 49.

All Figures

thumbnail Figure 1

Boxplot visualization of the grand averages of the tuning values across all instrument pairs and dynamic levels (i.e., sawtooth omitted). The tuning value indicates the difference in fundamental frequency (in cents) when the adjustable tone and the reference tone are perceived to have the same pitch. Horizontal axis indicates musical pitches.

In the text
thumbnail Figure 2

Boxplot visualization of the tuning values for perceived unisons as a function of pitch, categorized by instrument combination and dynamic level. The tuning value indicates the difference in fundamental frequency (in cents) when the adjustable tone and the reference tone are perceived to have the same pitch. Following a typical convention, boxes span an interval from 25th to 75th data quantiles with the median denoted by horizontal lines. The whiskers extend to 1.5 times the interquantile range, beyond which the individual data points are displayed as solid dots. The crosses indicate Terhardt’s virtual pitch values for the tone pairs where the calculated virtual pitch was within ±100 cents from the nominal fundamental frequency. Tone pairs in column A: contrabass clarinet – double bass; B: contrabass clarinet – tuba; C: contrabassoon – contrabass clarinet; D: contrabassoon – tuba; E: double bass – contrabassoon; F: double bass – tuba.

Figure 3

Tuning values for all combinations of the sawtooth tone and instruments. Musical pitches (x-axis) are identical to those in Figure 2. To keep the image readable at this size, only the A notes have been labeled.

Figure 4

Grand average of the mean absolute deviation in tuning values across all subjects (black line) as a function of musical pitch. The estimated 95% confidence interval using the Loess method is visualized with the shaded region. The mean absolute deviation curves of individual subjects are shown with light gray lines.

Figure 5

Example of the calculation of the difference spectrum for a pair of tones in a stimulus. A: spectra of a sawtooth tone and a contrabass clarinet with musical pitch C2 at pp dynamic. B: difference spectrum.

Figure 6

PCA components of the differences between the magnitudes of the first 10 spectral peaks, with eigenvalues and explained variances. Top: tone pairs with instruments; bottom: tone pairs including the sawtooth waveform. Sharpened dataset (statistically significant tuning deviation at p < 0.05).

Figure 7

PCA biplots of the first two principal components with the sharpened dataset (statistically significant tuning deviation at p < 0.05). A: tone pairs with instruments; B: tone pairs including the sawtooth waveform. The clusters of data points are indicated with confidence ellipses. Pairs with contrabass clarinet are highlighted with thick lines. For correlations between HDC components (arrows) and the PC axes, see Section 3.5.
