Open Access
Issue
Acta Acust.
Volume 5, 2021
Article Number 1
Number of page(s) 9
Section Hearing, Audiology and Psychoacoustics
DOI https://doi.org/10.1051/aacus/2020027
Published online 16 December 2020

© F. Wendt & R. Höldrich, Published by EDP Sciences, 2021

Licence: Creative Commons. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

The precedence effect refers to a group of perceptual phenomena that allow us to localize sound sources in challenging reverberant environments. Research has shown that the auditory system weights spatial cues differently over the stimulus duration; the sound source direction is determined mainly by the onset of the stimulus, where localization cues of the direct sound prevail. Corresponding studies typically utilize a two-source paradigm, representing a leading direct sound and a lagging specular reflection at the same intensity. The parameter of interest for this setup is the lead-lag delay. For very short delays between direct sound and reflection below 1 ms, summing localization occurs and the two sounds appear as a single fused image located between the two sound sources. As the delay slightly exceeds the limit for summing localization, the precedence effect becomes active. In this range the auditory system suppresses localization cues carried by reflections and the leading direct sound dominates the perceived location of a single auditory image. A further increase of the delay widens the auditory image until it breaks apart at the upper limit of echo suppression. At this transition, the so-called echo threshold, the reflection becomes audible as a second auditory image. This threshold is especially relevant because it can be used to determine the strength of precedence; increased echo threshold delays imply a stronger echo suppression and consequently a stronger precedence effect.

Many investigations have contributed to our understanding of the precedence effect by measuring the echo threshold with reflections that are identical to the direct sound (see [1, 2] for reviews). However, such specular reflections represent rather artificial conditions, which require an infinitely large, flat, rigid reflecting surface [3]. When the sound encounters a real-life wall, it is inevitably redirected also into angles other than the specular reflection angle. There are only a handful of contributions which studied the precedence effect in more realistic scenarios using wall panels for creating the reflection, e.g. [4–6].

This contribution examines how reflection properties of a wall influence the echo suppression for pulsed noise signals. First, an overview of the relevant literature is given in Section 2. Subsequently, Section 3 presents a simulation for diffuse reflection responses based on Lambertian reflection. The simulation considers both temporal and directional diffusion of the reflected sound by incorporating the scattering coefficient. The perceptual influences of diffuse reflections are studied in a listening experiment. Section 4 outlines the experimental design for measuring the echo threshold and the masked threshold. The experimental results are discussed in Section 5. Modeling attempts of the results of both thresholds are presented in Section 6. Section 7 relates the findings to previous studies and Section 8 summarizes the contribution.

2 Echo suppression for non-specular reflections

Specular reflections represent one of two extreme conditions which have been identified for rigid surfaces. When the direct sound of a source is reflected specularly, both magnitude spectrum and phase spectrum are maintained and the sound bounces off the surface with the same angle as it encountered the surface. The other extreme condition is the diffuse reflection that occurs when the reflected energy is scattered [3, 7]. It yields a reflected sound that spreads directionally and temporally.

In room acoustics, reflections on rigid walls are typically approximated by dividing the reflected energy into two components, specular and diffuse, with the scattering coefficient s defining the ratio of the scattered energy to the total energy reflected by the surface. Thus, the scattering coefficient is defined in the interval s = [0, 1] and a value of 1 corresponds to a fully diffuse reflection with no specular reflection component.

While many psychoacoustic experiments have contributed to our understanding of the precedence effect with specular reflections, knowledge of the influence of spatio-temporally diffuse reflections is still rough and partly contradictory. The temporal diffusion, for example, yields spectral colorations of the reflection. According to Blauert and Divenyi [8], suppression occurs for any natural reflection where the spectrum of the lag contains exclusively those regions that are present in the spectrum of the lead. However, for reflections differing in magnitude spectrum and phase spectrum, the suppression was found to be generally weaker compared to a specular reflection, e.g. [8–10].

Robinson et al. [11] studied the influence of temporal diffusion on the echo threshold delay by modeling visual observations of measured reflection responses from diffusors. Their simulation approximates the impulse response of diffuse reflections by multiplying Gaussian white noise with an envelope given by the probability density function of a gamma distribution. Resulting echo threshold delays, defined by the temporal energy centroid of impulse responses, revealed mostly no difference between the suppression of compact and temporally diffuse reflections.
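The construction described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in [11]: the function names, the sampling rate, and the gamma parameters (shape 2, scale 1 ms) are assumptions chosen only to make the example concrete.

```python
import math
import random

def gamma_pdf(t, shape, scale):
    """Gamma probability density at time t (shape >= 1 assumed)."""
    if t < 0.0:
        return 0.0
    return (t ** (shape - 1.0) * math.exp(-t / scale)
            / (math.gamma(shape) * scale ** shape))

def temporally_diffuse_response(n_samples, fs, shape, scale, seed=0):
    """Gaussian white noise multiplied by a gamma-pdf envelope."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) * gamma_pdf(n / fs, shape, scale)
            for n in range(n_samples)]

def energy_centroid(h, fs):
    """Temporal centroid (in seconds) of the energy h[n]**2."""
    energy = [x * x for x in h]
    return sum(n / fs * e for n, e in enumerate(energy)) / sum(energy)

# Hypothetical parameters: 20 ms response at 48 kHz, shape 2, scale 1 ms.
fs = 48000.0
h = temporally_diffuse_response(n_samples=960, fs=fs, shape=2.0, scale=1e-3)
centroid_ms = 1e3 * energy_centroid(h, fs)
```

Adding `centroid_ms` to the onset time of the reflection gives a delay defined by the energy centroid, which is the quantity the echo threshold delays above refer to.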

Grosse et al. [12] studied the perceptual influence of directional diffusion of either direct sound or reflection by presenting the sound from a single loudspeaker or by presenting mutually uncorrelated versions of similar sounds from nine adjacent loudspeakers. Although no pure specular lead/lag condition was tested in their experiments, there is evidence that the directional diffusion of the reflection might reinforce the echo suppression and for both speech and noise bursts, the echo threshold delay is longer if diffuse reflections are used.

Visentin et al. [13] recently studied the influence of an early reflection on spatial attributes by employing real flat and diffusive wall panels. They reported that the presence of a single diffuse reflection reduces the perceived distance of a frontal speech source, makes it clearer, and increases its intelligibility.

Lokki et al. [14] studied the qualitative influence of the temporal diffusion of reflections and found significant perceptual differences. In contradiction to [12] and [13], they assumed that the clear and open acoustics reported for specular reflections is due to stronger precedence, compared to temporal diffuse reflections that render the sound weak and muddy.

3 Simulating diffuse reflections

A specular reflection occurs when a plane wave is reflected on an infinitely large, smooth, and rigid wall. Snell’s law for the specular reflection states that the angle of reflection is equal to the angle of incidence. The impulse response of such a reflection is described by a Dirac delta distribution δ(t − T r ) with the delay T r calculated from the distance between the corresponding image source and the receiver.

For simulating a diffuse reflection, suppose the point source S in Figure 1 emits a short power impulse at time t = 0, represented by a Dirac delta function δ(t) with unity energy. The sound power dP W (t) reaching the wall element dW by the angle of incidence θ S and distance r S is then defined by,

$$ \mathrm{d}P_W(t)=\frac{\cos\theta_S}{4\pi\, r_S^2}\,\delta(t-t_S)\,\mathrm{d}W, $$(1)

with $r_S=\sqrt{x^2+z^2+d_S^2}$, time t S = r S /c, and speed of sound c. Diffuse reflections are typically interpreted by assuming Lambertian reflection in which the scattered sound power from the element dW is proportional to the cosine of the angle of reflection cos θ R [3]. The intensity dI W reflected by the wall element dW of a perfectly reflective wall without absorption that reaches the receiver R is defined by,

$$ \mathrm{d}I_W(t)=\frac{\cos\theta_S\,\cos\theta_R}{4\pi\, r_S^2\, r_R^2}\,\delta(t-t_{SR})\,\mathrm{d}W, $$(2)

with $r_R=\sqrt{x^2+z^2+d_R^2}$ and t SR = (r S + r R )/c. Assuming the reflected sounds of wall elements dW to be incoherent, the overall intensity I diff reaching the receiver R is obtained by the integration over both dimensions of the wall,

$$ I_{\mathrm{diff}}(t)=\iint_{W} \frac{\cos\theta_S\,\cos\theta_R}{4\pi\, r_S^2\, r_R^2}\,\delta(t-t_{SR})\,\mathrm{d}x\,\mathrm{d}z. $$(3)

Figure 1

Schematic geometry of a reflection on the xz-plane. The basic constellation consists of a source S, a receiver R, and a wall W.

The two cosines can be replaced by distance relations cos θ S = d S /r S and cos θ R = d R /r R , yielding the elliptical integral,

$$ I_{\mathrm{diff}}(t)=\iint_{W} \frac{d_S\, d_R}{4\pi\, r_S^3(x,z)\, r_R^3(x,z)}\,\delta\bigl(t-t_{SR}(x,z)\bigr)\,\mathrm{d}x\,\mathrm{d}z, $$(4)

which we solve numerically.
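One simple way to evaluate Equation (4) numerically is to split the wall into small square patches and accumulate each patch's contribution in the time bin of its arrival time t SR. The sketch below is only an illustration under stated assumptions, not the authors' code: the function name, the speed of sound (343 m/s), the wall size, the patch size, and the time resolution are all hypothetical, and source and receiver foot points are placed at the wall origin as in the printed expressions for r S and r R.

```python
import math

C = 343.0  # speed of sound in m/s (assumed value)

def diffuse_energy_histogram(d_s, d_r, half_size=10.0, patch=0.1, fs=8000.0):
    """Approximate I_diff(t) of Eq. (4) by summing patch contributions.

    Source and receiver foot points both lie at the wall origin, as in the
    printed expressions for r_S and r_R. Returns {time bin index: energy}.
    """
    n = int(round(2.0 * half_size / patch))
    bins = {}
    for i in range(n):
        x = -half_size + (i + 0.5) * patch       # patch centre, x direction
        for j in range(n):
            z = -half_size + (j + 0.5) * patch   # patch centre, z direction
            r_s = math.sqrt(x * x + z * z + d_s * d_s)
            r_r = math.sqrt(x * x + z * z + d_r * d_r)
            weight = d_s * d_r / (4.0 * math.pi * r_s ** 3 * r_r ** 3)
            k = int((r_s + r_r) / C * fs)        # arrival-time bin of patch
            bins[k] = bins.get(k, 0.0) + weight * patch * patch
    return bins

# Hypothetical geometry: source and receiver both 1 m in front of the wall.
hist = diffuse_energy_histogram(d_s=1.0, d_r=1.0)
total = sum(hist.values())
```

For this symmetric geometry the integral over an infinite wall can be evaluated in closed form as 1/(8 d²), so `total` should be close to 0.125 for d = 1 m, which serves as a sanity check. Dividing each bin by the bin width 1/fs turns the accumulated energy into an intensity envelope.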

Figure 2 shows the reflected intensities dI W reaching the receiver R and directionally spreading around the specular direction. The size of the wall evidently influences the amount of reflected energy. The overall intensity I diff for two different wall sizes is depicted in Figure 3 (setup A, cf. Table 1) and represents the temporal spread. The dimensions of the rectangular finite-sized wall are chosen such that approximately 90% of the energy reflected by an infinite wall is retained. For better comparability, intensity envelopes are shifted in time by T d = d/c. In this way, the direct sound of all conditions reaches the receiver at t = 0 ms, whereas corresponding reflections start at ΔT = T r − T d = 20 ms.

Figure 2

Directional spread of diffusely reflected intensities dI W normalized to the maximum and coded in gray-scale. The sound propagation path of a specular reflection is indicated as solid line.

Figure 3

Intensity envelopes of the impulse response of condition A 20 consisting of direct sound and diffuse reflection. The temporal spreads for diffusely reflected intensities I diff for an infinite wall and the finite-sized wall W are normalized to the direct sound. The exponential model proposed in [15] is fitted to the early decay.

Table 1

Setup parameters of the conditions under investigation.

In [15] the temporal structure of diffuse reflections is modeled based on the rough surface theory [16, 17]. The energy envelope of their simulated diffuse reflection is an exponential function $e^{-t/\tau}$ with the decay coefficient τ found by fitting a line to the Schroeder-integrated energy curve corresponding to the measured impulse response. Fitting exponential functions to our simulation with decay coefficients based on the early decay yields an envelope decay of $e^{-t/\tau}$ with τ = ΔT/1.33. Figure 3 shows that the early parts of the reflective impulse responses and the exponential functions overlap closely; the exponential functions comprise about 95% of the overall energy reflected by an infinite wall.

The simulated reflection response of our diffusely-reflecting wall has an instant onset at ΔT followed by an exponential decay. Although its shape complies with other simulations, it is different from measured impulse responses of conventional diffusers. The impulse response of a Schroeder diffuser [18] for example exhibits a damped onset and reaches its peak amplitude several milliseconds later, e.g. [19]. This is because of its specific surface structure, which in addition to the diffusion influences also the delay ΔT as the specular path is not necessarily the shortest. The aim of this study is to reveal perceptual differences between specular and diffuse reflections rather than the simulation of a real-world diffuser, which is why it considers only reflections simulated with a plane wall where delay times ΔT are well defined.

4 Experimental setup

In a constellation of sound source, receiver, and reflective wall the distance traveled by the reflection determines not just the delay ΔT, but also the inverse-square law intensity difference ΔL of the reflection compared to the direct sound. As these parameters are closely linked together, alternatively to the lead-lag delay ΔT, some studies define the echo threshold as level difference ΔL between leading direct sound and lagging specular reflection, e.g. [20, 21]. Thus, a stronger echo suppression is implied by a longer echo threshold delay or a higher echo threshold level.
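The link between delay and level for a specular reflection can be made explicit with a short calculation. The sketch below assumes a point source, a lossless wall, and a speed of sound of 343 m/s; the function name and the 3 m direct path are hypothetical values for illustration, not parameters of the experiment.

```python
import math

def specular_level_difference(direct_path_m, delay_s, c=343.0):
    """Level of a specular reflection relative to the direct sound in dB.

    The reflected path is longer than the direct path by c * delay, and the
    inverse-square law turns the path ratio into 20*log10(direct/reflected).
    """
    reflected_path_m = direct_path_m + c * delay_s
    return 20.0 * math.log10(direct_path_m / reflected_path_m)

# Hypothetical example: 3 m direct path, reflection delayed by 20 ms.
delta_l = specular_level_difference(direct_path_m=3.0, delay_s=0.020)
```

With these example values the reflection arrives roughly 10 dB below the direct sound, which illustrates why delay and level cannot be varied independently in a fixed geometry.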

In contrast to specular reflections, the intensity envelope of a diffuse reflection response obtained by our simulation varies with the setup and thus also with the delay. Therefore, varying the delay alone is not straightforward, which is why the experiment examines the reflection’s level for a fixed delay. The echo threshold level is defined as the minimum level ΔL E of the lagging sound compared to the direct sound at which it is possible to detect a second auditory event. Even after the level of a lagging sound is reduced to the point where an echo is no longer perceptible, the presence of the reflection is still audible. In addition to the echo threshold level, the experiment determines the masked threshold level defined as the minimum level ΔL M at which it is possible to detect that a lagging sound is present at all.

4.1 Stimuli and procedure

The excitation signal for all conditions consisted of 50-ms Gaussian noise bursts with instant onset and offset.

The echo threshold was measured by a method of adjustment [21]. The actual stimulus was looped with 200-ms-long silence between each repetition. The listener was given control over the level of the reflection with the instruction to reduce its level down to the point where it is as faint as possible while still remaining audible as a second auditory event. To ensure that the reflection was audible as an echo at the beginning of each trial, the level of the lag was 10 dB above that of the direct sound. Since the echo suppression requires a buildup time to become fully effective [22], the amplitude of the ongoing sequence (consisting of direct sound and reflection) was linearly faded in by 2 s at the beginning of each trial. This time constant was determined by the authors by informal listening. Control of the level of the lag was enabled after this time.

The masked threshold was measured by a two-alternative forced-choice adaptive procedure. Two intervals were presented consecutively with a pause of 1 s between them. The direct sound of each interval consisted of four noise bursts with 200-ms-long silence between the bursts. In one randomly chosen interval the reflection was added, whereas in the other interval it was absent. The listeners’ task was to specify the interval containing the reflection. Possible loudness cues were removed by roving the level of both intervals, and feedback was given after each response. During a run, the level of the reflection was adjusted with a 3-down 1-up rule [23], which estimates the 79.4% point on the psychometric function. At the beginning of each run, direct sound and reflection were equally loud and the step size was 10 dB. After two reversals the step size was decreased to 5 dB, and after another two reversals it was set to its final value of 2 dB. Using the 2 dB step size, 6 more reversals were obtained, and according to [24] the threshold value was calculated by averaging over all levels obtained with the final step size.
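The adaptive track described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' PureData implementation: the function name, the `respond` callback, and the idealized listener in the example are assumptions, and level roving and interval randomization are left out.

```python
def staircase_3down1up(respond, start_level=0.0,
                       steps=(10.0, 5.0, 2.0), reversals_per_step=(2, 2, 6)):
    """Run a 3-down 1-up track and return the mean level of the final phase.

    respond(level) must return True for a correct answer at that level (dB).
    """
    level = start_level
    n_correct = 0
    direction = 0            # -1: going down, +1: going up, 0: not yet set
    phase = 0                # index into steps
    reversals = 0            # reversals counted in the current phase
    final_levels = []        # all levels presented with the final step size
    while phase < len(steps):
        if phase == len(steps) - 1:
            final_levels.append(level)
        if respond(level):
            n_correct += 1
            if n_correct < 3:
                continue     # need three correct answers in a row to go down
            n_correct = 0
            new_direction = -1
        else:
            n_correct = 0
            new_direction = +1
        if direction != 0 and new_direction != direction:
            reversals += 1
        direction = new_direction
        if reversals >= reversals_per_step[phase]:
            phase += 1       # switch to the next (smaller) step size
            reversals = 0
        if phase < len(steps):
            level += direction * steps[phase]
    return sum(final_levels) / len(final_levels)

# Idealized, hypothetical listener: always correct at or above -30 dB.
threshold = staircase_3down1up(lambda level: level >= -30.0)
```

For the idealized listener the track oscillates around −30 dB in the final phase; a real listener answers probabilistically, in which case the track converges towards the level at which three consecutive correct responses occur with probability 0.5.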

4.2 Conditions

Directional and temporal distribution of diffuse reflections vary with the geometric constellation, i.e., source-to-receiver distance and respective distances to the wall. In the experiment three different setups A, B, C were tested; Figure 4 schematically shows the lateral wall simulated with setups A and B, and the frontal wall of setup C. Respective conditions derived from the setups are listed in Table 1, with the receiver R always facing the point source S at equal height z. Note that the conditions’ subscript numbers indicate the corresponding delay ΔT in ms and the superscripts indicate the scattering coefficient s. To study the influence of the delay ΔT on specular (s = 0) and diffuse (s = 1) reflections while keeping the angle θ 0 = 45° constant, conditions $A_{\Delta T}^0$ and $A_{\Delta T}^1$ with ΔT = (5, 10, 20, 30) ms were tested for both suppression thresholds. The influence of the angle of reflection θ 0 on suppression thresholds was tested by conditions $A^1$, $B^1$, $C^1$ while keeping the delay ΔT = 20 ms constant. Moreover, to study the influence of the scattering coefficient, condition $A_{20}^s$ with s = 0.5 was included in the test. The condition $A_{20}^{\star}$ is a hybrid of specular and diffuse reflection and was used to examine the influence of the directional spread of the diffuse reflection. Compared to the fully diffuse $A_{20}^1$, the diffuseness of condition $A_{20}^{\star}$ considers only the monaural-temporal spread but not the directional spread. Instead, similar to $A_{20}^0$, the reflection was played back only from the (specular) direction θ 0.

Figure 4

Schematic representation of the constellation of the reflective wall for setups that were examined in the experiment. For lateral-wall setups A and B (black) sound paths of direct sound and specular reflection, and specular reflection angle θ 0 and wall angle θ W are indicated. For setup C (gray) both direct sound and reflection are perpendicular to the wall.

For the numerical solution of Equation (3) the size of the wall W was chosen such that approximately 90% of the energy reflected by an infinite wall is taken into account by a uniform grid consisting of square patches w. Computed intensities dI W were encoded into Ambisonics (order N = 17) and decoded on a spherical t-design (degree t = 35) using max-$r_E$ weighting [25], resulting in 632 envelope signals describing the energy distribution of a diffuse reflection. The corresponding impulse responses were obtained by multiplying white Gaussian noise with the square roots of the intensity envelopes. An iterative whitening procedure similar to the Hilbert transform approach described by [26] was applied to each reflection waveform to avoid coloration of the diffusely reflected sound. Finally, impulse responses were normalized to the total energy. Impulse responses of specular reflections were represented by a Dirac delta distribution, and thus specular reflections are an exact copy of the direct sound.
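The step from an intensity envelope to a noise-like impulse response can be sketched for a single direction as below. This is a simplified illustration only: it omits the Ambisonics encoding and decoding as well as the iterative whitening, and the function name, the sampling rate, and the use of a plain exponential envelope with τ = ΔT/1.33 as input are assumptions made for the example.

```python
import math
import random

def shaped_noise_response(intensity_envelope, seed=0):
    """White Gaussian noise shaped by sqrt(intensity), normalized to unit energy."""
    rng = random.Random(seed)
    h = [rng.gauss(0.0, 1.0) * math.sqrt(e) for e in intensity_envelope]
    norm = math.sqrt(sum(x * x for x in h))   # square root of the total energy
    return [x / norm for x in h]

# Hypothetical input: exponential intensity decay with tau = delay / 1.33.
fs = 48000.0
delay_s = 0.020
tau = delay_s / 1.33
envelope = [math.exp(-(n / fs) / tau) for n in range(int(0.1 * fs))]
h = shaped_noise_response(envelope)
```

Taking the square root matters because the envelope describes intensity, whereas the impulse response is an amplitude signal; squaring the shaped noise then reproduces the intended energy decay on average.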

All testing was conducted over headphones. The excitation signal (i.e., the noise burst) was convolved with the respective impulse responses, and binaural stimuli were created by convolving the results with corresponding HRTF measurements of the Neumann KU100 dummy head. Matlab was used for constructing the stimuli. Playback employed PureData, and the output was fed through an M-Audio MobilePre sound card to Beyerdynamic DT 770 headphones. The playback level of the direct sound was fixed at 70 dB(A).

To prevent listener fatigue, the listening experiment was performed in three runs. In the first run all echo conditions $A_{(5,10,20,30)}^1+A_{(5,10,20,30)}^0+A_{20}^{0.5}+B_{20}^1+C_{20}^1+A_{20}^{\star}=12$ were tested twice, yielding 24 adjustment tasks. The second run included masked conditions with ΔT = 20 ms, which were tested once, yielding $A_{20}^0+A_{20}^{0.5}+A_{20}^1+B_{20}^1+C_{20}^1+A_{20}^{\star}=6$ adaptive tasks. The last run consisted of masked conditions for specular and diffuse reflections $A_{(5,10,20,30)}^0+A_{(5,10,20,30)}^1=8$. In this way masked conditions $A_{20}^0$ and $A_{20}^1$ were tested twice, once in the context of the reflection properties (second run), and once in the context of the delay ΔT (third run). Conditions within each run were tested in individual random order.

5 Experimental results

Twelve listeners (age 26–55 years) participated in the experiment. All of them were experienced listeners and reported normal hearing acuity.

5.1 Influence of scattering

Figure 5 shows echo threshold levels ΔL E and masked threshold levels ΔL M examined with setup A as a function of the delay ΔT. Unsurprisingly, masked threshold levels are below echo threshold levels with differences agreeing with literature [21]. Both levels decrease progressively with increasing ΔT with a progression of means resembling those obtained by similar studies, e.g. [27]. The size of corresponding 95% confidence intervals provides evidence that listeners performed similarly for both reflection types of each experiment. One reason for the overall smaller confidence intervals of the echo threshold is that the corresponding conditions were tested twice.

Figure 5

Means and 95% confidence intervals of echo threshold levels (filled symbols) and masked threshold levels (open symbols) for specular reflections (s = 0) and diffuse reflections (s = 1) over delays ΔT tested with setup A.

Statistics using the two-way repeated measures analysis of variance (RM ANOVA) reveal the threshold type, the delay, and the reflection type to be significant parameters (p ≤ 0.05). Interestingly, diffusion weakens the echo suppression and diffuse reflections are more easily detectable. For all conditions the diffuse threshold level is below the corresponding specular threshold level. A Tukey HSD post hoc analysis of both reflection types shows conditions $A_{(20,30)}$ of the echo threshold and all conditions of the masked threshold to be significantly different (p ≤ 0.05) with effect sizes expressed as Cohen’s d ≥ 0.9, i.e., large effects [28].

In addition to specular (s = 0) and diffuse (s = 1) reflections, condition $A_{20}$ was tested with a scattering coefficient of s = 0.5. Respective results are given in Figure 6. For both suppression types, mean levels of $A^{0.5}$ lie between the corresponding means of the extreme conditions $A^0$ and $A^1$, and a two-way RM ANOVA reveals the scattering coefficient to be a significant parameter (p ≤ 0.05). Note that masked threshold levels obtained in the second run are consistently higher than the respective levels from the third run given in Figure 5, although this difference is not significant (p > 0.05).

Figure 6

Results of the echo threshold level (filled symbols) and the masked threshold level (open symbols) given as means and 95% confidence intervals. The influence of the scattering coefficient s is examined with conditions $A^s$, the influence of the angle θ 0 is examined with conditions $A^1$, $B^1$, $C^1$, and the influence of the directional spread is examined with the hybrid condition $A^{\star}$, which combines the temporal characteristics of a diffuse reflection with the directional characteristics of a specular reflection. For all conditions the delay is constant with ΔT = 20 ms.

5.2 Influence of directional spread and spatial separation

The influence of directional spread of the diffuse reflection on the echo suppression is examined with the hybrid condition $A^{\star}$, cf. Figure 6. This condition combines the monaural-temporal characteristics of a diffuse reflection with the binaural-temporal characteristics of a specular reflection. Mean threshold levels are below the mean levels of the corresponding directionally spreading condition $A^1$, and paired-sample t-tests reveal significant differences between $A^{\star}$ and $A^1$ at least for the echo threshold (p ≤ 0.05; Cohen’s d = 0.99, large effect [28]). It thus can be concluded that the temporal and directional spread have opposite effects on the perception; compared to fully diffuse reflections, reflections spreading only temporally but not directionally are more easily detectable and thus weaken the echo suppression. Conversely, this means that the directional diffusion increases the suppression, making reflections more difficult to detect.

For specular reflections, studies have shown that the suppression is higher if direct sound and reflection arise from similar directions than when they are spatially separated [11, 27, 29]. The influence of spatial separation is examined by diffuse conditions $A_{20}$, $B_{20}$, $C_{20}$. Echo and masked threshold levels of condition $B_{20}$ are not different from $A_{20}$ (p > 0.05, Tukey HSD post hoc analysis), suggesting that the influence of spatial separation is not applicable for diffuse reflections that spread directionally and temporally. However, corresponding reflection responses depicted in Figure 7 reveal a higher temporal spread of condition $A_{20}$, which might compensate any effect of spatial separation. In contrast, conditions $B_{20}$ and $C_{20}$ exhibit a similar temporal spread, and significant increases of echo and masked threshold levels are seen for the decreased spatial separation of direct sound and reflection of condition $C_{20}$ (p ≤ 0.05, Tukey HSD post hoc analysis; Cohen’s d ≥ 1.0).

Figure 7

Diffusely reflected intensity I diff arriving at the receiver R of conditions $A^1$, $B^1$, and $C^1$ with the finite-sized wall W, normalized to their maxima and shifted in time by T d .

6 Modeling the results

6.1 Echo threshold

Following the approach of Rakerd et al. [21] a linear model is used to describe the echo threshold level ΔL E as a function of the delay ΔT,

$$ \Delta L_{\mathrm{E}}=\alpha +\beta \cdot \Delta T, $$(5)

with the intercept α in dB and the slope β in dB/ms. Modeling individual echo thresholds with the regression fit given in Equation (5) reveals the differences between the obtained slopes of both reflection types to be significantly different from zero (t-test: p ≤ 0.05). In other words, the slope of the specular regression line is different from the slope of the diffuse regression line. The differences of individual intercepts of specular and diffuse thresholds, on the other hand, are not significant (p > 0.05), and we conjecture that the temporal spread of diffuse reflections results in an increase of the effective delay.

To account for different scattering coefficients s in a single model, we introduce a simple predictor of the effective delay ΔT E replacing ΔT in Equation (5) by,

$$ \Delta T_{\mathrm{E}}=\Delta T\,(1+s\cdot k_{\mathrm{E}}). $$(6)

This yields a temporal alignment of the results with the alignment parameter $k_{\mathrm{E}}$. The optimal alignment parameter is obtained by pooling the data of setup A for s = (0, 0.5, 1) with corresponding effective delays of Equation (6), and we choose the alignment parameter $k_{\mathrm{E}}$ to maximize the coefficient of determination of the model in Equation (5). The optimal parameter is found at $k_{\mathrm{E}}^{\mathrm{opt}}$ = 0.48, yielding α = −9.5 dB and β = −0.47 dB/ms with a coefficient of determination R 2 = 0.99 of the regression with corresponding means (95% confidence intervals: $k_{\mathrm{E}}$ = [0.38, 0.55]; α = [−10.8, −8.1] dB; β = [−0.53, −0.41] dB/ms).
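The parameter search described above can be sketched as a grid search over the alignment parameter combined with an ordinary least-squares fit of Equation (5). The sketch is an assumption about how such a search might look, not the authors' procedure; in particular, the threshold levels below are synthetic values generated from the reported model parameters, not the measured data, and the function names and the grid resolution are hypothetical.

```python
def linear_fit_r2(x, y):
    """Least-squares line y = alpha + beta * x and its R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = my - beta * mx
    ss_res = sum((b - (alpha + beta * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return alpha, beta, 1.0 - ss_res / ss_tot

def best_alignment(delays_ms, scattering, levels_db, k_grid):
    """Return (k, alpha, beta, R^2) for the k that maximizes R^2."""
    best = None
    for k in k_grid:
        # Effective delay of Eq. (6) for every pooled condition.
        eff = [dt * (1.0 + s * k) for dt, s in zip(delays_ms, scattering)]
        alpha, beta, r2 = linear_fit_r2(eff, levels_db)
        if best is None or r2 > best[3]:
            best = (k, alpha, beta, r2)
    return best

# Pooled conditions of setup A: s = 0 and s = 1 at four delays, s = 0.5 at 20 ms.
delays = [5.0, 10.0, 20.0, 30.0, 5.0, 10.0, 20.0, 30.0, 20.0]
scatter = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.5]
# Synthetic levels from the reported model, NOT the measured thresholds.
levels = [-9.5 - 0.47 * dt * (1.0 + s * 0.48) for dt, s in zip(delays, scatter)]

k_grid = [i / 100.0 for i in range(101)]
k_opt, alpha, beta, r2 = best_alignment(delays, scatter, levels, k_grid)
```

Because the synthetic levels follow the model exactly, the search recovers the parameters they were generated with; with measured means the maximum of R² would be flatter, which is consistent with the confidence interval reported for the alignment parameter.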

In [11] the echo threshold of specular and temporally diffuse reflections is examined by a variation of the delay. Their definition of the delay, i.e., the time between the direct sound and the centroid of energy of the reflection response, did not yield any perceptual differences between specular and temporally diffuse reflections for most conditions (5 out of 6) with speech and music signals. Thus, their definition of the delay could serve as model to predict the echo threshold of conditions with arbitrary scattering coefficients.

Delays $\Delta T_{\mathrm{E}}^{\mathrm{ec}}$ based on the centroid of energy of diffuse reflections of setup A are calculated using Equation (6) and we obtain $k_{\mathrm{E}}^{\mathrm{ec}}$ = 0.54. This value falls within the 95% confidence interval of the optimal parameter $k_{\mathrm{E}}^{\mathrm{opt}}$. Figure 8 shows the temporally aligned data with $k_{\mathrm{E}}^{\mathrm{ec}}$ = 0.54 and the corresponding regression line, which highly correlates with mean echo thresholds (R 2 = 0.99, α = −9.6 dB, β = −0.45 dB/ms with 95% confidence intervals α = [−10.7, −8.5] dB, β = [−0.50, −0.41] dB/ms).

Figure 8

Means and 95% confidence intervals of echo threshold levels ΔL E for different scattering coefficients s of setup A. The regression line is fitted with Equation (5) and delays $\Delta T_{\mathrm{E}}^{\mathrm{ec}}$ are calculated from the temporal energy centroid of corresponding reflection responses using Equation (6) with $k_{\mathrm{E}}^{\mathrm{ec}}$ = 0.54.

6.2 Masked threshold

The temporal masking effect can be interpreted as a mixture of simultaneous masking and forward masking. Fastl and Zwicker [30] examined simultaneous masking of noise bursts by uniform masking noise, and found threshold levels that were not more than 10 dB below the masker level. In our experiments thresholds are sharply lower than the masker (i.e., the direct sound), and we assume that masked levels are mainly caused by forward masking. Following temporal masking theory a signal is masked as long as its excitation pattern is below the temporal masking pattern evoked by the masker. According to [30] (p. 83, Fig. 4.22) temporal masking patterns for forward masking of noise bursts do not exhibit an exponential decay, but thresholds can be well approximated by a logarithmic function of the delay time, i.e., ΔL M = γ + ε·ln(ΔT/ms) with ε < 0 for ΔT = 10, …, 100 ms.

The results of our experiments indicate that diffuse reflections lower the masked threshold ΔL M by an offset δL in dB compared to specular reflections, cf. Figure 5. Thus, in order to model masked thresholds of arbitrary reflections, we extend the logarithmic function by a term that scales linearly with the scattering coefficient, yielding,

$$ \Delta L_{\mathrm{M}}=\gamma +s\cdot \delta L+\epsilon \cdot \ln(\Delta T/\mathrm{ms}). $$(7)

Figure 9 shows the forward masking function from [30] together with envelopes of specular and diffuse reflections. For a specular condition, the reflection amplitude exceeds the temporal masking pattern for the first time at the instant offset t = ΔT. For a diffuse condition, on the other hand, the decay of the reflection amplitude is flatter than the logarithmic function given in Equation (7). Accordingly, peaks of the reflection envelope exceed the masking pattern at ΔT M at a lower level, cf. Figure 9. Fitting parameters for specular and diffuse masked levels of conditions $A_{(10,20,30)}$ are calculated using Equation (7) and are γ = 2.4 dB, ε = −12.6 dB, and δL = −7.8 dB (corresponding 95% confidence intervals: γ = [−6.7, 11.5] dB; ε = [−15.7, −9.5] dB; δL = [−14.0, −3.3] dB). This yields a coefficient of determination R 2 = 0.99 with mean masked thresholds.
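To make the fitted model concrete, Equation (7) can be evaluated directly with the reported parameters. The short sketch below only restates the formula; the function and constant names are hypothetical, and the model is meant for the fitted range 10 ms ≤ ΔT ≤ 30 ms.

```python
import math

GAMMA_DB = 2.4      # fitted gamma in dB
EPSILON_DB = -12.6  # fitted epsilon in dB
DELTA_L_DB = -7.8   # fitted offset delta-L in dB for fully diffuse reflections

def masked_threshold_db(delay_ms, s):
    """Masked threshold level of Eq. (7) for delay in ms and scattering s."""
    return GAMMA_DB + s * DELTA_L_DB + EPSILON_DB * math.log(delay_ms)

specular_20 = masked_threshold_db(20.0, 0.0)
diffuse_20 = masked_threshold_db(20.0, 1.0)
```

At ΔT = 20 ms the model predicts a masked threshold of roughly −35 dB for the specular reflection, and the fully diffuse reflection lies δL = −7.8 dB below that for every delay.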

Figure 9

Schematic representation of the forward masking pattern (dashed) from [30] (p. 83, Fig. 4.22) as a function of the delay time log(ΔT) together with nominal amplitudes of specular and diffuse reflection (solid) as a function of time t. The masker (direct sound) ends at t = 0 ms. The masked level is reached if the reflection’s envelope peaks exceed the post-masking pattern. For the specular reflection this is when the reflection ends at ΔT, whereas envelope fluctuations of the diffuse reflection exceed the level of masking in the time range of ΔT M.

Figure 10 shows masked threshold levels of specular and diffuse conditions of setup A with the corresponding logarithmic fit. For ΔT < 10 ms the forward masking function given in Figure 9 saturates. Accordingly, thresholds of conditions $A_5^0$ and $A_5^1$ are overestimated by the logarithmic model of Equation (7).

Figure 10

Means and 95% confidence intervals of masked threshold levels for different scattering coefficients of setup A. The regression lines are fitted in the interval 10 ms ≤ ΔT ≤ 30 ms using Equation (7).
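Since Equation (7) is linear in its three parameters, a fit of this kind can be sketched as an ordinary least-squares problem with the regressors 1, s, and ln(ΔT/ms). The text does not state which fitting procedure was used, so the least-squares approach below is an assumption; the helper names are hypothetical, and the samples are synthetic, noise-free values generated from the reported parameters, not the measured thresholds.

```python
import math


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_equation_7(samples):
    """Least-squares estimate of (gamma, delta_l, epsilon).

    Each sample is a tuple (delay_ms, s, level_db).
    """
    rows = [(1.0, s, math.log(delay_ms)) for delay_ms, s, _ in samples]
    levels = [level for _, _, level in samples]
    # Normal equations: (X^T X) p = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, levels)) for i in range(3)]
    return tuple(solve_3x3(xtx, xty))


# Synthetic, noise-free samples generated from the reported parameters
# (NOT the measured thresholds of the study).
samples = [
    (delay, s, 2.4 + s * -7.8 + -12.6 * math.log(delay))
    for s in (0.0, 1.0)
    for delay in (10.0, 20.0, 30.0)
]
gamma, delta_l, epsilon = fit_equation_7(samples)
print(f"gamma={gamma:.2f} dB, delta_l={delta_l:.2f} dB, epsilon={epsilon:.2f} dB")
```

On noise-free input the sketch simply recovers the parameters used to generate the samples; with real threshold data the same routine would return the best-fitting γ, δL, and ε in the least-squares sense.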

In contrast to our logarithmic model, in [21] a linear fit is used for masked threshold levels with delays 20 ms ≤ ΔT ≤ 80 ms. Similarly, for our results in this range a linear model would be sufficient. For short delays ΔT < 20 ms, however, Figure 10 clearly shows the advantage of the logarithmic approach.

7 Discussion

The results of the experiment indicate that diffusion makes reflections more easily perceivable, as masked threshold levels are below the corresponding levels of specular reflections. Similarly, diffuse reflections weaken the precedence effect, and a lower level is needed to perceive a diffuse echo than a specular echo. This finding agrees with the observations made in [14] but contradicts the findings from [13], where reflections from a Schroeder diffuser were reported to be more focused compared to specular reflections of the same energy. As their diffuser had a higher absorption coefficient in the frequency range 250 Hz–1 kHz, it is conceivable that it altered the spectrum of the scattered speech signal in such a way that it was less distracting.

By removing the spatial spread of our diffuse reflections, the suppression further decreases, and the obtained threshold levels of temporally diffuse reflections are below the corresponding levels of fully diffuse conditions. Thus, we conclude that directional and temporal diffuseness have opposite effects on the echo suppression. This finding resolves the apparent contradiction in the literature. In agreement with [12], directional diffusion increases the suppression. However, if reflections also spread temporally, this effect vanishes, cf. [14].

The effect of spatial separation between direct sound and reflection, comprehensively investigated with specular reflections in [2, 27] and confirmed for temporally diffuse reflections in [11], also applies to fully diffuse reflections: the echo suppression is stronger when direct sound and reflection are close together.

In contrast to our results, in [11] no differences between specular and temporally diffuse reflections were obtained in most conditions. However, their definition of the delay between direct sound and reflection differs from ours, as it is derived from the temporal centroid of energy of the corresponding impulse responses. This finding is used to establish a linear model for the echo threshold that allows predictions for arbitrary reflections by aligning their echo thresholds using the temporal energy centroids of the corresponding reflection responses.
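As an illustration of a centroid-based delay, the sketch below computes the generic temporal energy centroid of a sampled response, i.e., the energy-weighted mean arrival time. It is not a reproduction of Equation (6) or of the weighting used in the model; the function name and the toy responses are hypothetical.

```python
def temporal_energy_centroid_ms(response, sample_rate_hz):
    """Return the energy-weighted mean time of a sampled response in ms."""
    energy = [x * x for x in response]
    total = sum(energy)
    if total == 0.0:
        raise ValueError("response carries no energy")
    weighted = sum(n * e for n, e in enumerate(energy))
    return 1000.0 * weighted / (total * sample_rate_hz)


# Hypothetical toy responses sampled at 1 kHz (one sample per millisecond).
specular = [0.0] * 20 + [1.0]                    # single pulse at 20 ms
diffuse = [0.0] * 20 + [1.0, 0.5, 0.25, 0.125]   # onset at 20 ms plus a decaying tail

print(temporal_energy_centroid_ms(specular, 1000.0))
print(temporal_energy_centroid_ms(diffuse, 1000.0))
```

For the single pulse the centroid coincides with the onset delay, whereas the decaying tail of the second response moves the centroid later than the onset. This is why a delay defined via the energy centroid differs from an onset-based delay for temporally spread reflections.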

8 Conclusions

Diffuse reflections were simulated based on Lambert’s cosine law. The simulated reflections yield a directional and temporal smearing of the reflected sound field, which is influenced by the geometrical setup of sound source, receiver, and reflective wall. The early decay of the temporal spread resembles an exponential function and agrees with other simulations found in the available literature. The echo threshold and the masked threshold were measured in a listening experiment to investigate the influence of a diffuse reflection on the precedence effect. The main findings of the study can be summarized as follows:

  1. The comparison of both reflection types reveals the echo suppression to be weaker for diffuse reflections than for specular reflections of the same total energy. In other words, if the reflected sound is scattered, less level is necessary to hear it as an echo and less level to notice that a reflection is present at all.

  2. The weakening of the precedence effect is mainly due to the temporal diffusion. Spatial diffusion has a reverse effect and increases the suppression. However, as the latter hypothesis is based on the results from a single condition, it will need a more comprehensive validation.

  3. The spatial release from echo suppression was determined for diffuse reflections and echo threshold levels are lower if direct sound and diffuse reflection are directionally separated than when they arise from similar directions.

  4. Temporally aligning the echo threshold levels to energy centroids of corresponding reflection responses yields highly correlated curves and allows the modeling of the echo suppression of arbitrary scattering coefficients.

  5. Masked thresholds of specular reflections exhibit a logarithmic relation between the reflection’s delay and level, which is linearly lowered if the reflection is scattered. This relation can be explained by the temporal masking pattern for forward masking.

Acknowledgments

Our research was partly funded by the Austrian Science Fund (FWF) project no. AR 328-G21, Orchestrating Space by Icosahedral Loudspeaker. We would like to thank all listeners for their participation in the experiments. Moreover, we thank the anonymous, voluntary reviewers, whose critical comments contributed to substantial revisions of our manuscript.


1 http://audiogroup.web.th-koeln.de/ku100nfhrir.html

References

  1. A.D. Brown, G.C. Stecker, D.J. Tollin: The precedence effect in sound localization. Journal of the Association for Research in Otolaryngology 16, 1 (2015) 1–28.
  2. R. Litovsky, B. Shinn-Cunningham: Investigation of the relationship among three common measures of precedence: Fusion, localization dominance, and discrimination suppression. The Journal of the Acoustical Society of America 109, 1 (2001) 346–358.
  3. H. Kuttruff: Room Acoustics. Taylor & Francis, 5th ed., 2009.
  4. R. Guski: Auditory localization: Effects of reflecting surfaces. Perception 19, 6 (1990) 819–830.
  5. B. Rakerd, W.M. Hartmann: Localization of sound in rooms, II: The effects of a single reflecting surface. The Journal of the Acoustical Society of America 78 (1985) 524–533.
  6. B. Rakerd, W.M. Hartmann: Localization of sound in rooms, III: Onset and duration effects. The Journal of the Acoustical Society of America 80, 6 (1986) 1695–1706.
  7. T.J. Cox, B.-I.L. Dalenbäck, P. D’Antonio, J.J. Embrechts, J.Y. Jeon, E. Mommertz, M. Vorländer: A tutorial on scattering and diffusion coefficients for room acoustic surfaces. Acta Acustica United with Acustica 92 (2006) 1–15.
  8. J. Blauert, P.L. Divenyi: Spectral selectivity in binaural contralateral inhibition. Acta Acustica United with Acustica 66, 5 (1988) 267–274.
  9. D.R. Perrott, T. Strybel, C. Manligas: Conditions under which the Haas precedence effect may or may not occur. The Journal of Auditory Research 27 (1987) 59–72.
  10. A. Walther, P. Robinson, O. Santala: Effect of spectral overlap on the echo suppression threshold for single reflection conditions. The Journal of the Acoustical Society of America 134, 2 (2013) EL158–EL164.
  11. P.W. Robinson, A. Walther, C. Faller, J. Braasch: Echo thresholds for reflections from acoustically diffusive architectural surfaces. The Journal of the Acoustical Society of America 134, 4 (2013) 2755.
  12. J. Grosse, S. van de Par, C. Trahiotis: Stimulus coherence influences sound-field localization and fusion/segregation of leading and lagging sounds. The Journal of the Acoustical Society of America 141, 4 (2017) 2673–2680.
  13. C. Visentin, M. Pellegatti, N. Prodi: Effect of a single lateral diffuse reflection on spatial percepts and speech intelligibility. The Journal of the Acoustical Society of America 148, 1 (2020) 122–140.
  14. T. Lokki, J. Pätynen, S. Tervo, S. Siltanen, L. Savioja: Engaging concert hall acoustics is made up of temporal envelope preserving reflections. The Journal of the Acoustical Society of America 129, 6 (2011) EL223–EL228.
  15. S. Siltanen, T. Lokki, S. Tervo, L. Savioja: Modeling incoherent reflections from rough room surfaces with image sources. The Journal of the Acoustical Society of America 131, 6 (2012) 4606–4614.
  16. M.A. Biot: Reflection on a rough surface from an acoustic point source. The Journal of the Acoustical Society of America 29, 11 (1957) 1193–1200.
  17. M.A. Biot: On the reflection of acoustic waves on a rough surface. The Journal of the Acoustical Society of America 30, 5 (1958) 479–480.
  18. M.R. Schroeder: Binaural dissimilarity and optimum ceilings for concert halls: More lateral sound diffusion. The Journal of the Acoustical Society of America 65 (1979) 958–963.
  19. T.J. Cox, P. D’Antonio: Acoustic Absorbers and Diffusers. 2005.
  20. L. Dietsch, W. Kraak: Ein objektives Kriterium zur Erfassung von Echostörungen bei Musik- und Sprachdarbietungen. Acta Acustica United with Acustica 60, 3 (1986) 205–216.
  21. B. Rakerd, W.M. Hartmann, J. Hsu: Echo suppression in the horizontal and median sagittal planes. The Journal of the Acoustical Society of America 107, 2 (2000) 1061–1064.
  22. R.L. Freyman, R.K. Clifton, R.Y. Litovsky: Dynamic processes in the precedence effect. The Journal of the Acoustical Society of America 90, 2 (1991) 874–884.
  23. H. Levitt: Transformed up-down methods in psychoacoustics. The Journal of the Acoustical Society of America 49, 2B (1971) 467–477.
  24. S.A. Klein: Measuring, estimating, and understanding the psychometric function: A commentary. Perception & Psychophysics 63, 8 (2001) 1421–1455.
  25. J. Daniel: Représentation de champs acoustiques, application à la transmission et à la reproduction de scènes sonores complexes dans un contexte multimédia. PhD thesis, Université Paris 6, 2001.
  26. A. Kohlrausch, R. Kortekaas, M. van der Heijden, S. van de Par, A.J. Oxenham, D. Püschel: Detection of tones in low-noise noise: Further evidence for the role of envelope fluctuations. Acta Acustica United with Acustica 83 (1997) 659–669.
  27. H.P. Seraphim: Über die Wahrnehmbarkeit mehrerer Rückwürfe von Sprachschall. Acustica 11, 2 (1961) 80–91.
  28. S.S. Sawilowsky: New effect size rules of thumb. The Journal of Modern Applied Statistical Methods 8, 2 (2009) 597–599.
  29. B.G. Shinn-Cunningham, P.M. Zurek, N.I. Durlach: Adjustment and discrimination measurements of the precedence effect. The Journal of the Acoustical Society of America 93 (1993) 2923–2932.
  30. H. Fastl, E. Zwicker: Psychoacoustics – Facts and Models. 3rd ed., Hugo Fastl, Eberhard Zwicker, 2006.

Cite this article as: Wendt F. & Höldrich R. 2021. Precedence effect for specular and diffuse reflections. Acta Acustica, 5, 1.

