Open Access
Issue
Acta Acust.
Volume 8, 2024
Article Number 9
Number of page(s) 11
Section Virtual Acoustics
DOI https://doi.org/10.1051/aacus/2023070
Published online 16 February 2024

© The Author(s), Published by EDP Sciences, 2024

Licence: Creative Commons. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

Today, it is well known that noise has a negative impact on human health. Thus, the increasing noise pollution, e.g. caused by cars, aircraft or wind turbines, is a growing issue [1]. Typical noise control approaches are based on energy-based metrics like Lden. However, multiple studies show that a reduced sound pressure level does not necessarily lead to a reduction of perceived loudness [2] or annoyance [3, 4].

More suitable approaches to account for human perception are listening experiments or the use of psychoacoustic parameters. In this context, the concept of auralization enables the generation of the required time signals based on purely synthesized data using physics-based models [5]. In contrast to measurements, this allows the comparison of noise scenarios that vary only in specific parameters, e.g. the source characteristics [4, 6] or weather parameters [7]. During the last decades, the real-time operation of auralization frameworks has been an important aspect of virtual acoustics research (see e.g. [8, 9]). One of the main goals is enabling users to interact with the virtual acoustic environment. This interaction can be improved not only by having a mobile listener within the virtual scenario, but also by allowing the user to switch between different simulation conditions [10]. Hence, real-time auralization effectively broadens the diversity of experiments that can be carried out.

A central step of the auralization is the simulation of sound propagation between source and receiver. Based on the simulation result, certain propagation parameters, e.g. propagation delay, spreading loss and air attenuation, are applied to the signal emitted by the source. The closer the simulation model is to reality, the higher the computational complexity. For example, considering the inhomogeneity of the atmosphere using curved sound propagation [10, 11] leads to a significant increase in computation time compared to typical straight-path implementations [12, 13]. This, in turn, can be a bottleneck for real-time processing, especially for dynamic scenarios where the simulations have to be carried out repeatedly. To avoid that, the simulations can be scheduled to a separate thread [10]. However, if the simulation update rate falls below a certain threshold, the plausibility of the auralization result might be reduced. In the worst case, it can even introduce audible artifacts, e.g. sudden “jumps” in gain or Doppler factors.

In this paper, a method for real-time auralization using scheduled sound propagation simulations is introduced. As suggested in [10, 11], the simulation results are upsampled to the audio block rate using interpolation. The method is applied to an aircraft flyover auralization, which previously suffered from artifacts even when using scheduling [10]. The resulting framework allows an artifact-free real-time auralization with quality comparable to the respective offline approach. Generally, the method can be applied to other auralization approaches that are based on independent signal processing of sound paths.

2 Auralization based on geometric sound paths

When auralizing under real-time constraints, a typical approach is blockwise audio processing. This means consecutive samples are grouped into audio blocks of a certain length and processed together, e.g. applying a fast Fourier transform (FFT). In this case, the overall processing time for rendering one audio block may not exceed the block’s duration, as this would lead to audio dropouts. Thus, the principle of geometrical acoustics is typically used for modeling the sound propagation, since the respective algorithms are computationally very efficient. Such methods allow the simulation of geometric sound paths connecting a source with a receiver. Based on these, sound propagation parameters such as spreading loss or air attenuation can be derived, which control the digital signal processing elements used for the auralization. This is done for each sound path separately, as shown in Figure 1.

Figure 1

Sound-path-based auralization approach.

Such an approach is especially efficient if the number of considered sound paths is low. This is the case for scenarios related to environmental noise, such as wind turbine [14], car pass-by [15, 16], or aircraft noise [6, 10, 12, 13, 17]. Here, it is common to model the ground as a flat surface, resulting in two sound paths. Nevertheless, such an auralization approach was also applied to urban scenarios considering higher-order reflections and diffraction, using a significantly higher number of sound paths [18].

The sound propagation parameters typically considered are propagation delay, spreading loss, air attenuation, launch direction at the source, and incident direction at the receiver [6, 10, 12–18]. Depending on the considered scenario, additional parameters can become relevant. Figure 2 shows an example of how a single sound path could be processed for an aircraft flyover scenario. In this case, the influence of turbulence-induced amplitude modulation and a ground reflection filter are also considered. With respect to the audio processing, the spreading loss is applied as a linear gain factor. All frequency-dependent propagation parameters are considered in the form of one-third octave band magnitude spectra with one linear factor per band. These are accumulated and processed using a single filter bank. This is a typical simplification to reduce the computational complexity [10, 12, 13, 15, 18]. Nevertheless, the main focus of this paper is the propagation delay and its effect on the digital audio stream.
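The accumulation of frequency-dependent parameters into a single filter bank can be sketched as follows. This is a minimal illustration only; the function and variable names are hypothetical and not part of the described framework:

```python
import numpy as np

def accumulate_band_attenuations(attenuations_db):
    """Accumulate several one-third octave band attenuation spectra
    (values in dB) and convert them to one linear gain factor per band,
    suitable for a single filter bank (hypothetical helper)."""
    total_db = np.sum(attenuations_db, axis=0)   # attenuations in dB add up
    return 10.0 ** (-total_db / 20.0)            # dB attenuation -> linear gain

# Hypothetical per-band attenuations for three bands
air_attenuation = np.array([0.1, 0.5, 2.0])      # dB
ground_filter = np.array([3.0, 3.0, 3.0])        # dB
band_gains = accumulate_band_attenuations([air_attenuation, ground_filter])
```

Since the dB attenuations simply add up, only one filter bank has to be run per sound path, regardless of how many frequency-dependent effects are considered.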

Figure 2

Digital signal processing approach for a single sound path for aircraft flyover scenarios.

2.1 Propagation delay and Doppler shift

For the auralization of dynamic scenes, a crucial element in the signal processing chain is the variable delay-line (VDL) [13, 16, 19, 20]. In addition to delaying the input signal, this element allows modeling the Doppler effect using the propagation delay τ. For this purpose, the input signal is resampled based on the propagation delay’s rate of change when reading the data. The respective resampling or Doppler factor is [19]

σ = 1 − dτ/dt. (1)

For block processing, this factor depends on the block size B, the sampling rate fS, and the propagation delay difference between two consecutive audio blocks, Δτ:

σ = 1 − Δτ · fS/B. (2)

In the course of this work, the propagation delay is rounded to the next audio sample before being handed to the VDL. The resampling is done using a cubic spline interpolation between the respective audio samples.
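The block-rate Doppler factor of equation (2) and the resampling read of a VDL can be sketched as follows. The helper names are hypothetical; linear interpolation is used in the sketch for brevity, whereas the framework described here uses cubic splines between audio samples:

```python
import numpy as np

FS = 44100      # sampling rate [Hz]
B = 1024        # audio block size [samples]

def doppler_factor(delta_tau, block_size=B, fs=FS):
    """Resampling (Doppler) factor for one audio block, cf. eq. (2)."""
    return 1.0 - delta_tau * fs / block_size

def vdl_read(buffer, start_pos, sigma, block_size=B):
    """Read one block from the delay line, resampling by the factor sigma.
    Linear interpolation between samples is used here for brevity."""
    read_pos = start_pos + sigma * np.arange(block_size)
    return np.interp(read_pos, np.arange(len(buffer)), buffer)

# A decreasing propagation delay (delta_tau < 0) yields sigma > 1,
# i.e. an increase in pitch, and vice versa.
sigma = doppler_factor(-1e-4)
```

Reading with sigma ≠ 1 advances the read position by more or less than one input sample per output sample, which is exactly the pitch shift of the Doppler effect.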

2.2 Integration of simulations into auralization chain

Generally, a static “snapshot” of the scene is assumed while carrying out a single sound propagation simulation. Thus, the simulation must be continuously repeated to consider changes in the scene like source and receiver movement. The sound paths and respective sound propagation parameters are updated accordingly. The update rate of the simulations depends on how the simulation is integrated into the overall processing chain as discussed in the following.

2.2.1 Sequential approach

A straightforward approach is integrating the simulations sequentially into the overall auralization chain as shown in Figure 3a. This means all operations are carried out in the same thread: First, the current scene state including the source and receiver position is evaluated. Based on this, a simulation is carried out before the sound propagation parameters are applied to the raw source signal as shown above (Fig. 2). Nevertheless, this approach is limited, as the update rate of the simulations must be significantly faster than the audio block rate to avoid audio dropouts. This is because the DSP operations already require a significant portion of the available processing time. Such an approach might work for simple simulation models, e.g. assuming straight paths, free-field conditions and a limited number of sources. It fails when considering more complex scenarios, e.g. with a significantly higher number of sound paths or when considering refraction.

Figure 3

Integration of sound propagation simulations in the overall auralization process using a sequential (a) or scheduling (b) approach. Simulations are carried out based on the current scene state including source and receiver pose. The simulation results are fed to the DSP elements responsible for the audio rendering.

Figure 4

Aircraft flyover scenarios used for the auralizations in this paper. Two 15 s trajectory segments with two receiver positions each, leading to a total of four scenarios.

2.2.2 Scheduling

One solution to avoid audio dropouts in those cases is running the simulations in a separate processing thread. For this purpose, a scheduling system managing and filtering the simulation requests from the audio rendering process is required [21, 22]. For example, Wefers et al. used this approach in the context of real-time auralization of room acoustic scenarios [21]. A more generalized framework was introduced by Palenda et al. [22]. The latter is not restricted to room acoustics and can be extended to work with additional simulation methods.

The approach of scheduling sound propagation simulations in an auralization context is sketched in Figure 3b. Here, after the current scene state is evaluated, a simulation request is sent to the scheduler, which lives in a separate thread. Then, the DSP operations are carried out directly without waiting for the simulation results. Once a new simulation result is ready, the respective DSP elements are updated.

This approach leads to an irregular simulation update rate. Furthermore, especially in the case of computationally expensive simulations, the simulation rate might be lower than the audio block rate. As a consequence, the DSP elements are not updated for every audio block. This means subsequent audio blocks are rendered based on the same sound propagation parameters until an update arrives. Then, those parameters can show significant discontinuities, which might lead to audible artifacts in the output signal. This is especially relevant when considering the Doppler effect as discussed in the following section.

3 Artifacts at low simulation update rates

In the context of this work, the problem discussed in Section 2.2.2 was observed while running an aircraft flyover auralization. The underlying sound propagation simulations [10] consider curved propagation in the atmosphere. The calculation of the respective sound paths is done by numerically solving ordinary differential equations. The process of finding the paths connecting source and receiver is not deterministic and therefore requires calculating a significant number of paths. Thus, the simulations are computationally demanding and must be scheduled. As mentioned before, this leads to low, irregular simulation update rates causing artifacts in the output signal. This section discusses how the observed artifacts are traced back to the processing of the Doppler effect and why they occur.

3.1 Aircraft flyover auralization

In this work, the aircraft flyover auralization is carried out using Virtual Acoustics (VA) [23]. This modular framework allows the auralization of various scenarios making respective assumptions regarding the sound propagation. For aircraft flyovers, the direct sound and a ground reflection are considered. The sound paths are processed according to Figure 2. The respective sound propagation is simulated using the Atmospheric Ray Tracing (ART) framework. This considers the inhomogeneity of the atmosphere as well as wind, leading to curved sound paths. The run-time of a single simulation is on the order of 40 ms but depends, e.g., on the source–receiver distance [10]. The simulations are scheduled according to Figure 3b using the ITA Simulation Scheduler [22]. It allows specifying a maximum simulation rate to avoid flooding the scheduler with requests, which would impact its performance. In order to leave enough headroom for the simulations, this rate is set to 10 Hz. However, the average update rate is actually 8.6 Hz, corresponding to a temporal distance of ≈116 ms between simulations. The auralization is performed using a block size of 1024 samples and a sampling rate of 44,100 Hz. This means that a simulation is carried out approximately every 5 audio blocks.
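The relation between the simulation update rate and the audio block rate quoted above can be verified with a quick calculation:

```python
fs = 44100          # sampling rate [Hz]
B = 1024            # audio block size [samples]
update_rate = 8.6   # average simulation update rate [Hz]

block_duration_ms = B / fs * 1e3      # ~23.2 ms per audio block
update_period_ms = 1e3 / update_rate  # ~116 ms between simulation updates
blocks_per_update = update_period_ms / block_duration_ms  # ~5 blocks
```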

The input of the auralization chain is driven by a synthesized aircraft source signal based on a fan [24] and a jet noise model [25]. The synthesis is done offline and the signal is included as a WAV file to introduce minimal computational load. As a simplification, an omnidirectional source directivity is applied. For the spatialization at the receiver, the signal of each sound path is convolved with a head-related transfer function (HRTF) with a length of 256 samples.

In total, four scenarios are auralized. They are based on the test scenario in [10] and only differ with respect to the aircraft trajectory and receiver positions. The original trajectory, representing a take-off procedure, was simulated using the MICADO software [26]. Here, the trajectory is aligned so that [0, 0, 0] m corresponds to the position where the aircraft lifts off the ground. Furthermore, only two 15 s segments of this trajectory are used. For each segment, two receiver poses are chosen so that the aircraft passes from left to right (see Fig. 4). All receivers are placed at an altitude of 1.8 m to emulate a human listener. While the first receiver resides directly below the trajectory, the second is moved 1000 m to the side. This allows the consideration of sound propagation perpendicular to the wind direction, which might affect the performance of the simulation. Furthermore, the receivers of the blue scenarios are placed significantly closer to the trajectory to consider the run-time dependency on the source–receiver distance.

3.2 Evaluation of sound propagation parameters

As discussed above, the auralization results include audible artifacts. In order to find their origin, the sound propagation parameters are logged during the auralization process. Then, the difference between consecutive simulation results, referred to as consecutive variation, is evaluated. A large variation could lead to a perceivable “jump” or artifact in the output signal. Thus, in the following, a threshold is defined for each sound propagation parameter, below which an artifact-free output stream is expected. This is compared to the respective maximum consecutive variation along all considered scenarios. The results are summarized in Table 1.

Table 1

Maximum consecutive variation of sound propagation parameters. The data represents the maximum values over all studied scenarios. Only the air attenuation exceeds the specified threshold, but only at higher frequencies with significant damping (at least 64 dB). For the propagation delay, no threshold is specified.

3.2.1 Spreading loss

As the spreading loss update is applied as a gain factor, the just-noticeable difference (JND) for sound pressure of 1 dB [5, 27] is a reasonable threshold. A larger change of the signal gain might be perceived as a sudden, unrealistic amplitude change. For the investigated scenarios, the maximum consecutive variation of the spreading loss is 0.3 dB; therefore, this parameter should not cause artifacts.

3.2.2 Air attenuation

For the air attenuation update, the same threshold as for the spreading loss (1 dB) is used. Due to its frequency dependency, the evaluation is done separately for each one-third octave band. The resulting consecutive variation is found to exceed the specified threshold. However, such variations only occur at higher frequencies with significant damping (minimum attenuation of 64 dB). Thus, the air attenuation is also not considered to result in audible artifacts for the investigated scenarios.

3.2.3 Launch and incident direction

The launch and incident direction are relevant for applying the source directivity and the spatialization at the receiver. Directivity data is typically available on equiangular, spherical grids. The same holds for head-related transfer functions (HRTFs), which can be used for the spatialization at the receiver. Thus, it makes sense to evaluate launch and incident direction with respect to their elevation Θ and azimuth ϕ. A low simulation update rate could cause a sudden “jump” between directions. Generally, it is desired that the respective directions change smoothly between consecutive simulation updates. In the best case, this change does not exceed the resolution of the respective data set. As 1° × 1° grids can be considered high resolution for both directivity and HRTF data sets, the threshold is set to 1°. As shown in Table 1, this value is not exceeded in any of the considered scenarios.

3.2.4 Propagation delay

The results for the aforementioned propagation parameters suggest that the propagation delay and its processing in the variable delay-line are the origin of the artifacts. Since it is hard to formulate a threshold for this parameter above which artifacts are expected, evaluating its consecutive variation cannot be used to confirm this hypothesis.

Instead, the auralizations are run a second time using a constant propagation delay. In this way, the Doppler effect is not applied to the input signal. As expected, the resulting audio files are artifact-free.

3.2.5 Conclusion

Thus, the perceived artifacts are indeed caused by applying the Doppler effect using the VDL. The reason for those artifacts is discussed in the following section.

3.3 VDL at low update rates

When applying the Doppler effect using a variable delay-line, the input signal is resampled according to equation (2). An example is shown in Figure 5. Here, a pure tone is used as input. It is assumed that the simulation update rate is at least as high as the audio block rate (e.g. using sequential simulations). Thus, propagation delay data is given for every audio block. A propagation delay change Δτ ≠ 0 leads to a frequency shift in the output signal: a decreasing delay leads to an increase in pitch and vice versa. Furthermore, the pitch shift smoothly scales with the respective magnitude |Δτ|. A respective audio example is available online.

Figure 5

Illustration of how a variable delay-line (VDL) applies the Doppler effect if propagation delay data is present for every audio block. Note that the utilized data does not represent an actual scenario and is exaggerated. (a) VDL input signal; (b) Propagation delay; (c) VDL output signal.

In the case of scheduled simulations with low update rates, the propagation delay is only updated every n-th audio block. As illustrated in Figure 6, this has two effects:

  1. For audio blocks without a simulation update (e.g. blocks #3, #4), Δτ is zero, meaning no pitch shift is applied.

  2. Once a simulation update arrives, Δτ refers to the accumulated change over all blocks since the last update. Nevertheless, it is still referenced to the length of a single audio block. Thus, the pitch shift is significantly overestimated (e.g. blocks #2, #7, #9).
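These two effects can be illustrated numerically. The following sketch uses hypothetical values (a constant true per-block delay change) and computes the per-block Doppler factor according to equation (2) when the delay is only updated every n-th block:

```python
import numpy as np

B, fs, n = 1024, 44100, 5   # block size, sampling rate, update every n blocks
dtau = -1e-4                # true per-block delay change [s] (hypothetical)

blocks = np.arange(20)
true_tau = dtau * blocks                 # delay if updated every block
held_tau = true_tau[(blocks // n) * n]   # delay as seen with scheduling

# Doppler factor per block, cf. eq. (2)
sigma = 1.0 - np.diff(held_tau, prepend=held_tau[0]) * fs / B
sigma_correct = 1.0 - dtau * fs / B

# Blocks without an update get sigma = 1 (no pitch shift); blocks with an
# update see the accumulated delta_tau and thus an exaggerated pitch shift.
```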

Figure 6

Illustration of how a variable delay-line (VDL) applies the Doppler effect if propagation delay data is not present for every audio block. Note that the utilized data does not represent an actual scenario and is exaggerated. Updates of the propagation delay are highlighted using red squares. (a) VDL input signal; (b) Propagation delay; (c) VDL output signal.

As a consequence, the signal “jumps” between the original, unchanged signal and a version with an exaggerated change in pitch. To overcome this issue, the missing propagation delay data for the audio blocks without a simulation result can be estimated, e.g. using interpolation.

4 Interpolating results of scheduled simulations

This paper proposes a method to “fill in” the missing sound propagation parameter data, e.g. the propagation delay, for audio blocks without a simulation result. For this purpose, the simulation results are buffered and interpolated instead of being handed directly to the DSP modules.

A problem with interpolating this data in a real-time context is that future samples are not available. The newest data available is from a simulation issued in the past. However, the simulation issue time refers to the time the sound is emitted by the source. In a real scenario, all propagation effects are delayed as the sound requires time to propagate to the receiver. A good example is aircraft noise. Typically, one acoustically locates an aircraft at a significantly different position than visually because the sound emitted by the aircraft takes multiple seconds to reach a listener on the ground, while light propagates the same distance almost instantaneously. Thus, instantly applying the sound propagation parameters resulting from a simulation only makes sense under the assumption that the propagation delay is small compared to the sound propagation parameters’ rate of change. A more accurate approach is to buffer and delay the simulation results based on the respective propagation delay as shown in Figure 7. This also ensures that “future” data is available for the interpolation.

Figure 7

Extension of the auralization scheme with scheduled simulations presented in Figure 3b. Instead of being fed directly to the DSP elements, the simulation results are sent to a buffer and delayed by the respective propagation delay. For every audio block, the data is interpolated before being applied to the DSP elements.

4.1 Underlying assumptions

In order for the presented approach to provide meaningful results, the respective system must fulfill the following assumptions. First, it is assumed that the receiver translation is negligible while the sound propagates. If the receiver position changes significantly, the simulated sound propagation parameters will not match the new position. However, this assumption must generally be made when performing sound propagation simulations, as the simulation input data – including the source and receiver position – is assumed to be constant during the simulation.

Secondly, the simulation run-time must be significantly smaller than the propagation delay. The respective offset should also account for the processing time of buffering and interpolating the data. Only then is it ensured that “future” samples are available for the interpolation.

4.2 Delaying and interpolating propagation parameters

In order to delay the simulation results, a special buffering system, a so-called data history buffer, is employed. As shown in Figure 8, the simulation results are stored together with a time key referring to the time the simulation was issued plus the propagation delay corresponding to this result. The data is stored in ascending order with respect to this key, i.e. a new result is appended at the end of the buffer. As the simulations are issued at irregular time intervals, the buffered results are also irregularly spaced.
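A minimal sketch of such a data history buffer could look as follows. The class and method names are hypothetical; only linear interpolation is shown, and results arriving out of order are sorted in by their time key:

```python
import bisect

class DataHistoryBuffer:
    """Stores simulation results keyed by receiver time
    t_R = t_issue + propagation delay, kept in ascending order (sketch)."""

    def __init__(self):
        self._times, self._values = [], []

    def push(self, t_issue, tau, value):
        t_r = t_issue + tau  # delay the result by its propagation delay
        i = bisect.bisect(self._times, t_r)
        self._times.insert(i, t_r)
        self._values.insert(i, value)

    def get_linear(self, t_now):
        """Linearly interpolate the buffered parameter at time t_now."""
        i = bisect.bisect(self._times, t_now)
        if i == 0:
            return self._values[0]
        if i == len(self._times):
            return self._values[-1]
        t0, t1 = self._times[i - 1], self._times[i]
        w = (t_now - t0) / (t1 - t0)
        return (1 - w) * self._values[i - 1] + w * self._values[i]
```

For every audio block, the auralization thread would then call `get_linear` (or an equivalent cubic spline variant) with the current block time.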

Figure 8

Diagram of a data history buffer used to delay and interpolate simulation results. The simulation i is issued at source time tS,i. The resulting propagation parameter is stored within the buffer using the receiver time tR,i, i.e. it is delayed by the respective propagation delay τi. On the other hand, the auralization thread requests a parameter value at tnow, referring to the current audio block (real-time); this value is determined using the specified interpolation method.

As discussed above, the number and type of sound propagation parameters belonging to one simulation result depend on the considered simulation method. Thus, one data history buffer is used for each sound propagation parameter. This modular approach makes it easier to adapt the system if another set of parameters is to be considered. Furthermore, it allows using different interpolation methods for the respective parameters.

Now, for every audio block, the DSP elements request data from the history buffer corresponding to their propagation parameter. The data is estimated by interpolation at the time of the current audio block. In this context, it must be ensured that a simulation result is not written into the history buffer while the DSP elements try to read the data. This can be achieved by intermediately storing the results in a thread-safe concurrent queue. This data is transferred to the actual buffer just before executing the interpolation.
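The hand-over between the simulation thread and the audio thread described above can be sketched with a standard thread-safe queue. The function names are hypothetical:

```python
import queue

pending_results = queue.Queue()  # written to by the simulation thread

def on_simulation_finished(result):
    """Called from the scheduler/simulation thread (thread-safe put)."""
    pending_results.put(result)

def transfer_pending(history):
    """Called in the audio thread just before interpolating: move all
    pending simulation results into the actual history buffer."""
    while True:
        try:
            history.append(pending_results.get_nowait())
        except queue.Empty:
            break
```

Because the audio thread alone touches the actual buffer, no lock is held during interpolation itself.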

4.3 Considered interpolation methods

In the course of this work, the following interpolation methods are considered: sample-and-hold, nearest-neighbor, linear and natural cubic spline.

Sample-and-hold and nearest-neighbor form a special group of interpolation methods, as they simply copy the data from an existing sample instead of creating a smooth estimate between consecutive samples. While this is extremely efficient, it does not help in the case of low simulation update rates, which lead to artifacts (see Sect. 3). In fact, sample-and-hold is the default behavior of previous approaches to avoid dropouts (see Fig. 3); the only difference here is that the simulation results are delayed. However, such interpolation methods can be sufficient if, at the given simulation update rate, the variation of the sound propagation parameters does not lead to audible artifacts.

In contrast, the linear and cubic spline interpolation methods estimate “new” data when evaluating between given samples. Thus, they allow a continuous change of the data for all audio blocks between consecutive simulation results (in contrast to Fig. 6). The main differences are the number of considered samples, the smoothness of the output and the computational complexity. The linear interpolation only considers two samples, including one future sample. It provides C⁰-continuity, meaning the output data is continuous while its derivative is discontinuous but constant within one interval. When processing the Doppler effect, this leads to a constant Doppler factor between two simulation results. This could be audible if the simulation update rate is too low. On the other hand, the cubic spline method provides C²-continuity. Therefore, its output is sufficiently smooth for small curvatures [28], allowing a smooth transition between Doppler factors. This comes at the cost of increased computational complexity. Furthermore, this method requires four input samples, of which two are future samples. Consequently, it requires at least two simulations to be run within the time the sound propagates from source to receiver (see Sect. 4.1).
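The piecewise-constant Doppler factor produced by linear interpolation can be demonstrated directly. The simulation times and delay values below are hypothetical:

```python
import numpy as np

# Hypothetical irregular simulation times and buffered propagation delays
t_sim = np.array([0.0, 0.5, 1.0, 1.5])     # [s]
tau = np.array([3.00, 2.96, 2.93, 2.91])   # [s]

t_blocks = np.arange(0.0, 1.0, 1024 / 44100)  # audio-block times [s]
tau_lin = np.interp(t_blocks, t_sim, tau)     # linear interpolation

# With linear interpolation (C0-continuity) the Doppler factor is
# piecewise constant: it only changes when a new simulation result
# is crossed, not from block to block.
sigma = 1.0 - np.diff(tau_lin) / np.diff(t_blocks)
```

A natural cubic spline evaluated at the same block times would instead yield a Doppler factor that varies smoothly from block to block.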

Generally, the presented approach is not restricted to the given interpolation methods. Which method is most suitable depends on the respective sound propagation parameter and on the simulation update rate compared to the audio block rate. Furthermore, it is relevant which DSP modules are used to process the data. For example, a variable delay-line requires new propagation delay data for every audio block, while updating the signal gain based on the spreading loss every few audio blocks might be sufficient. An investigation as done in Section 3.2 can help predict the requirements for the utilized interpolation methods. Ultimately, those methods must be tested in a real-time context. In the following, such an investigation is done for the aircraft flyover auralization discussed in Section 3.1.

5 Performance in real-time aircraft flyover auralization

In this section, the performance of the presented method is evaluated for the aircraft flyover auralization setup discussed in Section 3.1. For this purpose, the respective auralization approach is adjusted according to Figure 7. All sound propagation parameters are delayed using the introduced data history buffers. As the investigation in Section 3.2 revealed, the propagation delay is the only parameter expected to cause artifacts. Thus, only Scenario 4 is considered here, which showed the highest variation of the propagation delay. Furthermore, all other parameters are interpolated using the sample-and-hold approach. The auralizations are carried out multiple times to test different interpolation methods for the propagation delay: sample-and-hold, nearest-neighbor, linear and cubic spline. The characteristics of the laptop computer used to run the auralizations are summarized in Table 2. Additionally, an offline auralization is carried out serving as a baseline.

Table 2

Summary of the computer used to run the auralizations described in this paper.

Since the origin of the artifacts was traced to processing the Doppler shift, a 1 kHz sine wave is used as input signal. Applying the Doppler shift to a pure tone can be considered a worst case, since potential artifacts are not masked by the content of other frequencies, as would be the case for an aircraft signal containing a significant broadband noise component. In order to properly evaluate the results, all frequency-shaping processing steps are neutralized. This means the processing steps are still carried out without changing the computational load, but the respective effects are not applied to the input signal. Regarding the sound propagation, only spreading loss, propagation delay and Doppler shift are considered in order to avoid influences from other effects such as the ground reflection or HRTF filtering. As only frequency-independent effects are considered, evaluating a single base frequency is reasonable.

For the given auralization system, both assumptions discussed in Section 4.1 are valid: Due to the large source–receiver distance, the propagation delay is orders of magnitude larger than the time corresponding to the simulation update rate (multiple seconds vs. ≈116 ms). For the same reason, the receiver translation is negligible compared to the fast-traveling aircraft.

The goal of this investigation is to find the most suitable interpolation method for the propagation delay: this method should produce an artifact-free output stream while introducing the lowest possible computational load. Furthermore, the plausibility of the real-time approach should be similar to that of the offline auralization. For this purpose, the output stream of each auralization is recorded. The results are checked for artifacts in a subjective assessment by the authors. Furthermore, an objective analysis based on the psychoacoustic parameter specific loudness [29] is done. In addition, the run-time of the data history buffer using different interpolation methods is compared. Note that the results regarding the nearest-neighbor approach are not discussed in detail, as they are very similar to the sample-and-hold results.

5.1 Interpolation method run-time

A run-time comparison of the data history buffer using different interpolation methods is carried out. Each measurement is repeated one million times and the results are averaged. In order to evaluate the worst-case scenario, the measured run-time includes the interpolation and the process of copying a new simulation result from the concurrent queue to the actual buffer. As shown in Table 3, the run-time increases with the complexity of the interpolation method. While the linear interpolation does not introduce a significant computational overhead compared to the sample-and-hold method, the cubic spline approach is significantly slower (factor >10).

Table 3

Run-time of the data history buffers for interpolating the propagation delay comparing different methods. Additionally, the data is set in relation to the available time to process an audio block (1024 samples at 44.1 kHz).

Nevertheless, compared to the time available to process one audio block, the run-time appears insignificant. Thus, for the given setup, the computational performance of the utilized interpolation methods is secondary compared to the resulting audio quality. However, the run-time could become relevant for more demanding auralization setups, e.g. with shorter block sizes or additional sources.

5.2 Output audio quality

A subjective assessment of the resulting auralization files with respect to audio quality was done by the authors. The main focus is on evaluating whether the artifacts discussed in Section 3 are still present when interpolating the propagation delay.

As discussed in Section 4.3, the sample-and-hold and nearest-neighbor approaches are not able to remove those artifacts. This is expected, since these interpolation methods generate a discontinuous output, leading to the problem with the variable delay-line discussed in Section 3.3. This means that for some audio blocks no Doppler shift is applied, while for others the effect is exaggerated. This is perceived as jitter of the sine signal's frequency.
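
This mechanism can be reproduced numerically: a variable delay line reads its input at position t − τ(t), so the instantaneous Doppler factor is 1 − dτ/dt. With a block-wise constant (sample-and-hold) delay, the factor is 1 inside a block (no shift) and spikes at every update. The sketch below uses the ≈116 ms update interval from the paper but an otherwise made-up, linearly shrinking delay:

```python
import numpy as np

fs = 44_100                       # audio sampling rate (Hz)
update = 5120                     # samples between simulation updates (~116 ms)
t = np.arange(40 * 1024) / fs     # 40 audio blocks of 1024 samples

# Made-up ground truth: the propagation delay shrinks at 2 ms/s
# (approaching source), so the ideal Doppler factor 1 - d(tau)/dt
# is a constant 1.002.
tau_true = 3.0 - 0.002 * t
t_coarse, tau_coarse = t[::update], tau_true[::update]

def upsample(mode):
    """Per-sample delay reconstructed from the coarse simulation results."""
    if mode == "hold":            # sample-and-hold
        idx = np.searchsorted(t_coarse, t, side="right") - 1
        return tau_coarse[idx]
    return np.interp(t, t_coarse, tau_coarse)  # linear

def doppler_factor(tau):
    # A variable delay line reads at position t - tau(t); its time
    # derivative, 1 - dtau/dt, is the instantaneous Doppler factor.
    return 1.0 - np.gradient(tau, 1.0 / fs)

valid = t < t_coarse[-1]          # ignore the flat extrapolated tail
d_hold = doppler_factor(upsample("hold"))[valid]
d_lin = doppler_factor(upsample("linear"))[valid]

# Sample-and-hold: factor 1.0 inside a block (pitch snaps back to the
# source frequency), with a large spike at every update -> audible jitter.
print(f"hold:   median {np.median(d_hold):.3f}, max {d_hold.max():.1f}")
# Linear: the factor stays at the correct 1.002 throughout.
print(f"linear: median {np.median(d_lin):.4f}, max {d_lin.max():.4f}")
```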

On the other hand, linear and cubic spline interpolation lead to an artifact-free auralization. The authors could not perceive any difference in auralization quality between these two methods. Comparing these results to the offline auralization, a minor pitch difference can be perceived, e.g. when playing only the first second of each signal directly after one another. Here, interpolating the propagation delay introduces a low-pass filter effect, which affects the Doppler factor and therefore the pitch. However, when listening to the full tracks, this pitch difference is not noticeable.

5.3 Psychoacoustic evaluation

To back up these findings, an additional psychoacoustic analysis is carried out. For this purpose, the specific loudness based on the Zwicker method [29] is calculated for each auralization using 0.2 s time segments. The results are shown in Figure 9.

Figure 9

Specific loudness of the auralization result for Scenario 4 using a 1 kHz sinusoidal signal as input. Only the direct sound and the sound propagation effects spreading loss and Doppler effect are considered during the auralization. Furthermore, no spatialization is applied (e.g. convolution with HRTFs). For the real-time results, different interpolation methods are applied to the propagation delay. Note that the higher bark bands (18–24) are not shown, since they do not contain significant energy. (a) Offline auralization; (b) Real-time/linear interpolation; (c) Real-time/cubic spline interpolation; (d) Real-time/sample-and-hold interpolation.

As can be seen in the data for the offline auralization, the Doppler shift leads to a decreasing pitch over time. This is well reproduced by the real-time auralization using linear or cubic spline interpolation for the propagation delay. Visually, the plots are almost indistinguishable from the offline auralization. However, looking closely at the section around 9 s, it can be seen that the transition to the lower bark band is smoother for the real-time signals. This is again caused by the low-pass effect of the propagation delay interpolation. As discussed in Section 5.2, this was not noticed when listening to the signals. Hence, it is not believed to reduce the plausibility of the results. Comparing the data based on linear and cubic spline interpolation, no significant difference is observed.
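
For orientation, the position of the tone on the Bark scale can be computed with Zwicker's well-known analytic approximation of the critical-band rate (a standard formula from the psychoacoustics literature, not given in the paper): the 1 kHz source tone sits at roughly 8.5 Bark, near the upper edge of its band, so even a moderate downward Doppler shift moves it into a lower band:

```python
import math

def critical_band_rate(f_hz: float) -> float:
    """Zwicker's analytic approximation of the critical-band rate (Bark)."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))

# The 1 kHz source tone lies at ~8.5 Bark; a Doppler-shifted tone at
# e.g. 900 Hz already falls below 8 Bark, i.e. into a lower band.
for f in (1000.0, 900.0):
    print(f"{f:6.0f} Hz -> {critical_band_rate(f):.2f} Bark")
```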

For the sample-and-hold approach, the “jittering” of the output signal’s frequency is clearly visible: the respective data features rapid back-and-forth modulation between the source signal frequency (1 kHz) and a significantly lower frequency with exaggerated Doppler shift.

To better quantify the differences, the (frequency-averaged) loudness is presented in Figure 10. The results show a very good match between the offline reference and the approaches with linear or cubic spline interpolation, with maximum relative deviations of 5.4% and 5.9%, respectively. To put this in context, a relative difference in loudness of 7% is equivalent to the commonly used just-noticeable level difference of 1 dB and should therefore not be noticeable. On the other hand, a significant deviation of up to 55% is observed for the sample-and-hold approach. This is caused by the energy being distributed into different bark bands which, in accordance with human perception, have different weightings in the loudness model.
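
The ~7% figure follows from the common rule of thumb that loudness doubles per 10 dB increase in loudness level (valid above roughly 40 phon). A quick numerical check, using the deviation values quoted above:

```python
# Loudness (in sone) roughly doubles per 10 dB increase in loudness
# level above ~40 phon, i.e. N ~ 2 ** ((L - 40) / 10). A 1 dB change
# therefore corresponds to a relative loudness change of 2**(1/10) - 1.
jnd = 2 ** (1 / 10) - 1
print(f"1 dB level JND ~ {jnd:.1%} relative loudness change")

# Maximum relative deviations from the offline reference (see text):
deviations = {"linear": 0.054, "cubic spline": 0.059,
              "sample-and-hold": 0.55}
for method, dev in deviations.items():
    verdict = "above" if dev > jnd else "below"
    print(f"{method:16s} {dev:5.1%} -> {verdict} the ~7% criterion")
```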

Figure 10

(Frequency-averaged) loudness corresponding to the results in Figure 9.

5.4 Conclusion

The investigation showed that interpolating the results of scheduled sound propagation simulations for real-time auralization allows eliminating artifacts caused by low simulation update rates. Within the tested aircraft auralization setup, sample-and-hold interpolation is sufficient for most sound propagation parameters. The only exception is the propagation delay. For this parameter, linear and cubic spline interpolation lead to plausible results comparable to an offline auralization. Since linear interpolation introduces a lower computational load and requires only one "future" simulation result, it is the recommended method. However, if the simulation update rate were further reduced, linear interpolation might no longer be sufficient, as it leads to a constant Doppler factor between consecutive simulation results. This could become relevant when, e.g., adding additional sources to the auralization. Similarly, the sample-and-hold interpolation of the other sound propagation parameters, such as spreading loss, might no longer be sufficient once the consecutive differences discussed in Section 3.2 become too large.
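
The constant-Doppler-factor limitation of linear interpolation can be made concrete with a short sketch (made-up quadratic delay history; SciPy's CubicSpline stands in for the spline, which need not match the paper's implementation). For a curved delay trajectory, the derivative of the piecewise-linear delay, and hence the Doppler factor, is constant between updates and jumps at each new simulation result, while the spline-based factor varies from sample to sample:

```python
import numpy as np
from scipy.interpolate import CubicSpline

fs = 44_100
update = 5120                        # ~116 ms simulation update interval
t = np.arange(20 * update) / fs

# Made-up curved pass-by: quadratic propagation delay, i.e. the true
# Doppler factor 1 - d(tau)/dt changes linearly over time.
tau_true = 2.0 - 0.004 * t + 0.001 * t**2
t_c, tau_c = t[::update], tau_true[::update]

def factor(tau):
    """Instantaneous Doppler factor of a variable delay line."""
    return 1.0 - np.gradient(tau, 1.0 / fs)

valid = (t > t_c[0]) & (t < t_c[-1])
f_lin = factor(np.interp(t, t_c, tau_c))[valid]
f_spl = factor(CubicSpline(t_c, tau_c)(t))[valid]

# Linear: essentially one factor value per update interval (a staircase);
# cubic spline: the factor varies smoothly from sample to sample.
print("distinct factor values, linear:", np.unique(np.round(f_lin, 6)).size)
print("distinct factor values, spline:", np.unique(np.round(f_spl, 6)).size)
```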

6 Summary

A method was proposed that allows integrating computationally complex sound propagation simulations into real-time auralizations of dynamic scenarios. It is designed for auralization approaches that process the sound propagation parameters, such as propagation delay and spreading loss, for each sound path separately. If the simulation run-time exceeds the time available to render a single audio block, sequentially processing the simulations leads to audio dropouts. While this can be prevented by scheduling the simulations in a separate thread, the simulation update rate then typically falls significantly below the audio block rate. The proposed method interpolates the simulation results in order to upsample them. This is done using so-called data history buffers, which delay the simulation results based on the respective propagation delay. As a side-effect, this solves the problem of simulations being issued at the time the source emits the sound but being perceived by the receiver only after the sound propagation.
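
A minimal single-threaded sketch of such a data history buffer (illustrative Python; the actual implementation in VA is C++ and receives results via a concurrent queue) could look as follows:

```python
import bisect

class DataHistoryBuffer:
    """Stores simulation results at receiver time and interpolates
    them linearly when the audio thread asks for the current value."""

    def __init__(self):
        self._times = []   # receiver times t_R,i = t_S,i + tau_i
        self._values = []  # stored propagation parameter (here: tau_i)

    def push(self, t_source, tau):
        # Delay the result by its own propagation delay: it becomes
        # valid once the sound actually reaches the receiver.
        t_receiver = t_source + tau
        i = bisect.bisect(self._times, t_receiver)
        self._times.insert(i, t_receiver)
        self._values.insert(i, tau)

    def get(self, t_now):
        # Linear interpolation between the two results bracketing t_now;
        # clamp at the ends if no bracketing pair exists yet.
        i = bisect.bisect(self._times, t_now)
        if i == 0:
            return self._values[0]
        if i == len(self._times):
            return self._values[-1]
        t0, t1 = self._times[i - 1], self._times[i]
        v0, v1 = self._values[i - 1], self._values[i]
        return v0 + (v1 - v0) * (t_now - t0) / (t1 - t0)

buf = DataHistoryBuffer()
buf.push(0.0, 3.000)     # issued at source time 0 s, delay 3 s
buf.push(0.116, 2.990)   # next update, slightly shorter path
print(buf.get(3.05))     # query between t_R = 3.0 s and t_R = 3.106 s
```

Other propagation parameters would be stored in further buffers of the same kind, with `get` swapped to the desired interpolation method (sample-and-hold, linear, or spline).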

The method was applied to an aircraft auralization approach considering curved sound propagation in an inhomogeneous, moving atmosphere. Here, most propagation parameters are interpolated using the sample-and-hold approach, while the propagation delay is interpolated using the linear and cubic spline methods. It is concluded that for the tested system, the real-time auralization renders audio signals with a plausibility similar to that of the offline approach.

Generally, the introduced method could be applied to approaches beyond the discussed aircraft auralization, e.g. urban scenarios considering interaction with buildings and therefore dealing with higher numbers of sound paths. For this purpose, future work could address a more generalized study to develop decision aids for the interpolation strategies of the considered propagation parameters. Which interpolation method is required should depend on the respective consecutive differences (see Sect. 3.2). The study in this paper can be used as a guideline and provides theoretical thresholds for most propagation parameters. Regarding the propagation delay and Doppler effect, criteria for the perceptibility of frequency shifts could be applied to the respective Doppler factors. Similar to the approach in this work, a comparison of real-time to offline auralization data is recommended. The major challenge for such a generalized study is that the consecutive differences have a series of dependencies: the simulation update rate, the relative movement between source and receiver, the utilized signal processing elements and, most importantly, the underlying simulation method.

Furthermore, it should be noted that this method is not restricted to real-time auralization. The introduced data history buffers can also be applied in an offline auralization where the simulations are run sequentially. This allows properly delaying the sound propagation parameters to match the receiver time.

Typical environmental noise scenarios consist of a series of features and sound sources. Here, it is reasonable to summarize many of those features as "background noise", e.g. using ambisonics-encoded signals. Nevertheless, the most prominent sound sources should be rendered individually to ensure maximum quality. Thus, future work could investigate how the presented method behaves in more complex scenarios with more than a single source.

Conflict of interest

The authors declare that they have no conflicts of interest in relation to this article.

Data availability statement

In the course of this work, the presented method for interpolating scheduled simulation results was implemented as part of the auralization framework Virtual Acoustics (VA) [23] and has been available since version v2023b at https://www.virtualacoustics.org/VA/download/. The respective code is open-source and available as a git repository [30].

Supplementary material

The supplementary material of this paper includes the logged sound propagation parameters used for the evaluation in Section 3.2. Furthermore, a series of sound files supporting the discussions in Sections 3.2, 3.3 and 5.2 is provided. The signals for Section 5.2 are also the basis for the plots in Figures 9 and 10. Additionally, exemplary real-time auralizations of aircraft noise based on the presented method for interpolating simulation results are provided (linear and cubic spline).


1

This is not to be confused with FIR-based auralization, where the sound propagation effects of all sound paths are accumulated within a single, long finite impulse response (FIR) and applied using a single convolution.

2

When applying an actual source directivity, this requires the source synthesis model to provide the respective data for the full-sphere. The angle-dependent directivity is selected based on the launch direction and the source orientation. The main assumption is that, within the source-related coordinate system, the full-sphere directivity is constant. For time-variant directivities, the full-sphere directivity must be determined for each issued propagation simulation and stored in a data history buffer.

References

  1. World Health Organization (WHO): Environmental noise guidelines for the European Region, 2018. Available at https://iris.who.int/handle/10665/279952.
  2. H. Fastl, J. Hunecke: Psychoacoustic experiments on the aircraft malus. In: W. Arnold, S. Hirsekorn, Eds., 21st Annual German Congress on Acoustics, Saarbrücken, Germany. Deutsche Gesellschaft für Akustik e.V., Oldenburg, Germany, 1995, pp. 407–410. Available at https://pub.dega-akustik.de/DAGA_1991-1995.zip.
  3. G. Brambilla, L. Maffei: Perspective of the soundscape approach as a tool for urban space design. Noise Control Engineering Journal 58, 5 (2010) 532.
  4. S.A. Rizzi, A. Christian: A psychoacoustic evaluation of noise signatures from advanced civil transport aircraft. In: 22nd AIAA/CEAS Aeroacoustics Conference, Lyon, France, 30 May–1 June 2016. American Institute of Aeronautics and Astronautics.
  5. M. Vorländer: Auralization: fundamentals of acoustics, modelling, simulation, algorithms and acoustic virtual reality, 2nd edn., RWTHedition, Springer International Publishing, Cham, 2020.
  6. R. Pieren, L. Bertsch, D. Lauper, B. Schäffer: Improving future low-noise aircraft technologies using experimental perception-based evaluation of synthetic flyovers. Science of the Total Environment 692 (2019) 68–81.
  7. C. Dreier, M. Vorländer: Aircraft noise – auralization-based assessment of weather-dependent effects on loudness and sharpness. Journal of the Acoustical Society of America 149, 5 (2021) 3565–3575.
  8. L. Savioja, J. Huopaniemi, T. Lokki, R. Väänänen: Creating interactive virtual acoustic environments. Journal of the Audio Engineering Society 47, 9 (1999) 675–705.
  9. N. Tsingos, E. Gallo, G. Drettakis: Perceptual audio rendering of complex virtual environments. ACM Transactions on Graphics 23, 3 (2004) 249–258.
  10. P. Schäfer, M. Vorländer: Atmospheric ray tracing: an efficient, open-source framework for finding eigenrays in a stratified, moving medium. Acta Acustica 5 (2021) 26.
  11. M. Arntzen: Aircraft noise calculation and synthesis in a non-standard atmosphere. PhD thesis, Delft University of Technology, 2014. Available at http://resolver.tudelft.nl/uuid:c56e213c-82db-423d-a5bd-503554653413.
  12. S. Rizzi, B. Sullivan: Synthesis of virtual environments for aircraft community noise impact studies. In: 11th AIAA/CEAS Aeroacoustics Conference, Monterey, California, 23–25 May 2005. American Institute of Aeronautics and Astronautics.
  13. A. Sahai, F. Wefers, S. Pick, E. Stumpf, M. Vorländer, T. Kuhlen: Interactive simulation of aircraft noise in aural and visual virtual environments. Applied Acoustics 101 (2016) 24–38.
  14. K. Heutschi, R. Pieren, M. Müller, M. Manyoky, U.W. Hayek, K. Eggenschwiler: Auralization of wind turbine noise: propagation filtering and vegetation noise synthesis. Acta Acustica united with Acustica 100, 1 (2014) 13–24.
  15. J. Forssén, T. Kaczmarek, J. Alvarsson, P. Lundén, M.E. Nilsson: Auralization of traffic noise within the LISTEN project – preliminary results for passenger car pass-by. In: Proceedings of the European Conference on Noise Control (Euronoise), Edinburgh, Scotland, 2009.
  16. R. Pieren, T. Bütler, K. Heutschi: Auralization of accelerating passenger cars using spectral modeling synthesis. Applied Sciences 6, 1 (2015) 5.
  17. M. Arntzen, S.A. Rizzi, H.G. Visser, D.G. Simons: Framework for simulating aircraft flyover noise through nonstandard atmospheres. Journal of Aircraft 51, 3 (2014) 956–966.
  18. J. Stienen: Real-time auralisation of outdoor sound propagation. Logos Verlag, Berlin, 2023. Available at https://www.logos-verlag.de/cgi-bin/engbuchmid?isbn=5629&lng=eng&id=.
  19. J.O. Smith, S. Serafin, J. Abel, D. Berners: Doppler simulation and the Leslie. In: Proceedings of the 5th International Conference on Digital Audio Effects (DAFx-02), Hamburg, Germany, September 26–28, 2002.
  20. J. Stienen, M. Vorländer: Real-time auralization of propagation paths with reflection, diffraction and the Doppler shift. In: B. Seeber, Ed., 44th Annual German Congress on Acoustics, Munich, Germany. Deutsche Gesellschaft für Akustik e.V. (DEGA), Berlin, Germany, 2018, pp. 1302–1305. Available at https://pub.dega-akustik.de/DAGA_2018.
  21. F. Wefers, J. Stienen, S. Pelzer, M. Vorländer: Interactive acoustic virtual environments using distributed room acoustic simulations. In: S. Weinzierl, M. Vorländer, F. Zotter, H.-J. Maempel, A. Lindau, Eds., EAA Joint Symposium on Auralization and Ambisonics, Berlin, Germany. Universitätsverlag der TU Berlin, 2014, pp. 48–55.
  22. P. Palenda, P. Schäfer, J. Stienen, M. Vorländer: Open-source simulation scheduling framework for real-time auralization. In: P. Leistner, Ed., 48th Annual German Congress on Acoustics, Stuttgart, Germany. Deutsche Gesellschaft für Akustik e.V. (DEGA), Berlin, Germany, 2022, pp. 1451–1454. Available at https://pub.dega-akustik.de/DAGA_2022.
  23. Institute for Hearing Technology and Acoustics: Virtual Acoustics – a real-time auralization framework for scientific research, 2021. Available at http://virtualacoustics.org/VA.
  24. C. Dreier, M. Vorländer: Sound source modelling by nonnegative matrix factorization for virtual reality applications. INTER-NOISE and NOISE-CON Congress and Conference Proceedings 263, 5 (2021) 1053–1061.
  25. C. Dreier, X. Vogt, W. Schröder, M. Vorländer: Acoustic source characterization of simulated subsonic jet noise using spherical harmonics. Journal of the Acoustical Society of America 154, 1 (2023) 167–178.
  26. K. Risse, E. Anton, T. Lammering, K. Franz, R. Hoernschemeyer: An integrated environment for preliminary aircraft design and optimization. In: 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics and Materials Conference, Honolulu, Hawaii, 23–26 April 2012. American Institute of Aeronautics and Astronautics, 2012.
  27. H. Fastl, E. Zwicker: Psychoacoustics: facts and models. Springer Series in Information Sciences, No. 22, 3rd edn. Springer, Berlin/New York, 2007.
  28. A. Quarteroni, R. Sacco, F. Saleri: Numerical mathematics. Texts in Applied Mathematics, No. 37, Springer, New York, 2000.
  29. International Organization for Standardization: ISO 532-1:2017, Acoustics – Methods for calculating loudness – Part 1: Zwicker method, 2017.
  30. Institute of Technical Acoustics (ITA): VACore, December 2023. Available at https://git.rwth-aachen.de/ita/VACore.

Cite this article as: Schäfer P., Fatela J. & Vorländer M. 2024. Interpolation of scheduled simulation results for real-time auralization of moving sources. Acta Acustica, 8, 9.

