Open Access
Issue
Acta Acust.
Volume 7, 2023
Article Number 9
Number of page(s) 7
Section Virtual Acoustics
DOI https://doi.org/10.1051/aacus/2023002
Published online 20 February 2023

© The Author(s), Published by EDP Sciences, 2023

Licence: Creative Commons. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

Almost 25% of fatal accidents involving vehicles in the workplace occur when the vehicle is reversing [1]. Moreover, from accident reports published by the Occupational Safety and Health Administration (OSHA) from 1972 to 2001, Purswell and Purswell [2] estimated that approximately 43% of the 150 reported accidents that involved vehicles occurred despite the reversing alarm being functional at the time. Consequently, the detectability of reversing alarms is an important safety issue and a better understanding of alarm detectability is required. Two types of alarms are mainly used for reversing vehicles. The first, referred to as tonal in this paper, uses short pure-tone bursts (approximately 1000 Hz, 0.5 s of signal followed by 0.5 s of silence) [3, 4], while the second, referred to as broadband in this paper, uses band-limited noise bursts (approximately from 1500 to 5000 Hz, 0.5 s of signal followed by 0.5 s of silence) [4, 5]. Compared to tonal alarms, broadband noise alarms seem to be easier to localize: their sound propagation pattern is more homogeneous, with strongly attenuated interference effects, because they generate an audible signal over a range of frequencies that includes the most sensitive hearing range, between 2000 and 4000 Hz. Consequently, broadband noise alarms are recommended, as they are easier for operatives to locate, which improves site safety [6].

For safety and ethical reasons related to reversing alarms, the evaluation of detectability can be conducted in a laboratory environment using recording and playback devices. In that field, one of the most widely used techniques consists of recording sounds using a dummy head and reproducing them through headphones. While more sophisticated methods, such as Higher Order Ambisonics (HOA) recordings with binaural reproduction using personalized Head-Related Transfer Functions (HRTF), have been developed, the use of the “dummy-head” technique remains prevalent in industrial companies. The advantages of this procedure are: (1) the ease of implementation and (2) the accurate reproduction of the sound signal at the entrance of the listener’s ear canal when the listener’s head is static. However, this procedure faces known limitations: the HRTF of the dummy head can be slightly different from those of the listeners, hence changing spatial sound localization cues [7]. Furthermore, in a real listening situation, Interaural Level Difference (ILD), Interaural Time Delay (ITD), spectral cues and binaural cues change with the tiniest head rotation. In contrast, with binaural reproduction without head tracking, when the listener rotates their head, the auditory scene presented over headphones also rotates, which contradicts what is experienced when the head rotates in natural hearing conditions, i.e. on site.

These underlying limitations of binaural reproduction raise some questions on the possible utilization of binaural recordings in the context of measuring detection time: is this reproduction method realistic enough to provide a reliable assessment of alarm detectability in noise? In the absence of sufficient conclusive evidence in the literature, we decided to evaluate the realism of binaural listening in the context of evaluating detectability of reversing alarms in noise. Binaural listening test results were benchmarked against evaluations performed in a fully immersive sound field reproduction of an acoustic scenario, used as a simplified simulation of the in-situ acoustic reality.

In this paper, the fully immersive reproduction is based on a more general yet adapted sound-field reconstruction (SFR) method with wave field synthesis (WFS) using point source distributions (i.e. the loudspeakers). Loudspeaker gains and phases are adjusted at each frequency and time to reproduce the target sound field as measured by a microphone array [8]. This adjustment relies on multichannel system inversion, where the inverted system is made of all the transfer paths from the loudspeakers to the microphones of the measuring microphone array.

The initial hypothesis to be verified was that detection of alarm signals performed in a fully immersive sound field reproduction is expected to be different from the binaural evaluation as the test-subjects can move their head inside the area bounded by the loudspeaker array, which is not possible for binaural recording and playback without head tracking.

The paper is organized as follows: Section 2 presents the methodology, while results are presented in Section 3 and discussed in Section 4. Conclusions and future works are detailed in Section 5.

2 Method

2.1 Participants

Twenty individuals having hearing threshold below 25 dB HL (Hearing Level) participated in the first experiment carried out at the Laboratoire Vibrations Acoustique (LVA) in INSA-Lyon, France. Ten participants were master students and ten were staff of the laboratory.

Twenty-three individuals having hearing threshold below 25 dB HL participated in the second experiment carried out at the Groupe d’Acoustique de l’Université de Sherbrooke (GAUS), Canada. The subjects who took part in the second experiment were all distinct from those of the first experiment. Eleven participants were graduate students, nine were interns and three were staff of the laboratory.

This study was carried out in accordance with the Declaration of Helsinki and was reviewed and approved by the Comité d’éthique pour la recherche, the Internal review Board at Université de Sherbrooke, Québec, Canada. Informed consent was obtained from all participants before they were enrolled in the study.

2.2 Experimental procedure

Two sets of experiments were conducted to investigate the influence of the spatial sound reproduction approach on the detectability of reversing alarms in laboratory environments. The first set of experiments, conducted in Canada at GAUS, aimed to assess the detection of reversing alarms using SFR with WFS and a loudspeaker array. WFS was selected as the reference baseline since it is a physical approach to spatial audio that relies on the sound field reproduction paradigm. In addition, the accuracy of the reproduced sound field was verified physically using a microphone array. The second set of experiments, conducted in France at the LVA, aimed to assess the detection of reversing alarms using headphone playback of binaural recordings inside a double-walled audiometric booth.

In both experiments, the participants remained seated and had to track a 2 cm² square moving on a computer screen. The square remained fixed for a duration randomly selected between 0.8 and 1.5 s, after which its position changed randomly. The purpose of this task was to draw the participant’s attention away from the main task of the test, that is, alarm sound detection, as is the case in real-life situations. Simultaneously, participants had to detect the sound of an increasingly louder back-up alarm as soon as it became audible by pressing a key on a computer keyboard. Detection times (with respect to a known increasing level alarm sequence) were stored in the computer running the experiment and were used to compute the corresponding detection levels. A total of 2 × 10 tonal alarms and 2 × 10 broadband alarms (cf. Tab. 1; A and B correspond to two sequences of the same alarm) were presented to each participant. The sound environment (excluding the alarms) was played at 75 dB and presented over headphones or loudspeakers, as detailed in the next section.

Table 1

Sequences of the alarms used during the experiments conducted at the GAUS and at the LVA. The alarm start times are formatted using the notation HH:mm:ss.SS.

2.3 Sound environment and sound capture

2.3.1 On-site recordings

In-situ recordings were used for the generation of reproduced sound environment. Measurements of the target sound field were performed using a custom microphone array at an open-air lime mine with moving and stationary large machinery as well as in a factory. Recordings were performed at five different locations on the site. Pictures of the measurements in the lime mine are shown in Figure 1.

Figure 1

Sound environment capture in an open-air lime mine using a double-layer circular microphone array. Recordings were performed near a mine crusher (left) and with trucks passing a few meters from the array (right), among others.

The microphone array used for the sound field capture consists of a circular double-layer (alternating between inner and outer radius) array of 1.23 m inner radius and 1.27 m outer radius [9]. The array is made of 85 custom-built microphones and preamplifiers. Five of the microphones are located inside the circular region, one of which is used as the main reference and is located at the center of the circular array. The microphone array was calibrated prior to the on-site measurements. All recordings were made at a sampling rate of 48 kHz.

2.3.2 Stimuli generation

The sound field reproduction was performed in a room at Université de Sherbrooke equipped with a square array of 96 loudspeakers of approximately 4 m by 4 m, at 1.55 m above the ground. Adjacent loudspeakers are separated by a distance of 16.25 cm. Four subwoofers, used to generate the frequency content below 120 Hz, are located in the four corners of the square loudspeaker array. Therefore, it is not expected that the reproduction will be spatially accurate below 120 Hz. The subwoofer signals are derived as a downmix of the corresponding 24-loudspeaker bars.

A 30-minute sound environment was designed from the overlay of several sound environments captured in the factory and the mining site, using the microphone array described in Section 2.3.1. The superposition of these different environments was aimed at obtaining a relatively realistic stationary, broadband factory sound environment [the average fluctuation strength, computed using ArtemiS SUITE with Psychoacoustics Module (HEAD acoustics, Herzogenrath, Germany), was below 0.45 mvacil in every critical band of hearing].

Then, the virtual sound environment was reproduced using the loudspeaker array. The loudspeaker driving signals were obtained by solving a multi-channel equalization problem with the microphone array now placed in the center of the loudspeaker array. The loudspeaker inputs are calculated at each frequency so as to provide the same complex sound pressures at the microphone locations as measured on site. In the multi-channel inversion, the loudspeakers are assumed to behave as point sources in free-field conditions [9]. By doing so, the reproduced sound environment is accurate, in terms of sound pressure, for the entire area covered by the microphone array, i.e. a circle of about 1.3 m in diameter. An average sound pressure level of 75 dB lin (± 1 dB) was measured, in the center of loudspeaker array, using a SQuadriga II recorder with a long averaging time.
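The multichannel inversion described above can be illustrated with a minimal single-frequency sketch. This is not the authors' implementation: it assumes free-field point-source loudspeakers (as stated in the text) and adds Tikhonov regularization, which is an assumption of this sketch. The function names `greens_matrix` and `loudspeaker_drives`, the speed of sound of 343 m/s, the regularization value and the toy geometry are all hypothetical.

```python
import numpy as np

def greens_matrix(mic_xy, spk_xy, freq_hz, c=343.0):
    """Free-field point-source transfer matrix (microphones x loudspeakers)."""
    k = 2.0 * np.pi * freq_hz / c  # wavenumber in rad/m
    # Pairwise microphone-to-loudspeaker distances, shape (M, L)
    d = np.linalg.norm(mic_xy[:, None, :] - spk_xy[None, :, :], axis=-1)
    return np.exp(-1j * k * d) / (4.0 * np.pi * d)

def loudspeaker_drives(G, p_target, reg=1e-10):
    """Regularized least-squares solution of G q = p_target at one frequency."""
    n_spk = G.shape[1]
    # Normal equations with Tikhonov regularization (assumption of this sketch)
    A = G.conj().T @ G + reg * np.eye(n_spk)
    return np.linalg.solve(A, G.conj().T @ p_target)

# Hypothetical toy geometry (metres): 8 loudspeakers on a 4 m square,
# 6 microphones on a 0.6 m radius circle centred in the array.
spk_xy = np.array([[-2, -2], [0, -2], [2, -2], [2, 0],
                   [2, 2], [0, 2], [-2, 2], [-2, 0]], dtype=float)
angles = 2.0 * np.pi * np.arange(6) / 6
mic_xy = 0.6 * np.column_stack([np.cos(angles), np.sin(angles)])

G = greens_matrix(mic_xy, spk_xy, freq_hz=500.0)
p_target = G[:, 0].copy()  # target: pressures radiated by loudspeaker 0 alone
q = loudspeaker_drives(G, p_target)

# Relative mismatch between reproduced and target pressures at the microphones
rel_error = np.linalg.norm(G @ q - p_target) / np.linalg.norm(p_target)
print(f"relative reproduction error at the microphones: {rel_error:.2e}")
```

In a full system this solve would be repeated at every frequency bin, with the target pressures taken from the on-site array recordings rather than from a simulated source.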

This 30-minute sound environment (Fig. 2) was split into four segments of about 7–8 min to provide several break periods to participants. Each of these four segments was mixed with 10 alarm-segments of 20 s each. An alarm-segment consisted of 20 alarms of 0.5 s each separated by 0.5 s of silence. All alarms of the same alarm-segment were of the same type: either tonal or broadband. They were mixed with the sound environment such that the interval between two alarm-segments was set between 10 s and 40 s duration. The start of each alarm-segment was randomly placed in the 7–8 min sound segment. Finally, the direction of arrival of each alarm-segment was randomly drawn between front, back, right, and left. These directions of arrival were reproduced with a single loudspeaker from the array, for a strictly physical position (i.e. not virtual) of the alarm sound source. It is important to note that in our case, the alarms stimuli, either tonal (Fig. 3, left) or broadband (Fig. 3, right), were synthesized for a total control of the time and frequency parameters. The sequences for each type of alarm are presented in Table 1.

Figure 2

Frequency spectra of the sound environment generated using sound-field reconstruction (SFR) method with wave field synthesis (WFS). The spectra were recorded binaurally using a head and torso simulator equipped with intra-auricular microphones from an average over 30 s of signal. The grey and black curves correspond respectively to the signal recorded using the right and the left ear of the head and torso simulator.

Figure 3

Frequency spectrum of the tonal (left) and broadband (right) reversing alarms. Both alarm signals include frequency components in the 500 Hz to 2500 Hz frequency range, as required by ISO 7731:2003 [10].

To simulate a vehicle backing up towards the participant, the overall alarm level was increased step-by-step using a 1 dB step every second. The signal (alarm sound) to background noise ratio therefore varied from −30 dB to −10 dB for each alarm segment.
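As a rough illustration of how such a stimulus could be synthesized, the sketch below builds one tonal alarm-segment from 0.5 s bursts separated by 0.5 s of silence, raising the level by 1 dB at each one-second period. It is not the authors' synthesis code. The function name `tonal_alarm_segment` is hypothetical, and gains are expressed relative to an assumed reference amplitude of 1.0 standing for the 0 dB signal-to-noise point. With 20 one-second steps starting at −30 dB, the last burst in this sketch is at −11 dB; how the −10 dB endpoint is reached is not specified in the text.

```python
import math

def tonal_alarm_segment(fs=48000, f0=1000.0, n_bursts=20,
                        burst_s=0.5, gap_s=0.5,
                        start_db=-30.0, step_db=1.0):
    """Return one alarm-segment as a list of samples (pure-tone bursts)."""
    n_on = int(round(burst_s * fs))   # samples per burst
    n_off = int(round(gap_s * fs))    # samples per silent gap
    samples = []
    for k in range(n_bursts):
        level_db = start_db + step_db * k      # level of burst k in dB
        gain = 10.0 ** (level_db / 20.0)       # dB to linear amplitude
        samples.extend(gain * math.sin(2.0 * math.pi * f0 * n / fs)
                       for n in range(n_on))
        samples.extend([0.0] * n_off)
    return samples

segment = tonal_alarm_segment()
print(len(segment) / 48000)  # duration in seconds: 20.0
```

A broadband variant would replace the sine burst with band-limited noise while keeping the same gating and level schedule.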

As the same stimuli were to be presented binaurally over headphones during the second set of experiments, binaural recordings of this auditory laboratory scene were performed using a G.R.A.S. KEMAR head and torso simulator type 45BA placed at the centre of the loudspeaker array. The ears of the manikin were placed in the same plane as the loudspeakers. The manikin was equipped with G.R.A.S. large ears type KB0065 (right ear) and KB0066 (left ear). Both ears were fitted with G.R.A.S. 1/2″ prepolarized pressure microphones 40AD and G.R.A.S. preamplifiers type 26CB. This recording was later used for binaural reproduction.

2.3.3 Stimuli presentation

At the GAUS, stimuli were presented using a 96-loudspeaker system while the participants remained seated at the centre of the loudspeaker array with their head approximately in the loudspeaker plane. At the LVA, stimuli were presented binaurally using Sennheiser HD600 electrodynamic headphones while the participants remained seated on a chair inside a double-walled audiometric booth. The recorded signals were filtered in order to compensate for the frequency response of the headphones.

3 Results

Figure 4 presents the average SNR for each detected alarm, computed from the average detection time for both sets of experiments using the following formula:

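$$\mathrm{SNR}_{\mathrm{detection}} = \mathrm{SNR}_{\mathrm{alarm}}(t = t_0) + \Delta p_{\mathrm{alarm}} \, t_{\mathrm{detection}} \qquad (1)$$

where $\mathrm{SNR}_{\mathrm{detection}}$ is the SNR at which the alarm was detected, $\mathrm{SNR}_{\mathrm{alarm}}(t = t_0)$ corresponds to the lowest initial SNR for the alarm presented to the participant (i.e., −30 dB), $\Delta p_{\mathrm{alarm}}$ corresponds to the increment of the SPL of the alarm presented to the participant (i.e., +1 dB/s), and $t_{\mathrm{detection}}$ corresponds to the alarm detection time. Individual detection times, undetected alarm scores (i.e., the percentage of missed alarms) and click scores (i.e., the percentage of clicks on the square moving on the monitor), obtained for both sets of experiments, are shown in Tables 2 and 3.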
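As a small worked example of the relation described above, the following sketch converts a detection time into the SNR at detection, assuming the stated start SNR of −30 dB and a level increment of +1 dB/s applied linearly. The function name `detection_snr_db` and the example detection time are hypothetical.

```python
def detection_snr_db(t_detection_s, snr_start_db=-30.0, ramp_db_per_s=1.0):
    """SNR (dB) at detection for an alarm whose level rises linearly in time."""
    if t_detection_s < 0:
        raise ValueError("detection time must be non-negative")
    return snr_start_db + ramp_db_per_s * t_detection_s

# Hypothetical detection 12.5 s after the start of the alarm-segment
print(detection_snr_db(12.5))  # -17.5
```

Under this relation, a reduction of about 1 s in detection time corresponds to a detection threshold about 1 dB lower.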

Figure 4

Influence of the sound reproduction method on the detectability of tonal and broadband alarm sounds.

Table 2

Average detection times of reversing alarms using SFR with WFS.

Table 3

Average detection times of reversing alarms using binaural reproduction with headphones.

4 Discussion

The low target detection error rates, referred to as the “undetected alarm” scores in Tables 2 and 3, demonstrate that participants were able to easily detect each alarm from the background noise during the experiments (mean = 1.63% for the SFR with WFS set of experiments and 2.50% for the binaural set of experiments). Furthermore, the high click scores (mean = 94.08% for the SFR with WFS set of experiments and 94.81% for the binaural set of experiments) confirm that the participants’ attention was correctly drawn to the target detection task.

SNR results presented in Figure 4 indicate that the detection threshold values of the different alarm conditions are statistically significantly different, regardless of the sound reproduction method or the type of alarm. Indeed, this observation is strongly supported by a Friedman Rank Sum Test procedure [11], computed using R 3.6.1 with MASS 7.3-51.4 [12], which rejects the null hypothesis at a 1% significance level (p = 6.376 × 10⁻¹¹), confirming that detection threshold depends on both the type of alarm and the method of sound reproduction.

Additionally, detection time results from Tables 2 and 3 suggest that the tonal alarm is detected earlier than the broadband alarm, regardless of the presentation method (Δ ≈ −2.7 s for SFR with WFS, Δ ≈ −1.0 s for binaural reproduction). Such an observation was expected because the energy of tonal alarms is concentrated in a narrower frequency band than that of broadband alarms. Therefore, for the same SPL, tonal alarms are easier to detect than broadband alarms. Wilcoxon rank sum tests strongly support this observation: they reject the null hypothesis at a 1% significance level, confirming that the average detection time of a broadband alarm presented using SFR with WFS is higher than that of a tonal alarm presented using SFR with WFS (p = 4.2945 × 10⁻⁶). Similarly, the average detection time of a broadband alarm presented using binaural reproduction with headphones is higher than that of a tonal alarm presented using binaural reproduction with headphones (p = 4.7014 × 10⁻³).

Also, presenting the stimuli using SFR with WFS reduces the average detection time compared to binaural presentation, for the two types of alarms (Δ ≈ −3.8 s for tonal alarms, Δ ≈ −2.2 s for broadband alarms). This observation is strongly supported by Wilcoxon rank sum tests, which reject the null hypothesis at a 1% significance level, confirming that the average detection time of a tonal alarm presented using binaural reproduction with headphones is higher than that of a tonal alarm presented using SFR with WFS (p = 8.2744 × 10⁻⁸). Similarly, the average detection time of a broadband alarm presented using binaural reproduction with headphones is higher than that of a broadband alarm presented using SFR with WFS (p = 3.3335 × 10⁻⁶).
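For readers unfamiliar with the Wilcoxon rank sum test used in these comparisons, the sketch below shows how such a test can be computed for two independent groups. It is an illustration only, not the authors' R procedure: it uses a two-sided normal approximation without continuity or tie correction, the function name `rank_sum_test` is hypothetical, and the detection times in the usage example are made-up values, not data from this study.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test, normal approximation.

    Returns (U statistic of x, z score, two-sided p-value).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)  # no tie correction
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return u, z, p

# Made-up detection times in seconds, for illustration only
times_condition_a = [8.1, 9.4, 7.7, 10.2, 8.8, 9.0]
times_condition_b = [11.5, 12.3, 10.9, 13.0, 12.1, 11.2]
u, z, p = rank_sum_test(times_condition_a, times_condition_b)
print(u, round(z, 3), round(p, 4))
```

For small samples an exact test is usually preferred over the normal approximation used here.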

These differences between the SFR with WFS results and the binaural results were expected. Indeed, in the SFR with WFS experiments, participants were able to move their heads to exploit the spatial variations of the sound field (more marked in the tonal case), thus facilitating alarm detection during signal presentations compared to the binaural experiments.

5 Conclusion and future work

For replicability, safety, and economic reasons, performing alarm detectability tests in situ is a challenge. Consequently, such evaluations are classically performed in a laboratory environment, using stimuli recorded with a dummy head and presented through headphones. However, assessing alarm detection using such a static binaural technique is not optimal since participants cannot exploit the spatial variations of the sound field by moving their head, which can greatly improve the localization of the sound to be detected [13–17].

In this paper, we benchmarked binaural alarm detection tests against those performed in a spatial sound field reproduction of the original scene using a loudspeaker array and SFR with WFS. Our results suggest that tonal alarms are detected earlier than broadband alarms, probably because the energy of tonal alarms is concentrated in a narrower frequency band compared to broadband alarms. Furthermore, alarms presented using a loudspeaker array with SFR with WFS had a lower detection threshold than when presented using headphones, regardless of the type of alarms (tonal or broadband). The proposed explanation is that, in the SFR with WFS experiments, participants were able to move their heads to exploit the spatial variations of the sound field (more marked in the tonal case), thus facilitating alarm detection during signal presentations compared to the binaural experiments.

Future work will include binaural testing with head tracking to confirm whether binaural listening introduces a bias when evaluating alarm detection in noisy environments because of the lack of head movement.

Conflict of interest

The authors declare no conflict of interest.

Acknowledgments

The authors would like to thank all the participants for giving their time to take part in this study. The authors also wish to express their appreciation to the “Institut de recherche Robert-Sauvé en santé et sécurité au travail” (IRSST), Montréal, Canada, in contributing to this research work. This study received financial support from the International Research Project (IRP) “Jacques Cartier”, France, and from the “Centre Lyonnais d’Acoustique” (ANR-10-LBX-060), France.

References

  1. Health and Safety Executive: Improving the safety of workers in the vicinity of mobile plant. 2001. Report Number 358, Retrieved from http://www.hse.gov.uk/research/crr_pdf/2001/crr01358.pdf (Accessed September 3, 2019).
  2. J.P. Purswell, J.L. Purswell: The effectiveness of audible backup alarms as indicated by OSHA accident investigation records. Advances in Occupational Ergonomics and Safety 4 (2001) 444–452.
  3. C. Laroche, M.J. Ross, L. Lefebvre, R. Larocque: Determination of the optimal acoustic characteristics of backup alarms. Research Report R-117/IRSST [In French], Montreal, Canada. 1995.
  4. V. Vaillancourt, H. Nélisse, C. Laroche, C. Giguère, J. Boutin, P. Lafferrière: Comparison of sound propagation and perception of three types of backup alarms with regards to worker safety. Noise and Health 15, 67 (2013) 420–436.
  5. H.P. Morgan, R.J. Peppin: Noiseless and safer back-up alarms, 2008, in Proceedings of the NOISE-CON 2008 conference from The Institute of Noise Control Engineering on July 28–31, 2008, Dearborn, Michigan, USA, 8 p.
  6. V. Vaillancourt, H. Nélisse, C. Laroche, C. Giguère, J. Boutin, P. Lafferrière: Safety of workers behind heavy vehicles – assessment of three types of reverse alarm. Research Report R-833/IRSST, Montreal, Canada. 2012.
  7. D.R. Begault, E.M. Wenzel, M.R. Anderson: Direct comparison of the impact of head tracking, reverberation, and individualized head-related transfer functions of the spatial perception of a virtual speech source. Journal of the Audio Engineering Society 49 (2001) 904–916.
  8. P.A. Gauthier, C. Camier, F.A. Lebel, Y. Pasco, A. Berry, J. Langlois, C. Verron, C. Guastavino: Experiments of multichannel least-square methods for sound field reproduction inside aircraft mock-up: Objective evaluations. Journal of Sound and Vibration 376, 18 (2016) 194–216.
  9. A. Berry, P.-A. Gauthier, H. Nélisse, F. Sgard: Reproduction of industrial sound environments applicable to audibility studies on alarms and other sound signals, in The Context of Occupational Health and Safety: Proof of Concept. Research Report R-937/IRSST [In French], Montreal, Canada. 2016.
  10. ISO: ISO 7731: 2003 Ergonomics – danger signals for public and work areas – auditory danger signals. International Organization for Standardization, Geneva. 2003.
  11. W.D. Wayne: Friedman two-way analysis of variance by ranks. Applied non parametric statistics. 2nd ed., PWS-Kent, Boston. 1990, p. 503.
  12. R Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2014.
  13. W.R. Thurlow, J.W. Mangels, P.S. Runge: Head movements during sound localization. Journal of the Acoustical Society of America 42 (1967) 489–493.
  14. W.R. Thurlow, P.S. Runge: Effect of induced head movements on localization of direction of sounds. Journal of the Acoustical Society of America 42 (1967) 480–488.
  15. S. Perrett, W. Noble: The contribution of head motion cues to localization of low-pass noise. Perception & Psychophysics 59 (1997) 1018–1026.
  16. S. Perrett, W. Noble: The effect of head rotations on vertical plane sound localization. Journal of the Acoustical Society of America 102 (1997) 2325–2332.
  17. M. Kato, H. Uematsu, M. Kashino, T. Hirahara: The effect of head motion on the accuracy of sound localization. Acoustical Science and Technology 24 (2003) 315–317.

Cite this article as: Valentin O. Grandjean P. Girin C. Gauthier P-A. Berry A, et al. 2023. Influence of sound spatial reproduction method on the detectability of reversing alarms in laboratory conditions. Acta Acustica, 7, 9.
