Using a blind EC mechanism for modelling the interaction between binaural and temporal speech processing

Topical Issue - Auditory models: from binaural processing to multimodal cognition

Open Access

Issue: Acta Acust., Volume 6, 2022, Topical Issue - Auditory models: from binaural processing to multimodal cognition
Article Number: 21
Number of page(s): 14
DOI: https://doi.org/10.1051/aacus/2022009
Published online: 26 May 2022

