Acta Acustica, Volume 7, 2023
Topical Issue - Audio for Virtual and Augmented Reality
Article Number: 59
Number of pages: 2
DOI: https://doi.org/10.1051/aacus/2023053
Published online: 17 November 2023
Short Communication
Communication acoustics and electrical-hearing performance: Can artificial intelligence be of help?★
Jens Blauert*
Ruhr-Universität Bochum, 44780 Bochum, Germany
* Corresponding author: jens.blauert@rub.de
Received: 13 September 2023
Accepted: 30 September 2023
Acoustic or electrical-hearing skills are indispensable for auditory communication. However, sensory auditory perception alone is not sufficient for effective communication. We know from cognitive psychology that:
People do not react to what they hear as such, but rather to what it means to them in their current action-specific, emotional, and cognitive situation, a much more complex process than mere sensory perception.
This means that, in addition to the form of the signal streams that arrive at the listeners, their function, i.e., the meaning they convey, is of utmost behavioral significance.
From semiotics, we learn that all communication with and within the environment occurs via signal streams that denote specific meanings. These meaning-bearing signal streams are called “signs” in semiotics. Signs are often classified according to their level of abstraction [1], namely as indices, icons, or symbols. Indices are understood immediately, e.g., the crackling sound of a fire. Icons contain significant attributes of specific meanings while neglecting irrelevant details, e.g., the characteristic siren sound that indicates an emergency vehicle. Symbols are signs whose meaning must first be learned, e.g., Morse-code beeps, which stand for letters or numerals.
To communicate meanings, it is necessary to code the signal streams sent to the listeners into clearly identifiable signs.
The proposal made in this short communication is as follows: instead of leaving it to the listeners to extract meanings from the corrupted signals that they receive, we delegate the extraction of meaning to an artificial-intelligence (AI) system, in other words, to a meaning recognizer that works on the uncorrupted acoustic signal streams.
Meaning recognizers are currently a major topic of AI research, and their performance is rapidly increasing. For limited sets of meanings, such systems are already readily available, e.g., the AI-supported spoken-language-dialog systems Alexa (Amazon) and Siri (Apple). Even for unconstrained sets of meanings, such systems are making rapid progress, powered by self-learning systems with increasing “world knowledge”, e.g., the generative pre-trained transformer systems ChatGPT (OpenAI) and Bing (Microsoft).
Also, meaning recognizers are not limited to speech sounds as input signals. They can be trained to deal with sounds of other types (noise recognizers). Further, sensor-output signals can be recognized, interpreted, and then communicated in an application-related manner via auditory symbols (Hearcons) as instructions for action, e.g., for navigation purposes.
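To make such a processing chain more concrete, the following minimal Python sketch illustrates one way it could be organized: a meaning recognizer works on the clean signal stream, and its output is mapped onto a pre-defined auditory sign. The function names, the hearcon catalog, and the confidence threshold are hypothetical illustrations, not part of any existing system; a real recognizer would replace the placeholder function.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SignType(Enum):
    """Semiotic sign classes in the sense of [1]."""
    INDEX = auto()    # understood immediately, e.g., the crackling of a fire
    ICON = auto()     # carries characteristic attributes, e.g., a siren sound
    SYMBOL = auto()   # meaning must be learned, e.g., Morse-code beeps


@dataclass
class Meaning:
    label: str         # e.g., "emergency_vehicle", "turn_left"
    confidence: float  # recognizer certainty in [0, 1]


# Hypothetical catalog mapping recognized meanings onto auditory signs
# ("Hearcons"), stored here as file names of pre-rendered sounds.
HEARCON_CATALOG = {
    "fire": (SignType.INDEX, "hearcons/fire_crackle.wav"),
    "emergency_vehicle": (SignType.ICON, "hearcons/siren_icon.wav"),
    "turn_left": (SignType.SYMBOL, "hearcons/nav_left.wav"),
}


def recognize_meaning(audio_frame: bytes) -> Meaning:
    """Stand-in for an AI meaning recognizer (a sound classifier or
    spoken-language-understanding model) operating on the uncorrupted
    acoustic signal stream; the fixed result is a placeholder only."""
    return Meaning(label="emergency_vehicle", confidence=0.91)


def select_hearcon(meaning: Meaning, threshold: float = 0.8):
    """Map a recognized meaning onto a clearly identifiable auditory sign.
    Below the (assumed) confidence threshold, nothing is passed on rather
    than risking a misleading sign."""
    if meaning.confidence < threshold:
        return None
    return HEARCON_CATALOG.get(meaning.label)


if __name__ == "__main__":
    meaning = recognize_meaning(b"raw microphone samples")
    hearcon = select_hearcon(meaning)
    if hearcon is not None:
        sign_type, sound_file = hearcon
        print(f"Present {sign_type.name} sign: {sound_file}")
```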
To explore these ideas further, answers to the following questions are helpful. The reader is invited to consider them.
Would
- in electrical hearing, augmented auditory reality enhance interaction with the environment?
- artificial-intelligence software that automatically assigns meaning to speech-signal streams (meaning recognizers) be helpful in electrical hearing?
- auditory signs that denote meaning (indices, icons, symbols) improve communication performance when transmitted via carrier signals tailored to the individual’s available communication channels?
If we want to convey meanings, it is instrumental to code them into signs, in other words, into uniquely identifiable perceptual entities that denote specific meanings.
In this context, it is advantageous to optimize the streams of signals that the receivers interpret as signs in such a way that they match as well as possible the specifications of the specific transmission channels available to each listener. These channels can be very limited indeed, e.g., in cases where cochlear implants do not work adequately, or with brainstem implants.
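As a purely illustrative sketch of such channel matching, the selection of a sign coding could be driven by a simple profile of the listener’s channel. The channel description, the thresholds, and the chosen sign classes below are assumptions made up for this example, not clinical recommendations.

```python
from dataclasses import dataclass


@dataclass
class ChannelProfile:
    """Hypothetical description of a listener's available transmission
    channel, e.g., the usable spectral and temporal resolution of an
    implant. Fields and numbers are illustrative assumptions only."""
    spectral_channels: int
    temporal_resolution_ms: float


def choose_sign_coding(profile: ChannelProfile) -> str:
    """Pick a sign class that the given channel can plausibly carry:
    richer channels keep icon-like signs with their characteristic
    attributes, very limited channels fall back to sparse, learned
    symbols such as simple pulse patterns."""
    if profile.spectral_channels >= 12 and profile.temporal_resolution_ms <= 10:
        return "icon"            # preserve characteristic sound attributes
    if profile.spectral_channels >= 4:
        return "symbol"          # compact learned pattern, e.g., rhythmic beeps
    return "symbol_minimal"      # near-binary pulse code, Morse-like


if __name__ == "__main__":
    limited = ChannelProfile(spectral_channels=2, temporal_resolution_ms=25.0)
    print(choose_sign_coding(limited))  # -> symbol_minimal
```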
One possible implementation of this idea could be to present listeners who hear electrically with an augmented auditory virtual environment that includes computer-generated, or at least computer-modified, auditory signs. Of course, the listeners should be aware that a computer is talking, as this will modify their interpretation of the meanings of the signs. Such behavior is, by the way, also common in the “real” world: people interpret the speech of a child differently from that of an adult, and that of a layperson differently from that of an expert.
Given this situation, it is worth keeping in mind that, in auditory communication, audition and cognition always go together (Fig. 1)!
Figure 1. Audition and cognition come in couples.
Simplifying the auditory world with a strong focus on the meaning, i.e., the function, would thus improve the performance of electrical hearing, particularly in complicated cases.
Consequently, it is indispensable to increase our efforts in exploring the cognitive aspects of electrical-hearing performance. There is no doubt:
For proper communication, it is the MEANING that ultimately counts!
References
- U. Jekosch: Assigning meaning to sounds – semiotics in the context of product-sound design, in: J. Blauert (Ed.), Communication Acoustics, Springer, Berlin, Heidelberg, New York, 2005.
© The Author(s), Published by EDP Sciences, 2023
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.