Interactive real-time auralization of airborne sound insulation in buildings

Sound insulation auralization can be used as a valuable tool to study the perceptual aspects of sound transmission in built environments and to assess noise effects on people. It may help to further develop guidelines for building constructions. One advanced goal of real-time sound insulation auralization is to appropriately reproduce the conditions under which noise affects human perception and cognitive performance in dynamic and interactive situations. These effects depend on the kind of noise signal (e.g. speech, music, traffic noise) and on the context. This paper introduces a sound insulation auralization model. The sound insulation filters are constructed for virtual buildings with respect to complex sound propagation effects for indoor and outdoor sound sources. The approach considers the source room sound field with direct and diffuse components along with source directivity and position. The transfer functions from the source room to the receiving room are subdivided into patches, which also covers composite building elements and thus represents actual building situations in more detail. Furthermore, the receiving room acoustics include the reverberation of the room based on its mean free path and absorption, and binaural transfer functions between its radiating wall elements and the listener. This more exact sound insulation model agrees reasonably well with the ISO standard (i.e. diffuse field theory) under standard settings. It is also shown that the sound field significantly influences the energies transmitted via building elements, depending on the directivity and position of the source. The proposed method is validated as a general scheme and includes more details for real-time auralization in specific situations, especially in cases where the simplified diffuse sound field approach fails. It can be used in interactive Virtual Reality (VR) systems, which opens new opportunities for psychoacoustic research on noise effects on humans.


Introduction
There is great concern about growing annoyance due to noise in the built environment. Although traffic noise is steadily increasing in densely populated urban areas, building structures and the corresponding guidelines or standards for sound insulation requirements are still very similar to those of decades ago [1]. In multifamily apartments and houses, people are annoyed by neighbour noise [2]. Background noise in work environments, for example background speech, reduces concentration and performance during physical or mental work such as conversations or phone calls, and is considered a negative feature of office environments [3]. The spectral characteristics of background speech from neighbouring rooms depend strongly on the sound insulation curves of the building constructions separating the rooms. The basic principle of building acoustics auralization is to simulate the alteration of a sound signal from its source to the receiving end by transmission through building structures [4]. The auralization of an office-to-office situation, for example, where speech spoken in one office is transmitted through an open door or through building structures to a neighbouring office, requires modelling sound propagation in both rooms, i.e. its generation and transmission from walls, and the insulation characteristics of the direct and flanking wall elements between the offices. The auralization technique in building acoustics was first introduced by Vorländer and Thaden [5,6] and applied by Schlittmeier et al. [7] in an auditory cognition experiment on the irrelevant speech effect. Because auralization can include excitation sounds and sound propagation models in a flexible way, it can be applied to create situation-specific stimuli for psychoacoustic tests of noise effects on humans.
So far, such tests have used pre-calculated virtual acoustic scenes, in which the test subject is asked to respond without much freedom of head and body movement or of view direction. The next refinement is to separate the direct and diffuse sound fields in the source room in order to include directional sources, which was introduced by Rodríguez-Molares [8] as an extension of Thaden's work [6]. Acoubat, BASTIAN and SONarchitect are implementations in commercial software tools. The specific sound pressure field can then be studied with respect to its dependence on the positions and orientations of sources. All building acoustics simulation methods [5,6,8] include the construction of sound insulation filters derived from ISO 12354 (Parts 1 and 3) [9,10]. Subsequently, these filters were used to calculate sound transmission paths from the source to the receiver, placed in adjacent rooms of a workplace. In [5,6,8], the building elements are compact in the sense that a whole wall is represented by one transmission coefficient and by one secondary source, with several simplifications. In the receiving room, these models assume that the sound apparently radiates from one point (i.e. the centre of the radiating receiver wall) representing the whole (bending) wave pattern on the wall [5]. For large walls this might be too rough an approximation compared with the spatially distributed incident (transmitted) power on (from) the walls. For façades with geometric patterns of windows and massive constructions, for example, the sound incidence angles and distances are specific to each element, which is not adequately covered by just one transmission path for the façade.
To overcome these restrictions, the existing models can be extended to distributed secondary sources for composite and large finite walls, to angle-dependent transmission and façade sound insulation, and to a building acoustics auralization platform integrated in VR systems. This gives the listener freedom of movement during interactive perceptual studies of sound insulation and noise effects. In this way, studies on sound perception can be performed with a more ecologically valid approach [11]. The key to this new approach is an implementation in an auralization framework with overall real-time performance, which allows integration in virtual reality systems. This paper presents an approach for interactive real-time binaural filter construction for the sound transmission between adjacent office rooms separated by building elements. Sound transmission through façades is presented as a second example study. To achieve more precision in physical features, the source room is included with separate direct and diffuse fields, which depend on the source room characteristics. The source directivity and the spatial variation of the sound field are then taken into account in more detail, which is assumed to be relevant in particular for outdoor sources and façade sound insulation. Likewise, the transfer functions from source to receiving rooms are calculated by subdividing the individual building elements (i.e. large walls) into a grid of finite secondary sound sources with energy distributions of wave patterns on the wall elements. Finally, one-third octave band impulse responses (IRs) of the source and receiving rooms are synthesised from their reverberation times.
Hence, the main difference between the previous and the extended versions of auralization frameworks is that in the extended version the source and the receivers can be placed closer to the room boundaries than what is required for ISO based diffuse field simulation models. In this way, the approach increases realism and plausibility of non-standard situations. It also allows interaction within virtual environments, making a more realistic and immersive scene which can lead towards a tool for advanced subjective evaluations of noise effects in buildings.

Background and related work
Two approaches are commonly used for the prediction of sound and vibration transmission in built-up structures. At low frequencies, numerical methods such as the Finite Element Method (FEM) [12] or semi-analytic methods [13] may provide a quick and efficient calculation of the structural response. These models, however, require computation times which exceed the limits of real-time processing (~50 ms) by orders of magnitude [4-6]. For this reason, statistical approaches such as Statistical Energy Analysis (SEA) are used. SEA is used to calculate the energy exchange between adjacent building elements and the respective energy losses under steady-state conditions. SEA models predict the average response of an ensemble of system elements; the coupling loss factors and modal densities therefore represent ensemble averages [14]. The international standard series ISO 12354 [9,10], for example, is commonly used as a guideline for building constructions and for the prediction of airborne sound insulation. These documents are based on the pioneering work by Gerretsen [15].
ISO 12354-1 (2017) [9] is commonly used to predict frequency-dependent airborne sound insulation metrics such as the sound reduction index $R$ and the standardised sound level difference $D_{nT}$. The standardised level difference can also be expressed by the transmission coefficients $\tau_{ij} = 10^{-0.1 R_{ij}}$ of the transmission path $ij$ between two rooms, see equation (1). Here, $i$ and $j$ denote the source and receiving room wall elements, respectively, for the transmission path $ij$, with receiving room volume $V$ and the area $S_D$ of the separating (direct) element between the two rooms. The resulting average sound pressure level in the receiver room can be calculated for all transmission paths by equation (2). By introducing the (non-normalized) sound energies $p_s^2$ and $p_R^2$ as mean squared pressures in the source and receiving room, respectively, equation (2) can be expressed in the energetic form given in equation (3) [5]. In these formulae, the radiating elements, i.e. the receiving room walls, excite a diffuse field. In the auralization implementations, the radiation from the walls is modelled via secondary sources (SS) (e.g. in [5,6]), which are approximated as point sources at the centre of the walls, floor and ceiling. The balance between the direct and reverberant parts of the sound field is very important for the perception of the spatial characteristics of the room. If $A$ is the equivalent absorption area of a room, the energy balance is computed through the ratio of the energies $p_{dir}^2$ and $p_{rev}^2$ of the direct and reverberant fields at a distance $r$ from the sound source. The question remains of how to calculate the complex sound pressure $p$ from the sound energy $p^2$.
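As a minimal sketch of the relation $\tau_{ij} = 10^{-0.1 R_{ij}}$, the following Python fragment converts path-wise sound reduction indices into transmission coefficients and sums them energetically into an apparent sound reduction index. It assumes that all path values are referenced to the same separating area $S_D$; the function names and the band values are hypothetical and only serve as illustration.

```python
import math

def tau_from_R(R_dB):
    """Transmission coefficient tau = 10^(-0.1 R) for a reduction index R in dB."""
    return 10.0 ** (-0.1 * R_dB)

def apparent_R(path_R_dB):
    """Apparent reduction index from the energetic sum of all path coefficients.

    Assumes every path value is referenced to the same separating area S_D.
    """
    tau_total = sum(tau_from_R(R) for R in path_R_dB)
    return -10.0 * math.log10(tau_total)

# Hypothetical values for one frequency band: one direct and three flanking paths (dB).
paths = [50.0, 60.0, 60.0, 63.0]
R_prime = apparent_R(paths)
```

Because the flanking paths add energy, the apparent value is always lower than the reduction index of the direct path alone.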
For uncorrelated direct and reverberant sound fields, the contribution of the $ij$th transmission path to the sound pressure can be described by real-valued pressures, $p_{R,ij} = p_{R,ij,dir} + p_{R,ij,rev}$, in terms of direct and reverberant fields with the imaginary parts set to zero [5]. The final sound field therefore consists of direct and diffuse sound fields. In linear filter notation it is expressed in the form of an impulse response, $h(t)$, between source and receiver, similar to that described by Vorländer and Thaden [5,6] and later adopted by Rodríguez-Molares [8]. This includes the temporal decay of the room responses. Figure 1 shows typical adjacent source and receiving rooms, which are considered as an example for the synthesis of $h(t)$. A possible technique to synthesize $h(t)$ from the receiving room reverberation time $T$ is to approximate $h(t)$ by a linear combination of one-third-octave-band-filtered exponential decay signals. After calculating the energetically normalized impulse response $h(t)$ of the receiving room for a radiating element $j$, the direct sound is first removed from this impulse response, as it is already included in the transmission path calculation. In its binaural form, this yields the term $\mathrm{HRIR}(t - r_j/c, \theta_j, \varphi_j)$. Subsequently, it is equalized to a white noise spectrum and normalised in energy. This gives the time domain representation of the binaural signal from source to receiver for the transmission path $ij$. All binaural contributions from the radiating elements are summed to obtain the final signal given in equation (4).
In the method described above, various simplifications were made. Firstly, the transfer functions $\tau_{ij}$ between the $i$th element of the source room and the $j$th element of the receiver room are valid for point-to-point transmission only. Secondly, in the receiver room the sound is apparently radiated from one point which represents the whole wave pattern on the wall [5]. The spectrum of the radiated power is exact; however, the wave pattern on the wall element is replaced by a point source at the centre of the wall with a linear phase [6]. Sound radiation from a single source at the centre of a plate is a useful simplification at low to mid frequencies in most rooms, but it is not suitable for typical rooms in which there is no diffuse sound field below 200 Hz. Another aspect is that the source directivity and the distance to the wall are neglected, although they might contribute to specific distributions of sound pressure on the surfaces of the source room walls. The amount of transmitted energy would differ between paths, in particular if sources are placed close to walls (such as loudspeakers or TV sets).

Sound insulation model
In this approach, the source room acoustics are first taken into account by considering a more complex sound field incident on the source room walls, consisting of direct and diffuse field components, as introduced by Rodríguez-Molares [8]. The sound energy transmitted via direct and flanking paths to the adjacent receiving room now depends specifically on the sound pressure hitting the corresponding building elements in the source room, as determined by the source position and its directivity. Secondly, the influence of the reverberation of the source and receiving rooms and the balance between direct and reverberant energies inside the receiving room are incorporated into the sound insulation transfer functions. These transfer functions are developed for extended radiating walls by using a grid of point sources (known as secondary sources in [5]) based on up-to-date structural acoustics theory, and hence provide more detail for sound transmission through direct as well as flanking finite elements. The procedure from [8] is also adopted to synthesize the room impulse response $h(t)$ from the reverberation time $T$, to include the effects of absorption at the room boundaries and to simulate a plausible real room. In the following sections, we discuss each step in more detail.

Sound source directivity
The energetic source directivity $Q_s$ is introduced for computing the sound energy distribution in the source room. As examples, the directivities of a trumpet and a loudspeaker are illustrated in Figure 2.

Room impulse response synthesis
According to [8], the synthesis of the room impulse response $h(t)$ is based on the reverberation time $T$ and artificial noise representing the sum of reflections. The approximated $h(t)$ is obtained by a linear combination of filtered exponential decay signals. Let us suppose a signal $g(t, T) = \sqrt{13.81/T}\; e^{-6.91 t/T}\, n(t)$, as given in equation (5), with $n(t)$ a normally distributed random variable with zero mean and unit standard deviation. The signal $g(t, T)$ decays by 60 dB within the time $T$ in all frequency bands.
The factor $\sqrt{13.81/T}$ in equation (5) normalizes $g(t, T)$ in energy. The impulse response $h(t)$ is synthesized from linear combinations of the filtered signals $g(t, T)$, so that it has a different decay rate in each frequency band (Eq. (6)).
Here, $T_k$ is the reverberation time, and the function $F_k(t, k)$ is a set of band-pass filters in the time domain, one for each $k$th one-third octave band. The function $g(t, T_k)$ tends to a white spectrum because its spectrum is the convolution of the Fourier transform of $e^{-6.91 t/T}$ with the white spectrum of $n(t)$. However, slight variations appear in the spectrum of $h(t)$ because $n(t)$ is statistical in nature, so each band must be adjusted by a factor $a_k$ given in equation (7):
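The decaying-noise signal $g(t, T)$ can be sketched in a few lines of Python. This is a minimal illustration that only uses the factor $\sqrt{13.81/T}$ and the envelope $e^{-6.91 t/T}$ named in the text; the one-third octave band filters $F_k$ and the correction factors $a_k$ are deliberately omitted, and the sampling rate, duration and function names are assumptions made for the example.

```python
import math
import random

def decay_noise(T, fs, duration, rng):
    """Samples of g(t, T) = sqrt(13.81 / T) * exp(-6.91 * t / T) * n(t)."""
    scale = math.sqrt(13.81 / T)
    n_samples = int(duration * fs)
    return [scale * math.exp(-6.91 * (i / fs) / T) * rng.gauss(0.0, 1.0)
            for i in range(n_samples)]

def envelope_level_db(t, T):
    """Level of the deterministic envelope relative to t = 0, in dB."""
    return 20.0 * math.log10(math.exp(-6.91 * t / T))

def expected_energy(T, fs, duration):
    """Expected value of sum(g^2) / fs, using E[n^2] = 1."""
    n_samples = int(duration * fs)
    return sum((13.81 / T) * math.exp(-13.82 * (i / fs) / T)
               for i in range(n_samples)) / fs

# Hypothetical settings: T = 0.5 s, 8 kHz sampling rate, 1.5 s of signal.
rng = random.Random(1)
g = decay_noise(0.5, 8000, 1.5, rng)
level_at_T = envelope_level_db(0.5, 0.5)
energy = expected_energy(0.5, 8000, 1.5)
```

The envelope level at $t = T$ comes out at about -60 dB and the expected energy is close to one, which matches the two properties stated for equation (5).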

Sound field in the source room
In closed spaces (such as rooms) it can be assumed that the direct sound field propagates as in free-field conditions, whereas the reverberant sound field is uniformly distributed [17,18]. This is described in classical sound field theory for sound propagation in rooms. A sound source with directivity $Q_s$ and sound power level $L_W$ produces a sound pressure level $L_s$ at distance $r$ inside a source room with equivalent absorption area $A_s$, as given in equation (8). Equation (8) includes the effects of source room reverberation, the directivity of the source, and the balance between direct and reverberant energy as considered in [6]. The mean squared sound pressure at a point inside the source room, in energetic notation, can then be calculated by equation (9) with $W_a = 10^{-12} \cdot 10^{0.1 L_W}$ (source acoustic power in watts). To design the sound insulation filters, typical rectangular source and receiving rooms are selected, as shown in Figure 3. A loudspeaker is selected as an example sound source to analyse the influence of the source directivity on the energy transmitted to the receiving room walls for the direct as well as the flanking paths. Note that the model uses segmented wall elements ("patches") instead of computing the power incident on the wall as a whole. The incident sound power on any wall element $i$ with an area $S_i$ in the source room is taken as a combination of the direct sound and the diffuse sound field. Let us consider a patch $p$ with surface area $S_{p,i}$. Under diffuse sound field conditions, the reverberant part of the incident sound power, $W_{sp,rev}$, on any patch of any wall element of the source room is given by equation (10).
Under free-field conditions, the direct incident sound power $W_{sp,dir}$ on this patch is calculated with equation (11), where $Q_{s,p}$ is the source directivity, $r_{p,i}$ is the distance and $\theta_{p,i}$ is the incidence angle from the source to an infinitesimally small area $dS_p$ of the patch. These quantities ($Q_{s,p}$, $r_{p,i}$ and $\theta_{p,i}$) depend on the room geometry. Let the integral inside equation (11) be represented by $F_{p,i}$, as mentioned in [8]. By combining equations (10) and (11), the incident sound power of a single patch on the element $i$ takes the form of equation (12). The integral inside equation (12), represented by $F_{p,i}$, can be approximated numerically for patches that are not very large and for source positions that are not very close to the walls. The approximation assumes that $Q_{s,p}$, $r_{p,i}$ and $\theta_{p,i}$ do not vary considerably over the surface $S_{p,i}$, so that these factors can be taken out of the integral, as in equation (13). This approximation was introduced by [8], and it is appropriate after the wall has been subdivided into patches, which relaxes the requirement of uniform conditions over the surface. Here, $r_{p,i}$ is the distance from the source to the centre of patch $p$, with an incidence angle $\theta_{p,i}$, and $Q_{s,p}$ denotes the mean directivity value for $\theta_{p,i}$. In this method, the integral $F_{p,i}$ is calculated by adaptive Simpson integration. This approach from [8] is extended by one more step. After calculating the exponential function from the energetically normalized impulse response $h(t)$ (Eq. (6)), the first part of the exponential decay is removed from this impulse response. The time reference ($t = 0$) is the time of source emission, and the direct sound component is placed at the time corresponding to the distance between the source and the receiving point. The reverberant part of the room impulse response starts at about the mean free time ($\bar{t} = 4V/(cS)$), which is the average time elapsed for sound travelling between two reflections [19].
Here, $V$ is the volume and $S$ is the surface area of the room as shown in Figure 3. Subsequently, it is equalized to a white spectrum and normalised in energy. The resulting impulse response is denoted by $h_{sp,i}(t)$ (Eq. (14)).
It contains the room response without the direct sound; the first reflection arrives at 7.5 ms, as can be seen in Figure 4. The incident power on patch $p$, given in equation (12), can be represented by its corresponding incident sound pressure in the time domain, equation (14), applied to patch $p$ of wall $i$. Here, $h'_{sp,i}(t)$ is the impulse response of the source room at patch $p$ of wall element $i$ with the direct sound now included, which arrives at 4.4 ms as shown in Figure 4. The synthesis of source room impulse responses at the surfaces of the patches is necessary to include the temporal effects of the source room and the effects of absorption at the room boundaries, and to simulate the virtual room where an equivalent real room is not present. The method used to compute the room impulse response (RIR) is described in the previous section.
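The patch-wise irradiation described in this section can be sketched as follows. Since the displayed equations are not reproduced here, the fragment relies on the classical room acoustics relations as assumptions: a diffuse field irradiates a surface with intensity $W/A$, and the direct power on a patch is evaluated with the centre-point approximation $W Q \cos\theta\, S_p / (4\pi r^2)$. The function names, room dimensions and source data are hypothetical.

```python
import math

def mean_free_time(V, S, c=343.0):
    """Average time between two reflections, 4V / (cS), in seconds."""
    return 4.0 * V / (c * S)

def patch_incident_power(W, Q, src, centre, normal, S_p, A):
    """Direct and reverberant power on one patch (centre-point approximation).

    Assumptions: `normal` is a unit vector, the direct field is a free field,
    and the reverberant field is ideally diffuse (intensity W / A on the wall).
    """
    d = [centre[k] - src[k] for k in range(3)]
    r = math.sqrt(sum(x * x for x in d))
    cos_theta = abs(sum(d[k] * normal[k] for k in range(3))) / r
    W_dir = W * Q * cos_theta * S_p / (4.0 * math.pi * r * r)
    W_rev = W * S_p / A
    return W_dir, W_rev

# Hypothetical case: omnidirectional 1 W source, 0.5 m x 0.5 m patch 2 m away.
W_dir, W_rev = patch_incident_power(
    W=1.0, Q=1.0, src=(0.0, 0.0, 0.0), centre=(2.0, 0.0, 0.0),
    normal=(1.0, 0.0, 0.0), S_p=0.25, A=10.0)

# Hypothetical 5 m x 4 m x 3 m room: V = 60 m^3, S = 94 m^2.
t_bar = mean_free_time(60.0, 94.0)
```

For a directional source, `Q` would be replaced by the directivity value in the direction of the patch centre, which is how the source orientation enters the transmitted energy of each path.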

Angle-dependent sound transmission
The sound transmission is computed by using the transmission coefficients for each path from the source to the receiver room. Generally, the transmission coefficients are estimated on the basis of diffuse sound field assumptions [9] and are taken from $D_{nT}$ or $R_{ij}$ data as given in equation (1). To introduce a more detailed insulation prediction, including angle dependence and the theory of finite panels, we extend the approach and segment the individual building elements into patches of finite size. We then compute transmission coefficients based on the angle of incidence of a plane wave on these patches. Furthermore, we elaborate the details of building acoustics parameters for which spatially averaged values are normally used, such as the vibration velocities on the surface of elements, the radiation efficiencies and the bending wave transmission across junctions.

Above the critical frequency
In the example of a monolithic infinite plate, the incidence-angle-dependent transmission coefficient, denoted by $\tau(\theta)$, is given in equation (15) [17,20], where $Z_0 = \rho_0 c$ is the characteristic impedance of the surrounding medium (air). The bending wave impedance of the wall, $Z(\theta)$, is given in equation (16) as defined by Cremer [20]. In equation (16), $r = \omega/\omega_c = f/f_c$ with $f_c$ ($\omega_c$) as the critical frequency (critical angular frequency), and $\eta_{tot}$ is the total loss factor of the element, derived from $\eta_{tot} = \eta_{int} + \eta_{edge} + 2\eta_{rad}$. The first term, $\eta_{int}$, is the internal loss factor and is normally taken as 0.01 for common homogeneous building materials according to ISO [9]. The second term, $\eta_{edge}$, is the edge damping loss factor, which can be derived from ISO 15712-1:2005 [21], and $\eta_{rad}$ is the single-sided radiation loss factor. By inserting the single-sided radiation efficiency $\sigma(\theta)$ for infinite plates, defined as $\sigma(\theta) = 1/\cos\theta$ [20], and equation (16) into equation (15), the angle-dependent transmission coefficient $\tau(\theta)$ for an infinite plate can be rewritten in the form of equation (17), which reproduces Cremer's equation (9.3) in [20]. This definition of the angle-dependent transmission coefficient can be used to calculate the sound transmission of an infinite wall with mass, stiffness and damping for frequencies above the critical frequency. It is also notable that, above the critical frequency, the infinite plate formulae yield the same sound reduction index as for finite plates [22]. We calculate the transmission coefficients both for finite segments with rigid boundary conditions (such as windows, doors and portals) and for patches in a large wall with continuous boundary conditions (patches are the small segments of large wall elements, i.e. of the partition, as shown in Fig. 3). The radiation efficiencies must therefore be calculated from the angle of incidence of the plane wave for both resonant and forced transmission (i.e. above and below the critical frequency), together with frequency-dependent total loss factors.
Above the critical frequency, the maximum of the sound transmission $\tau(\theta)$ occurs at the coincidence angle $\theta_c$, where $\sin^2\theta_c = f_c/f$. Therefore, for values of $\theta$ close or equal to $\theta_c$, equation (17) can be approximated by equation (18) using $\theta_c$ [23]. Equation (18) can be used to calculate the angle-dependent sound transmission of the direct sound field for indoor and outdoor sound sources in the case of infinite structures. For finite plates, Davy [23] proposed a theory to compute the radiation efficiency, which is used in this paper. For the diffuse sound field, further approximations are adopted by substituting $x = \cos^2\theta$ and $y = x + 1/r - 1$ into equation (18) and then integrating equation (18) over the angle of incidence, which gives the diffuse field value $\tau_d$ in equation (19). Equation (19) can be approximated by using an integral from Gradshteyn and Ryzhik [24], and the diffuse transmission coefficients are calculated from equation (20), which is used to calculate the sound transmission for the diffuse field component (especially in the case of adjacent rooms). Cremer's [20] radiation efficiency $\sigma(\theta) = 1/\sqrt{1 - f_c/f} = 1/\cos\theta_c$ becomes infinite at the critical frequency. We therefore use Davy's [23] theory to calculate the radiation efficiency of a finite pane with rigid boundary conditions, $\sigma(\theta_c)$, for frequencies above the critical frequency, as given in equation (21). The detailed procedure for calculating the radiation efficiencies is given in [23], where $g = \cos\theta_c$ for $f \geq f_c$ and 0 for $f < f_c$. In equation (21), h, q and a are defined in [23] (Eq. (36), Eq. (37) and Eq. (38), respectively), together with other constants used in the calculation of these parameters. In equation (18), the slope of the sound insulation curve above the coincidence frequency is overestimated. This implies that, with the model as presented for frequencies above the critical frequency, the transmission loss at oblique incidence is too high at high frequencies (above the coincidence dip). An approximate solution to this problem is proposed by Rindel [17], where forced and resonant radiation efficiencies are calculated and then combined to obtain the final oblique transmission coefficients. This option is also implemented in the model.
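The coincidence condition $\sin^2\theta_c = f_c/f$ and the infinite-plate radiation efficiency $1/\sqrt{1 - f_c/f}$ quoted above can be evaluated directly. The following Python sketch only illustrates these two relations and why the latter diverges at the critical frequency; it does not implement the finite-panel correction of [23], and the function names and the example frequencies are hypothetical.

```python
import math

def coincidence_angle(f, fc):
    """Coincidence angle theta_c in radians, from sin^2(theta_c) = fc / f."""
    if f < fc:
        raise ValueError("coincidence only exists for f >= fc")
    return math.asin(math.sqrt(fc / f))

def sigma_infinite_plate(f, fc):
    """Infinite-plate radiation efficiency 1 / sqrt(1 - fc / f) = 1 / cos(theta_c)."""
    if f <= fc:
        raise ValueError("expression diverges at f = fc and is undefined below")
    return 1.0 / math.sqrt(1.0 - fc / f)

# Hypothetical plate with fc = 1 kHz, evaluated one octave above.
theta_c = coincidence_angle(2000.0, 1000.0)
sigma = sigma_infinite_plate(2000.0, 1000.0)
```

One octave above the critical frequency the coincidence angle is 45 degrees; as $f$ approaches $f_c$ the angle tends to grazing incidence and the infinite-plate expression grows without bound, which motivates the finite-size treatment.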

Below critical frequency
Below the critical frequency, forced transmission is dominant, and the angle-dependent sound transmission coefficient for finite elements is calculated using the radiation efficiency for forced transmission. Initially, the bending stiffness is ignored by setting $r = 0$ in equation (16), so that $Z(\theta) = j m 2\pi f$; equation (15) can then be rewritten in the form of equation (22). Later on, the bending stiffness is included by adding the resonant transmission coefficients from equation (18) or equation (20) to equation (22), as proposed in [24]. The diffuse field transmission coefficient below the critical frequency is calculated using the average diffuse field single-sided radiation efficiency. Inserting equation (22) into the diffuse-field angular integral gives equation (23). Here, $\langle\sigma\rangle = \int_0^{\pi/2} \sigma(\theta) \sin\theta \, d\theta$, which is calculated in [23] and given in equation (24). We can now calculate the angle-dependent transmission coefficient, which is a function of frequency and of the angle-dependent radiation efficiency, for a finite panel with rigid boundary conditions (such as windows and doors). Above the critical frequency, the transmission coefficient for the direct sound field is calculated using equation (18), whereas for the diffuse field it is calculated using equation (20) (with $\sigma(\theta_c)$ from equation (21)). Below the critical frequency, the sound transmission coefficient for the direct sound field is calculated as the sum of equation (18) and equation (22) (with the radiation efficiency from equation (21), with $g = \cos\theta$), and for the diffuse sound field as the sum of equation (20) and equation (23) (with the radiation efficiency from equation (24)).
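To illustrate the forced (mass-controlled) transmission with $Z(\theta) = j m 2\pi f$, the following Python sketch evaluates the classical oblique-incidence mass law for an infinite plate, $\tau(\theta) = 1 / (1 + (2\pi f m \cos\theta / (2 \rho_0 c))^2)$. This is an assumption made for illustration: it is the textbook infinite-plate expression and not the finite-panel form with the radiation efficiency of [23] that the model uses. The function names, the air properties and the plate mass are hypothetical.

```python
import math

RHO0 = 1.2   # density of air in kg/m^3 (assumed)
C0 = 343.0   # speed of sound in m/s (assumed)

def tau_mass_law(f, m, theta):
    """Oblique-incidence mass-law transmission coefficient of an infinite plate.

    f: frequency in Hz, m: mass per unit area in kg/m^2, theta: angle in radians.
    """
    x = 2.0 * math.pi * f * m * math.cos(theta) / (2.0 * RHO0 * C0)
    return 1.0 / (1.0 + x * x)

def R_mass_law(f, m, theta):
    """Sound reduction index in dB for the same case."""
    return -10.0 * math.log10(tau_mass_law(f, m, theta))

# Hypothetical 10 kg/m^2 plate at 500 Hz.
R_normal = R_mass_law(500.0, 10.0, 0.0)
R_oblique = R_mass_law(500.0, 10.0, math.radians(60.0))
R_double_mass = R_mass_law(500.0, 20.0, 0.0)
```

The sketch shows the two expected trends: the reduction index falls towards grazing incidence, and doubling the mass per unit area raises it by about 6 dB.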
Once the transmission coefficients for the individual patches are computed, the transmission coefficients can be calculated for each path from the source room to each patch of the receiver room (flanking transmission), as defined in ISO 12354 [9] and given in equation (25). The transmission coefficient for path $ij$ from the source to the receiver room, in terms of the transmission coefficients of each patch of the $i$th element of the source room and each patch of the $j$th element of the receiver room, is given by equation (26). Here, $\tau_{p,i}$ and $\tau_{p,j}$ are the transmission coefficients and $S_{p,i}$ and $S_{p,j}$ are the surface areas of a single patch $p$ on the $i$th and $j$th elements of the source and receiver rooms, respectively, whereas $S_i$ and $S_j$ are the surface areas of the elements. The surface area of the partition between the source and receiver rooms is denoted by $S_D$, and the vibration transmission over the junction between elements $i$ and $j$ is represented by $d_{v,ij}$. With the extension towards angle-dependent transmission coefficients, the irradiation of wall patches and the radiation from patches can be separated into a direct field and a diffuse field in more detail than in previous sound insulation auralization models. For outdoor sources exciting building façades at arbitrary angles, the angle-dependent component in the transmission calculation is crucial in any case.
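The area weighting of patch coefficients can be illustrated with the standard composite-element rule, in which the transmission coefficient of an element is the area-weighted mean of the coefficients of its patches. The Python sketch below covers only this weighting step and leaves out the junction term $d_{v,ij}$ and the flanking path combination; the function names and the wall data are hypothetical.

```python
import math

def composite_tau(patches):
    """Area-weighted transmission coefficient of an element made of patches.

    patches: iterable of (area in m^2, transmission coefficient) pairs.
    """
    patches = list(patches)
    total_area = sum(S for S, _ in patches)
    return sum(S * tau for S, tau in patches) / total_area

def composite_R(patches_R):
    """Same rule, with patch data given as (area, R in dB) pairs."""
    tau = composite_tau((S, 10.0 ** (-0.1 * R)) for S, R in patches_R)
    return -10.0 * math.log10(tau)

# Hypothetical 10 m^2 wall: 8 m^2 of masonry (R = 50 dB) and a 2 m^2 window (R = 30 dB).
R_wall = composite_R([(8.0, 50.0), (2.0, 30.0)])
```

In this example the small window dominates the result, which is why composite elements such as façades are poorly represented by a single transmission coefficient.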

Sound field in receiving room
The sound power transmitted from the $i$th element of the source room to the $j$th element of the receiver room, for direct as well as flanking paths, is defined by equation (26), which is the final sound power of any radiating element $j$ in the receiver room. Using equation (26) in equation (27), we get the expression for the radiated sound power. In equation (27), the sound power $W_{s,i}$ is obtained by summing the incident sound power of all single patches from equation (12); using this in equation (28), the expression for the sound power in the receiver room can be written down. Now, each radiating element $j$ of the receiver room is represented by a set of evenly distributed point sources (i.e. secondary sources) on the patches on its surface. At this point, we can distribute the transmitted acoustic power $W_{R,ij}$, radiated by element $j$, homogeneously among these secondary sources (known as patches in the source room) by a factor $1/P_j$, where $P_j$ is the total number of secondary sources on element $j$. The sound energy $W_{Rp,ij}$ radiated by a single secondary source of wall element $j$, with $Q_{Rp,j}$ as its directivity, is then calculated from equation (30). The mean squared sound pressure of a secondary source for path $ij$ in the receiving room can therefore be computed by equation (31). Using $W_{Rp,ij}$ from equation (30) in equation (31), we obtain the sound pressure, where $r_{p,j}$ represents the distance from the acoustic centre of the radiating secondary source $p$ of wall element $j$ to the evaluation point (the position of the receiver). Finally, the time domain representation of the binaural signal at the receiver point is obtained by introducing the room impulse responses of the receiver room and the HRIR from each secondary source to the receiver. All $h(t)$ are statistically valid for all points inside both the source and the receiving rooms, which is why $h(t)$ can be synthesized before implementing the auralization filter chain. Although the assumption can be made that $h(t)$ does not vary considerably for different positions of source and receiver, $h_{Rp,j}(t)$ and $h_{sp,i}(t)$ may be computed independently of each other to avoid coherent interference in the reverberant field coming from different radiating elements.
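The effect of distributing the radiated power over a grid of secondary sources can be sketched as follows. The fragment assumes an energetic (uncorrelated) sum of free-field contributions $\rho_0 c\, W_p Q / (4\pi r_p^2)$ with equal power $W/P_j$ per secondary source and a hemispherical directivity $Q = 2$; these choices, the wall geometry and all names are assumptions for illustration, and the reverberant part and the HRIR filtering are left out.

```python
import math

RHO0_C = 1.2 * 343.0  # characteristic impedance of air (assumed values)

def grid_centres(width, height, nx, ny):
    """Centres of nx * ny equal patches on a wall lying in the plane x = 0."""
    return [(0.0, (ix + 0.5) * width / nx, (iy + 0.5) * height / ny)
            for ix in range(nx) for iy in range(ny)]

def direct_p2(W_total, sources, listener, Q=2.0):
    """Energetic sum of direct-field mean squared pressures of all sources.

    The total power is split equally, W_total / P, over the P secondary sources.
    """
    W_each = W_total / len(sources)
    total = 0.0
    for s in sources:
        r2 = sum((listener[k] - s[k]) ** 2 for k in range(3))
        total += RHO0_C * W_each * Q / (4.0 * math.pi * r2)
    return total

# Hypothetical 4 m x 3 m wall radiating 1 mW, listener close to a wall corner.
single = grid_centres(4.0, 3.0, 1, 1)   # one source at the wall centre
grid = grid_centres(4.0, 3.0, 8, 6)     # 48 secondary sources
near = (0.5, 0.5, 0.5)
far = (100.0, 2.0, 1.5)
p2_single_near = direct_p2(1e-3, single, near)
p2_grid_near = direct_p2(1e-3, grid, near)
ratio_far = direct_p2(1e-3, grid, far) / direct_p2(1e-3, single, far)
```

Far from the wall both representations give practically the same result, whereas close to the wall the single centre source underestimates the direct energy in this example, which is the situation the distributed model is meant to handle.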

Sound insulation for outdoor sources (façade sound insulation)
The general outdoor sound insulation prediction model is based on the techniques described in previous work [26] and in the section above. The procedure for filter design, however, should cover the sound transmission loss of exterior walls, roof constructions and windows. Again, the individual building elements are segmented into patches of finite size, known as secondary sound sources (SS), because the exterior walls of common buildings consist of an assembly of two or more parts or surfaces (e.g. windows). ISO 12354-3 [10] provides basic guidelines for airborne sound insulation against outdoor sound. In the standard, the source is positioned at an incidence angle of 45 degrees, assuming plane wave incidence on the façade. Here, we take into consideration the direct part of the sound field hitting the surfaces of the exposed building elements (i.e. façades) at their specific angles of incidence. The direct sound transmission path (Dd) through each small segment of the elements (i.e. each secondary source) is considered, because it is assumed that the transmission through each secondary source is independent of the transmission through the others [18]. Hence, we consider the angle-dependent radiation efficiency $\sigma(\theta)$ to obtain angle-dependent transmission coefficients.
Sound insulation filters for façades are designed based on the model presented above, where we first consider the sound source directivities. The elements may be homogeneous (e.g. a single homogeneous wall element) or consist of an assembly of two or more parts or surfaces (e.g. doors, windows). Assuming an outdoor source with directivity $Q_s$, the mean squared sound pressure at any point on the external surfaces of the building elements (façade) at a distance $r$ from the source is given, in energetic notation, by equation (34): The sound power at any point on the façade can be calculated with a simple modification of the stationary sound field in an ordinary room, namely by taking into account the direct sound field. Therefore, under free field conditions, the direct incident sound power on a secondary sound source $p$ with surface area $S_{s,p}$, denoted by $W_{s,p}$, is given by equation (11), where $r_{s,p}$ is the distance from the source to the infinitesimal element $dS_{s,p}$ on the façade secondary source and $\theta_{s,p}$ is the angle of incidence of the wave.
This is very similar to equation ( Thus, the incident power on each secondary source of an element is calculated as, The sound power transmitted by one secondary source from the source room to the receiver room is now calculated using equation (35) and the transmission coefficients from equation (18) (with shear wave correction [25]) for resonant transmission, whereas for forced transmission equation (21) is added to equation (18): Finally, the contribution of the Dd path for a single secondary source to the mean squared pressure in the receiving room is derived using the expression given by equation (39), with $Q_{r,p}$ as the directivity of the secondary source and $A_R$ as the equivalent absorption area of the room: Inserting equation (37) in equation (38), the sound pressure for a single secondary source is given by equation (39), where $S_i$ is the area of the walls and $r_{r,p}$ is the distance of the receiver from the secondary source: The final impulse response from source to receiver for one secondary source as radiating element is given by equation (40):
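The free-field incident power on one façade secondary source can be sketched as follows. The sketch replaces the surface integral over the patch by a single midpoint evaluation (intensity at the patch centre times area times the cosine of the angle of incidence), which is an assumption made here for brevity and only reasonable when the patch is small compared with the source distance. All names, coordinates and numbers are hypothetical.

```python
import math


def incident_power_on_patch(source_power, directivity, source_pos,
                            patch_centre, inward_normal, patch_area):
    """Midpoint approximation of the free-field power incident on one facade patch.

    inward_normal is a unit vector normal to the patch, pointing into the building.
    """
    ray = [c - s for c, s in zip(patch_centre, source_pos)]
    distance = math.sqrt(sum(d * d for d in ray))
    # Cosine of the angle between the incoming ray and the patch normal.
    cos_theta = sum(d * n for d, n in zip(ray, inward_normal)) / distance
    cos_theta = max(cos_theta, 0.0)  # a patch facing away from the source receives nothing
    intensity = source_power * directivity / (4.0 * math.pi * distance ** 2)
    return intensity * patch_area * cos_theta


# Hypothetical example: facade in the plane y = 0, building interior at y > 0,
# one 2.5 m^2 window centred at the origin, 1 W omnidirectional source.
normal_incidence = incident_power_on_patch(
    1.0, 1.0, (0.0, -10.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), 2.5)
oblique_incidence = incident_power_on_patch(
    1.0, 1.0, (-10.0, -10.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), 2.5)
```

The oblique case receives less power both because the source is farther away and because of the cosine factor, which is the position dependence the extended model keeps and the diffuse-field approach averages out.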

Interactive real-time auralization
Auralization makes the sound audible to the listener in the receiving room by using appropriate equipment and reproduction techniques. In our work, we emphasize that the auralization can be performed in real time for a stationary or moving person who interacts with others or performs an interactive task in the virtual scene. This way, it can be employed in Virtual Reality applications in which the user can move freely, as depicted in Figure 5.
Having calculated the impulse response filter $h(t)$ from the equations in the previous section, any input time signal $s(t)$ can be convolved with it to obtain the output signal [4]. Generally, the building acoustics frequency range is defined by one-third octave bands from 50 Hz to 5000 Hz, whereas the audible frequency range of human hearing typically extends from 20 Hz to 20 kHz. In signal processing terms, these quantities have to be turned into frequency spectra with a practical number of frequency lines for auralization [6]. For sound insulation filters, the input normally consists of 21 values in the building acoustics frequency range (i.e. in one-third octave bands from 50 Hz to 5000 Hz). From these, a frequency spectrum with 4097 spectral lines (frequency bins) can be obtained using suitable interpolation techniques, such as cubic spline interpolation. HRTFs and binaural filters in equations (33) and (40) are included to take spatial auditory effects into account. In addition to the spatial impression, the presentation of building acoustical signals differs from other techniques in one important point, namely the relevance of loudness [6]. Colouration, spaciousness and/or the lateral fraction are important in room acoustical auralization and do not vary much with a change of level. In building acoustics, however, level and colouration are simultaneously the most important quantities. Therefore, when reproducing both the correct absolute level and the relative level between source and receiving rooms, care has to be taken not to waste valuable signal-to-noise ratio in the signal chain. If the absolute level of the sound signals is to be reproduced, the replay chain has to be calibrated, see also [7,11]. Examples of real-time auralizations are given in the supplementary material on YouTube 4 and the ITA Website 5 .
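The step from 21 one-third octave band values to 4097 spectral lines can be sketched as below. To stay free of external dependencies, the sketch uses linear interpolation of dB values on a logarithmic frequency axis instead of the cubic spline interpolation mentioned in the text. Holding the edge values constant below 50 Hz and above 5000 Hz, the 44.1 kHz sampling rate and the function name are assumptions made for illustration.

```python
import bisect
import math

# Nominal one-third octave band centre frequencies from 50 Hz to 5000 Hz (21 bands).
THIRD_OCTAVE_CENTRES = [50, 63, 80, 100, 125, 160, 200, 250, 315, 400, 500,
                        630, 800, 1000, 1250, 1600, 2000, 2500, 3150, 4000, 5000]


def band_gains_to_spectrum(gains_db, sample_rate=44100, num_bins=4097):
    """Interpolate 21 band gains (dB) to a linear magnitude spectrum with num_bins lines."""
    if len(gains_db) != len(THIRD_OCTAVE_CENTRES):
        raise ValueError("expected one gain per one-third octave band")
    log_centres = [math.log10(f) for f in THIRD_OCTAVE_CENTRES]
    bin_width = (sample_rate / 2.0) / (num_bins - 1)
    magnitudes = []
    for k in range(num_bins):
        freq = k * bin_width
        if freq <= THIRD_OCTAVE_CENTRES[0]:
            gain = gains_db[0]    # assumed: hold the lowest band value below 50 Hz
        elif freq >= THIRD_OCTAVE_CENTRES[-1]:
            gain = gains_db[-1]   # assumed: hold the highest band value above 5000 Hz
        else:
            x = math.log10(freq)
            i = bisect.bisect_right(log_centres, x) - 1
            t = (x - log_centres[i]) / (log_centres[i + 1] - log_centres[i])
            gain = gains_db[i] + t * (gains_db[i + 1] - gains_db[i])
        magnitudes.append(10.0 ** (gain / 20.0))
    return magnitudes


# Hypothetical filter: attenuation rising from 30 dB at 50 Hz to 50 dB at 5000 Hz.
example_spectrum = band_gains_to_spectrum([-30.0 - band for band in range(21)])
```

A spectrum with 4097 lines corresponds to the non-negative frequencies of an 8192-point FFT, so the result can be paired with a suitable phase and transformed into an impulse response for convolution.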

Results and discussion
The extended model is validated for a workplace scenario in which two adjacent rectangular rooms are selected as source and receiving rooms. To compare the results for façade sound insulation, a corner room is selected as the receiving room, with composite exterior walls (i.e. an assembly of different materials) that contain glass windows.

Performance of the algorithm
The real-time sound insulation filtering processes are evaluated in terms of the computational cost of each step involved. The latencies are calculated for the main algorithms involved in the offline (initialization or pre-processing) and real-time computations. All computations were performed on a desktop personal computer featuring a multi-core Intel Core i7-7700 CPU @ 3.60 GHz with 16 GB RAM and a Windows 7 (64-bit) operating system. Figure 6 shows the latencies calculated for three main processes: 1) source room updates (i.e. impulse response, transmission coefficients, directivity and position of the source), 2) receiver room updates (i.e. impulse response, HRTF and position of the receiver) and 3) building acoustic filter updates (simultaneous source and receiver room updates, and transmission coefficients). For the adjacent room simulations, the direct partition and the flanking walls are segmented into four patches (2 × 2). For the outdoor case, each of the four windows is taken as an independent patch element. The building acoustic filter update process includes both the source and receiver updates simultaneously. The pre-processing includes virtual geometry handling, sound insulation metric calculations and room impulse response synthesis, which are needed to initialize the auralization properties; the dimensions are not changed during the auralization. The second process is performed in real time and starts with the computation of the energies hitting the surfaces of each element i of the source room, which may vary with changes in the position and orientation of the source. The next step is updating the source room impulse responses for each element i and the receiver room impulse responses for each secondary source, and handling multiple secondary sources with HRTFs due to changes in receiver orientation. As the latencies are below the typical threshold of 50 ms [4], free movement of the source (e.g. road, rail or air vehicles) and of the receiver (free movement of the human listener in the virtual receiving room) is shown to be possible without violating real-time performance in interactive Virtual Reality applications.

Comparison of standardized level difference $D_{nT}$ (adjacent rooms)
The purpose of this comparison is primarily to validate the extended method against the standard prediction models. The indoor scene has source and receiving rooms with dimensions 4 × 4.8 × 3 m³, as shown in Figure 4. The main partition between the offices is a 3 × 5 m concrete wall with a thickness of 120 mm, a density of 2200 kg/m³ and an internal loss factor of 0.005 [9]. To reproduce the results with the extended approach, the $D_{nT}$ values are obtained from the simulated sound pressure values for five different source and receiver positions, including the normalization to the reverberation time of the receiving room. This can be interpreted as a "virtual measurement" following standard settings [27]. Figure 7 shows the computed $D_{nT}$ values in one-third octave bands, averaged over five random source positions and five random receiver positions. In the same figure, the $D_{nT}$ results predicted following ISO 12354-1 (i.e. based on the transmission coefficient $\tau$) are compared with those of the extended approach. The differences between the $D_{nT}$ values of the ISO and extended approaches are also shown. Figure 7 shows that the extended model results are in good agreement with those computed from ISO (i.e. diffuse field approximations) in the case of adjacent rooms. The maximum difference between the $D_{nT}$ values of both approaches is below 1.9 dB. Furthermore, Figures 8 and 9 compare the actual sound insulation for non-standard settings in a real building situation.
This can be any condition outside the prerequisites for the definition of sound insulation, such as the 1.5 m distance between source and receiver positions and room boundaries, or source directivities. For this, we take two cases as non-standard settings (i.e. source/receiver configurations), in which the source is modelled as a HiFi stereo sound system with typical loudspeaker directivity, pointing to the centre of the source room. In the first configuration, the system is placed 0.3 m away from the flanking walls; in the second configuration, it is placed 0.3 m away from one of the partition walls, pointing towards the centre of the room. The receiver is placed at three random positions in the receiving room.
The resulting $D_{nT}$ curves for both configurations of the system (Figures 8 and 9) are compared with the standard $D_{nT}$ values determined for the same sound power but with omnidirectional radiation from a standard source position more than 1.5 m from the source room boundaries, which we referred to as "virtual measurements" in Figure 7 (red curve: $D_{nT}$ (standard average)). Differences of up to 5 dB are observed in the $D_{nT}$ values of the three receiver positions, caused by proximity to the room boundaries and by the source directivities.
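The "virtual measurement" of the standardized level difference in one frequency band can be sketched as follows, using the standard definition $D_{nT} = L_1 - L_2 + 10\lg(T/T_0)$ with the reference reverberation time $T_0 = 0.5$ s and energetic averaging of the levels over positions. The function names and the example levels are hypothetical; the simulated levels themselves would come from the model.

```python
import math


def energetic_mean_level(levels_db):
    """Average sound pressure levels over positions on an energy basis."""
    mean_energy = sum(10.0 ** (level / 10.0) for level in levels_db) / len(levels_db)
    return 10.0 * math.log10(mean_energy)


def standardized_level_difference(source_levels_db, receiver_levels_db,
                                  reverberation_time, reference_time=0.5):
    """D_nT for one band: L1 - L2 + 10 lg(T / T0), with T0 = 0.5 s by default."""
    l1 = energetic_mean_level(source_levels_db)
    l2 = energetic_mean_level(receiver_levels_db)
    return l1 - l2 + 10.0 * math.log10(reverberation_time / reference_time)


# Hypothetical band levels at five source-room and five receiving-room positions.
d_nt = standardized_level_difference(
    [90.2, 89.7, 90.5, 89.9, 90.1], [48.3, 47.9, 49.0, 48.6, 48.1], 0.6)
```

Repeating this per one-third octave band gives a curve that can be set against the ISO prediction, and evaluating it for a single receiver position instead of the average shows the position dependence discussed above.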

Comparison of standardized level difference $D_{nT}$ (outdoor sources)
Figure 10 compares the $D_{nT}$ values of the extended approach and the diffuse field approximation (ISO) for the façade sound insulation of an office room with dimensions of 6.5 × 4 × 3 m³. The selected external wall (i.e. façade) of this office is an assembly of different materials and consists of glass windows connected through concrete pillars, as shown in Figure 5.
The height and width of each glass window are 2.5 m and 1 m, respectively. The glass has a thickness of 8 mm, a density of 2500 kg/m³ and an internal loss factor of 0.004 (0.003 to 0.006 [22]). Each window is a secondary sound source for the receiving room, and the sound insulation of each secondary source is computed independently as a finite segment. In this way, the façade acts as an assembly of multiple secondary sound sources which radiate sound energy into the receiver room. It is assumed that the sound transmission of each secondary source is independent of that of the others, with no interaction between them in terms of bending wave transmission. The flanking walls are concrete walls with the same settings as in section 5.2 above. Figure 10 compares the $D_{nT}$ values, and their differences $\Delta D_{nT}$ in dB, computed from the diffuse field approximation (ISO 12354-3) and from the extended approach with the standard setting as a "virtual measurement". To obtain the $D_{nT}$ values as a "virtual measurement" at exactly 45° angle of incidence on the façade, the condition of plane wave incidence is fulfilled by placing the source at a very large distance (500 m) from the façade, and the outdoor sound level ($L_1$) is obtained at the façade. The indoor sound level ($L_2$) from the extended approach is calculated by averaging the sound pressure level from all secondary sources (i.e. windows) over five random positions in the receiving room. The reference outdoor $D_{nT,45}$, shown as the blue curve in Figure 10, is calculated based on [10]. The differences between the extended approach and the ISO standard $D_{nT}$ values are also shown in Figure 10; they are on the order of 0.6 dB on average, with a maximum of 9 dB around 3150 Hz.
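Why a source distance of 500 m approximates plane wave incidence can be checked with a short geometric sketch: it computes the angle of incidence at the centre of each window for a point source placed at a nominal 45° and reports the spread across the façade. The window centre offsets along the façade are hypothetical, since the exact window layout is not given in the text, and the geometry is restricted to the horizontal plane.

```python
import math


def incidence_angles_deg(distance, nominal_angle_deg, window_offsets):
    """Angle of incidence (from the facade normal) at each window centre.

    The source lies in the horizontal plane at the given distance from the facade
    reference point; window_offsets are positions along the facade in metres.
    """
    theta = math.radians(nominal_angle_deg)
    along = distance * math.sin(theta)   # source coordinate along the facade
    normal = distance * math.cos(theta)  # source coordinate perpendicular to the facade
    return [math.degrees(math.atan2(abs(along - x), normal)) for x in window_offsets]


def angle_spread_deg(distance, nominal_angle_deg, window_offsets):
    """Difference between the largest and smallest angle of incidence."""
    angles = incidence_angles_deg(distance, nominal_angle_deg, window_offsets)
    return max(angles) - min(angles)


# Hypothetical centres of four 1 m wide windows along the facade.
OFFSETS = [-2.25, -0.75, 0.75, 2.25]
far_spread = angle_spread_deg(500.0, 45.0, OFFSETS)
near_spread = angle_spread_deg(10.0, 45.0, OFFSETS)
```

For the far source, the angles at all windows stay within a fraction of a degree of 45°, whereas for a source at 10 m they differ by many degrees, which is the situation the angle-dependent transmission coefficients are meant to capture.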
Once the $D_{nT}$ values from the model have been validated and found to be in good agreement with the standard (ISO) results, we create three example cases with non-standard settings of an outdoor sound source to evaluate the effects of source position and orientation in front of the façade. This evaluation shows how large the resulting changes in the effective $D_{nT}$ values in the receiving room are for moving outdoor directional sources.
In the first example case, the sound source is taken as an omnidirectional source (without considering its directivity), while in the second and third example cases it is taken as a directional source. Figure 11 compares the $D_{nT}$ values for an omnidirectional source placed at four different positions. Figures 12 and 13 compare the results for the second and third cases: in Figure 12, a directional sound source is placed at four different positions facing in the perpendicular direction (at 90°) from the façade, while in Figure 13 the source is placed at the same positions but facing towards the façade. As an example, a typical loudspeaker directivity is selected for the directional sound source, as shown in Figure 2.

Sound fields (receiver room) from outdoor excitation
In the previous section, the $D_{nT}$ values were compared for different source configurations and non-standard settings. As far as the sound transmission as such is concerned, the source directivity does not play a large role. The main difference is due to the angle-dependent transmission coefficients for different positions in all three cases. However, from the perspective of interactive real-time auralization, the source power and orientation are required as reference rather than the incident intensity on the façade. Finally, the spatial distribution of the sound pressure level in the receiving room for excitation from outdoor sources is visualized. Three random positions of the sound source are selected. As expected, the position of the outdoor source influences the angle-dependent incident sound power on the building elements and, in consequence, the amount of energy transmitted through the direct path, as shown in Figure 14.

Conclusion
In this paper, a model for real-time auralization of sound insulation is introduced that takes into account the acoustics of the source room as well as the receiving room in more detail. The results for the spatial variation of sound pressure inside the receiver room are presented using the knowledge of sound propagation theory in closed spaces for indoor and outdoor cases. Room impulse responses are synthesized from one-third octave band values of the reverberation times and the mean free path in order to incorporate the reverberation effects in accordance with the absorption and geometries of the rooms. The aim has been to develop a model that produces a more realistic loudness, colouration and binaural impression of the sound transmission at the receiving end by accounting for the sound source directivities and the source and receiver positions, in real time and also for interactive Virtual Reality scenes. However, this has not yet been evaluated in subjective experiments. In addition, considering building elements as secondary sources might help to include a more realistic directional cue of sound sources. Although the model presented here is rather detailed in terms of structural-acoustics input data, it does not claim to predict all kinds of existing building elements. In any case, measured sound transmission coefficients from test facilities may serve as input, so that existing (real) building situations can be simulated and compared with measurements. As the model and its open source software are open to any kind of input data, improvements and extensions for more construction types can be implemented easily. The only missing feature in existing standard prediction models (ISO) is the angle dependence, for which a specific solution for monolithic elements is used.
The real-time performance of the algorithm in terms of latencies makes it possible to render the sound insulation modelled by secondary sources. The results for sound transmission into the receiving room based on diffuse sound field assumptions are compared with those obtained from this model. Under conditions which match the measurement standards of sound insulation testing, the results of the auralization were validated to differ by no more than 0.6 dB and 0.3 dB on average for outdoor and indoor source positions, respectively. The results of the extended model show that the source directivity and position influence the energy transmitted to the receiving room; in turn, the spatial variation of the sound pressure level is more specifically related to the actual scenario and more valid when it comes to auralization. This is more obvious in the case of outdoor sound propagating to the receiver room through façades, where the secondary sources that are more exposed to the incident sound field transmit more energy to the receiver room. Furthermore, this work presents a complete building acoustics auralization software framework, integrated with virtual reality using audio-visual technology, for full open access. The final goal is to provide a VR system for research, consulting and psychoacoustic assessment of the