Open Access
Issue
Acta Acust.
Volume 4, Number 2, 2020
Article Number 5
Number of page(s) 14
Section Room Acoustics
DOI https://doi.org/10.1051/aacus/2020006
Published online 12 May 2020

© T. Pham Vu and H. Lissek, Published by EDP Sciences, 2020

Licence Creative Commons
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

In room acoustics, sound field reconstruction generally consists of retrieving the entire enclosed sound field by performing a limited number of measurements. While the interpolations of the room impulse responses (RIRs) are commonly used for the purpose of auralization and sound reproduction, at low frequencies, a precise knowledge of its frequency domain equivalent – the room frequency responses (RFRs) – can provide useful information on the spatial distribution of sound pressure caused by the resonances of the room (room modes) [1]. In the low-frequency range, room modes highly affect the sound field in the room, yielding irregularities in both the spatial and frequency domains which give rise to coloration as well as masking effects and eventually alter the listening experience. An accurate depiction of the spatial sound field in a room can provide important information for applying ad hoc treatments for room mode correction [2]. It has been shown that, at low frequencies, a knowledge on the modal properties and sound pressure distribution in the room helps improving the design of different passive corrective measures [35]. This becomes even more crucial in case of active strategies for room modes correction [68] where control settings could be adjusted based on the knowledge of the resulting sound pressure distribution. This highlights the need of a practical method to accurately reconstruct the sound field in the room at low frequencies.

Each RFR reveals the acoustic transfer from a given source to a given receiver in the room, in the frequency domain. Such RFRs embed the main properties of room modes, namely the resonance frequencies and modal decay times as well as the mode shapes of the room. To retrieve these information for a fixed source position, multiple measurements should be performed at different locations in the room and a reconstruction framework is required to recover the entire spatial information of the aforementioned quantities. The most intriguing question is how to faithfully reconstruct the spatial sound field in a room using the least number of measurements possible.

A regular space and time sampling of the RFRs generally results in an impractically dense microphones grid. It has been shown that, under the frame of the Plenacoustic function in free field [9], the inherent sparsity of the space–time representation of the governing function allows a more effective sampling approach of the sound field. Several studies have also addressed the different sparse properties of enclosed sound fields. In a room with closed boundaries, the sound field is fully dependent on the physics of the room, including its geometries and acoustic properties. Furthermore, at low frequencies, the wave equation is governed by a discrete number of eigenmodes which gives rise to additional sparse approximation. In [10], the spatial RFRs in a rectangular room can be interpolated on a line based on the fact that these transfer functions share the same common poles, with the only difference being their amplitudes (also known as residues) [11, 12]. Mignot et al. [13] retrieved the low frequency RIRs in a rectangular room using a finite number of measurement points, by exploiting a low rank approximation using matching pursuit. In [14], a more conventional Compressed Sensing technique using a sensing matrix has been used in combination with plane waves expansion techniques to tackle the block-sparse properties of the acoustic field in a rectangular room.

In this paper, we focus on these inherent sparse properties in room acoustics at low frequencies using approximation techniques such as matching pursuit and global curve-fitting to obtain the low-frequency information of a non-rectangular room under an extensive point of view, where the spatial distribution of sound pressure in a large volume inside the room can be reconstructed and analyzed using a practically small number of microphones. In practice, not every room can be considered as a rectangular room, especially in the case of a conventional listening room or private cinema. Non-rectangular rooms certainly possess a more complex distribution of eigenmodes frequency-wise, and the mode shapes are also harder to predict. This practical challenge is the main motivation to investigate here a model of a non-rectangular reverberation chamber. A first numerical study of this facility is then followed by the experimental validation inside the actual reverberation chamber.

The analysis of the reconstruction results emphasizes on the frequency and spatial aspects of the responses in the room. As can be seen in [1517], recent techniques for room modes equalization require an accurate knowledge of the sound field. For instance, an active electroacoustic absorber system [17], aiming at equalizing and flattening the frequency response of a room at low frequencies, requires an accurate model of the room to optimize the active acoustic impedance. In former studies, the actual efficiency of these low-frequency absorbers has been validated with a limited number of measurements inside the room, especially addressing the performance in terms of modal decay times reduction. With the possible help of the reconstruction framework proposed in this paper, the performance of the absorbers can be assessed space-wise. In addition, the framework can also provide precious information on how to adapt the acoustic impedance to be assigned at the diaphragm of the active electroacoustic absorbers. This motivates investigating how to minimize the number of measurement points for such a reconstruction framework as not only that it allows the reconstruction of the sound field within a specific bandwidth with limited equipment but also saves processing time that will eventually allow potential real time and online active-control strategies.

The outline of the paper is as follows. Section 2 first introduces a sparse representation of room acoustics at low frequencies. The reconstruction method, which is composed of two steps, is then introduced in Section 3. The first part of the method consists of the modal identification of the room in which two different approaches, respectively in the time and frequency domains, are suggested. The second part aims at recovering mode shape functions through plane wave approximation techniques. Following the descriptions of the reconstruction mechanism, Section 4 is dedicated to the validation of the method using both numerical models and experimental measurements in the actual reverberation chamber at EPFL (non-rectangular room) to emphasize the robustness of the algorithm. Several discussions are raised concerning the accuracy of the sound field spatial recovery as well as the requirements for a faithful reconstruction. Concluding remarks are finally presented in Section 5.

2 Sparsity in room acoustics

The main motivation of this study is to propose a simple, yet practical, experimental framework allowing a thorough characterization of the room behavior in the low frequency domain. Regardless of the method used to reduce the amount of measurement points, such a framework should rely on a sparse representation of the wave equation in a room at low frequencies. These could be exact sparsity that inherently emerges from the physics of the room or approximate sparsity which requires an approximation framework to reduce the degrees of freedom in the wave equation. In this section, several sparse aspects of room acoustics at low frequencies can be investigated using the modal decomposition form of the wave equation and the mode shape approximation theorem. The objective is to obtain a governing equation of the spatial distribution of sound pressure in a room where the number of variables is well defined and quantifiable. This could serve as the target for the reconstruction framework that follows in Section 3.

2.1 Modal decomposition

At low frequencies, where wavelengths are of the same order of magnitudes as the room dimensions, room walls are mostly reflective which give rise to standing waves phenomenon. This creates the so called room modes that occur at discrete resonance frequencies where most of the acoustic energy is concentrated [18]. There exists a formulation of room modes at low frequencies that presents an inherent sparsity, corresponding to a limited number of discrete resonance frequencies bounded by the Schroeder cutoff frequency [1]. In this sparse representation, the solutions of the wave equation can be decomposed as a discrete sum of damped harmonic eigenmodes:(1)where are the space-dependent mode shape functions (eigenfunctions of the Helmholtz equation) for each mode n of the room, is the position in the room, is the harmonic time-dependent decaying function and is the corresponding complex expansion coefficient of mode n. Each eigenmode of a room is uniquely represented by a complex wavenumber (eigenvalues of the Helmholtz equation), where is the sound celerity in the air, is the modal angular frequency and is the corresponding damping factor [18]. The harmonic decaying function can be fully expressed as:(2)

It is worth noticing that while is a variable in equation (1) as the location of the point/microphone of interest, the location of the source and its properties are not explicitly written here. This information is however accounted for in the complex coefficients , and will be made implicit in the following derivations. This is motivated by the fact that, in the case investigated here, only a single fixed source will be considered, and hence the location of the source is not a variable.

2.2 Mode shape approximation

The previous derivation introduced a structured sparsity originated from the limited discrete modal decomposition of the wave equation at low frequencies. For a room with ideally rigid walls, is a space dependent function that corresponds to the exact solution of the Helmholtz equation [1]:(3)

It has been shown in [19] that these mode shape functions can be further approximated with spherical harmonics and spherical Bessel functions. Accordingly, any mode shape function can be approximated by a finite sum of plane waves sharing the same wavenumber , pointing in various directions. Each individual mode shape can then be formulated using the R-th order approximation:(4)within which are the 3D wavevectors sharing the same wavenumber . Note that, in opposition to the exact sparsity in the previous section, this is an approximate sparsity. This decomposition not only provides an approximation for each of the mode shape, but also allows a closed-form interpretation of the mode shape function regardless of the type of the modes in the room. Assuming now that we restrict this representation below a given upper frequency limit, a finite number R of wavevectors would be enough to closely approximate every mode shape function within this frequency range. Using equations (2) and (4), equation (1) could be expanded as:(5)where with . Hence, through a series of derivations, the expression in equation (1) can be interpreted as the discrete sum of space-time damped harmonics with the expansion coefficients . This expansion form directly links the acoustic response of the receiver to its location.

3 Reconstruction framework

The role of the reconstruction framework is to identify and estimate the values of the unknown parameters of equation (5) from a limited set of measurements. The proposed algorithm addresses the general case of a non-rectangular room, the modal behavior of which is less predictable than in a shoe-box room. Figure 1 shows the geometry of the studied room with two simulated room modes. Inside the room, a number of M microphones are randomly placed at different locations to acquire the RIR measurements. Depending on the frequency range of interest, these measurements could be filtered as well as downsampled to reduce computational cost. Calling the length of the time vector of each microphone measurement, the () matrix S of signals is defined as the input of the framework. The output of the reconstruction framework, in short, should be all the unknowns present in equation (5), excluding the predefined parameters, namely, the number of modes N and the list of wavevectors for each mode shape approximation. The outputs, hence, include the angular frequency and the exponential damping factor for each eigenmode, as well as the N × R expansion coefficients . Once all these values are determined, it is possible to interpolate the responses at any position in the room by simply plugging it into equation (5).

thumbnail Figure 1

Examples of room modes in a non-rectangular room.

The detailed framework can be divided into two steps. The first one is called modal identification, aiming at estimating the modal wavenumbers for the N room modes. Once identified, the second step intends to approximate the expansion coefficients for a set of predefined wavevectors through projection.

3.1 Modal identification

Two alternative approaches are introduced here, processing the input signals either in the time or in the frequency domain. The first approach is the simultaneous orthogonal matching pursuit (SOMP) method [20] for damped sinusoids [21]. This method is based on a greedy algorithm approach to recursively estimate each modal parameter of the room from the matrix of input time signals. The second method is based on the rational fraction polynomials (RFP) global curve fitting method [22] which, contrarily to the iterative SOMP, simultaneously estimates the modal parameters of the room from a set of input RFRs of the room.

3.1.1 Time domain approach

This method has been successfully used in [13] to locally interpolate the RIRs in a rectangular room at low frequencies. From a pre-defined set of damped sinusoids, this method finds the ones that are highly correlated with the matrix of input signals using a low-rank approximation approach. To begin, two sets of ω and δ with and , are formed. The range of variation of the sets are roughly estimated based on available knowledge on the room. Combining every pair of entries of the two sets together will produce an overly redundant set of complex components in which with Q as the total number of possible combinations. Each entry of this set is then used to form a time vector of length of time-decaying damped sinusoid . Using the normalized vectors as column vectors will produce an array .

The algorithm performs an iterative matching procedure. Every loop indexed i starts with an residue matrix which is the result of the previous loop. At the first loop, is set to be equal to the predefined signal matrix S. Through the searching procedure, a damped sinusoid with the highest correlation to the residue matrix (representing a pair of and ) is chosen. The new residue matrix for the following loop can then be formed by extracting the contribution of this chosen sinusoid from . The algorithm at a generic i-th iteration is detailed below:

  • Process the (Q × M) correlation matrix . Each row q of is composed of the M correlation values between the q-th normalized damped sinusoid and each of the M measurements.

  • By summing the energy of this set of values, process the evaluation correlation value between the qth damped sinusoid and the entire set of measurements: .

  • Out of the Q available , choose the maximum one, which points to the pole with the highest correlation to the measurements.

  • As a result, the identified index (namely, ) yields the chosen modal wavenumber of this loop: .

  • After a modal wavenumber is found, following the orthogonalization and projection of SOMP in [20], the residue matrix for the next loop can be interpreted as in which is the projection onto the chosen damped sinusoidal.

  • Repeat with until .

At the end of the procedure, a group of complex wavenumbers corresponding to the eigenmodes of the room is determined.

3.1.2 Frequency domain approach

As room modes are mostly visible in the RFRs, it seems pragmatical to investigate a frequency-domain approach for room modes identification. One particular example is the global curve-fitting method in the frequency domain using the RFP form [22]. This has been used in [23] to estimate the modal parameters by curve fitting the RFR measurements. Curve-fitting methods are usually processed locally, initiating on a single function at a time. The method in [22], however, performs curve-fitting procedures on multiple frequency response functions at different locations simultaneously to identify the model of the system. The method assumes the linearity of the RFRs and that they can be formulated as a ratio of two polynomials. These RFRs share the same denominator whose poles contain information on the modal angular frequencies () and damping () of the room. The method then performs a concurrently curve-fitting on the set of measured RFRs (see Appendix) to acquire the modal parameters of the room within a given bandwidth.

3.2 Projection onto spherical sampled wavevectors

Up to now, only the eigenfrequency parameters of the room modes, namely, and given in equation (5) have been identified. The remaining parameters to be determined are the expansion coefficients , for which the following algorithm is used:

  • The first step is the separation of the current known and unknown parameters. Note that the time-varying terms in equation (1) have been identified in the former algorithm, and can be discarded from now on. Using a matrical form accordingly to the measurement matrix S gives: with G as the matrix where each of its row is a modal damped sinusoidal . Furthermore, is the () space-dependent matrix of modes with the inclusion of the expansion coefficients that appear in equation (1):

(6)
  • with ’s the M position vectors for the location of the input measurements of S.

  • If (which usually is the case), the system of equation (6) is over-determined with unknown and equations. Hence, it is possible to estimate the () matrix of by computing the least-squares estimation:

(7)

  • Based on the expression in equation (5), can be further expanded using plane waves expansion.

    • First, the list of component wavevectors needs to be defined. For each mode shape function, a set of R wavevectors is created whose norm and directions match a uniform sampling over a sphere with radius . Spherical sampling (proposed in [24]) is chosen in this case because the room is non-rectangular and hence there is no preferred basis for the formation of mode shape functions.

    • Each column of the matrix can be treated individually as they are associated with different modes. Calling the () matrix of the plane wave harmonics for mode n in which , each column vector can be individually characterized as:

(8)
  • with the () vector consisting of the R expansion coefficients of mode n. First, assuming that , taking as the basis, can be projected onto this basis to derive the coefficient vector using least-square projection:

(9)
  • As mentioned above, this derivation is only available when the number of sampled plane waves is lower than the number of microphones. As can be seen in [19, 25, 26], the convergence of the plane wave approximation is highly dependent on the number of plane waves available, especially in 3D. Hence, in the case where the number of measurement points is fairly low, restricting could affect the reconstruction of mode shape functions. One possibility would be to allow and derive the coefficient vector using a least norm optimization:

(10)
  • Further studies need to be done to verify the limitations of this solution as well as the optimal choice for R. In our case, for a low number of microphones, several trials have shown that choosing can estimate the mode shape better and increase the overall correlation. Regardless of the method used, in practice, the applicability of this step can always be cross-checked using a number of evaluation microphones.

  • Repeating the technique on each mode will return the set of expansion coefficients required for the reconstruction.

4 Reconstruction results

In this section, the results of the sound field reconstruction framework are analyzed using both numerical and experimental data. In the numerical simulations, a FEM model of a non-rectangular room is built for initial analysis. This first numerical study allows the assessment of multiple aspects of the reconstruction framework. It provides access to a very fine distribution of microphone placements, and the input data, such as wall impedance, can be changed straightforwardly. Furthermore, the FEM simulation not only provides the input but also can be used as the ground-truth reference for cross-checking the reconstruction results. In the second step, measurements are performed in the actual reverberation chamber at Ecole Polytechnique Fédérale de Lausanne (EPFL), with the same geometry as the one considered in the simulations, to confirm the validity and robustness of the framework.

4.1 Numerical simulation

The FEM model consists of a non-rectangular room with maximum height of 4.6 m, maximum width of 9.8 m and maximum length of 6.6 m that replicates the actual reverberation chamber at EPFL. The damping of the walls are initially considered very low, with a uniform absorption coefficient of , approaching that of the actual reverberation chamber. The source is chosen to be a monopole point source and is put in proximity to one corner of the room in order to excite all the modes of the room. The measurement points are spread randomly in the room (refer to Fig. 2). This placement of microphones, although possibly not the most ideal placement strategy for a given geometry, is guaranteed to capture enough information about the sound field and its modal properties assuming there is no readily available knowledge about the room.

thumbnail Figure 2

Geometry of the FEM model. The black dots represent the measurement points that are spread randomly in the room.

4.2 Modal identification

In this section, the modal identification is performed using two different methods, namely SOMP and RFP, and their results are compared with each other. However, instead of directly comparing the retrieved modal properties and , the focus has been put on two other useful properties in modal analysis: the eigenfrequency () and the modal decay time () which is defined as [27, 28]:(11)

These two properties reflect the modal properties of the room and are directly linked to and . Using the same number of microphones, the modal decay times estimated from SOMP and RFP methods for the first 12 modes of the room are compared with those computed from the baseline FEM analysis considered as the ground-truth (see Fig. 3). After a few initial tests, it is observed that both methods performed equally well in identifying the frequency (in Hz) for each of the modes of the room. As the values of obtained using the two methods do not present much differences, the comparison is illustrated here only in terms of the modal decay times to compare their performance with respect to modal damping estimation. Using the numerical results from the FEM analysis as the reference, it can be seen in Figure 3 that both the RFP and SOMP methods are capable of identifying the room’s eigenmodes, except that SOMP, on average, may underestimate the damping for the mode at 40.5 Hz, which will be discussed at a later stage.

thumbnail Figure 3

Modal decay times for the eigenfrequencies of the non-rectangular reverberant room as estimated by the SOMP and RFP methods, in comparison with the FEM analysis (reference).

Generally, it can be observed that the RFP method performs slightly better than SOMP. However, there are significant differences between the two methods regarding robustness. Although both methods require a manual input regarding the total number of modes in a limited bandwidth, they process this information differently. For the global curve fitting using RFP, if the total number of modes within a frequency range is not accurately known, it requires a considerable amount of trials and errors to eventually come up with a coherent curve-fitting result. Furthermore, as can be seen in the later stage, without a meticulous consistency check, the interpolation results from RFP can end up with a higher amount of errors. This vulnerability, for most cases, does not exist for SOMP. This is due to the fact that the modal parameters are found in RFP simultaneously whereas in SOMP they are found iteratively using a residual manner: the room modes that have the highest contributions to the collected signals are estimated first, followed by the ones with less. This gives SOMP an advantage for the reconstruction procedure as the results do not deviate much from reality even when underestimating or overestimating the number of modes within the frequency range of interest. The number that users enter can only alter how many times the algorithm is repeated but should not affect the result of each individual loop.

In this particular case, the underestimated damping by SOMP that sometimes occurs at 40.5 Hz also comes from the fact that this algorithm processes residues at each computing step. The modes that are found at the later iterations of the algorithm are prone to higher errors and also, its correlation with the measurements is likely to be less than the ones that come before in the algorithm. When there are two modes that are very close together such as the particular cases at 40.5 Hz and 40.9 Hz respectively, depending on the set of input measurements, one of them may be found at the very far end of the algorithm compared to the other. Since one mode has been found earlier in the process, and its contribution to the residual has been extracted before, the error that occurs at the others would have minimal effects to the overall reconstruction result in the next stage. The same situation also occurs when users overestimate the number of modes. Then, around the final loops, the algorithm will certainly find some frequencies that do not correspond to any mode. As long as the overestimation is not too far from reality, this error in SOMP would have negligible effects on the reconstruction results in the next stage because the contributions of the few mismatched modes are generally significantly small compared to the correct ones.

Although being less robust, the RFP curve fitting method does have a clear advantage over the SOMP method regarding computational cost. SOMP not only performs an iterative mode finding process that requires a regular refreshing of the residual matrix but also does so using multiple costly matrix operations. The RFP method developed in [29] on the other hand, does not perform an iterative process and has taken into account several computational simplifications. For instance, on a conventional work station with 32GB RAM and four cores CPU of 3.4GHZ, in order to perform the results in Figure 3 using 25 microphones, the SOMP method usually would usually take 4–7 min to finish while the RFP would finish in 5–10 s. This significant difference will further increase if the number of input measurements or the number of modes increases.

Overall, it can be seen that SOMP is a robust method that works best in cases where not much information about the room is available or where a blind estimation is required. RFP, on the other hand, requires more a priori information about the modes in the room to produce a coherent result. However, RFP generally takes much less processing time than SOMP and hence can potentially be beneficial in certain application such as online estimation or real-time sound field control. In terms of accuracy, it should be noted that under sufficient conditions, both methods are capable of producing a good estimation of the modal information of the room.

4.3 Local interpolation

From the outcomes of the algorithm, it is now possible to process and interpolate the responses at any point inside the geometry. The RFRs correspond to the transmissions between the source volume flow rate (in m3/s) and the sound pressure (in Pa) acquired at the measurement points. One example can be seen in Figure 4 for an arbitrary point far from the walls but also not too close to the center of the room. The interpolation was processed using both the SOMP and RFP method with the same set of 25 microphone positions in the room. It can be seen that both methods can produce an accurate interpolation of the response at this particular point.

thumbnail Figure 4

Reconstruction of the RFR at a point inside the room using SOMP and RFP in comparison with the FEM ground truth reference.

To illustrate the difference between RFP and SOMP, one example of interpolation is plotted in Figure 5 where the total number of modes were underestimated. As an iterative process, SOMP would still give a good estimation of all the modes except the ones it does not find whereas RFP would add some modes that are not from the real system and hence will lead to higher errors. However, testing a few different trials for RFP can solve this problem and hence this method should not be overlooked as its computation time is short and can therefore be advantageous in many situations.

thumbnail Figure 5

Comparison between the RFRs interpolated with RFP (top) and with SOMP (bottom), when underestimating the number of modes, at a given virtual microphone position. The black curve represents the reference given by FEM simulation.

It must be noted, however, that the high level of accuracy seen in Figure 4 from both method is not yet guaranteed for every interpolated point in the room, and the error might be higher depending on the position of the point as well as on the precision of the modal identification results. This, once again, highlights the need for a spatial representation of the sound field to confirm the global validity of the algorithm.

4.4 Sound field reconstruction

In this section, the interpolation process is extended to a large number of points inside the room to acquire a series of processed time responses of the room. The RFRs of the room can then be produced through the Fourier transform of these time responses. These resulting RFRs will allow the reconstruction of the spatial response of the room at any given frequency of interest.

It is known that, for a room with non-ideally rigid boundaries (), the Helmholtz equation is less valid close to the room walls [1]. Hence, the initial sound field reconstruction is performed for a shoe-box volume inside the room with each face being at least 1m away from the walls of the room. It is then possible to compare these results with frequency domain simulations, obtained using an FEM software, considered as the ground truth. Figure 6 shows three examples of the sound field reconstruction using 25 microphones, compared to such reference, at three different frequencies at very high spatial resolution. It can be observed that the reconstruction of the sound field yields qualitatively highly accurate results. The existence of the mode shapes is also clearly observed in all three examples. This proves that the spherical sampling technique for wave vectors is a powerful tool for rooms with complex geometries. Furthermore, this high level of accuracy is maintained in every direction of the 3D depiction since the input measurement points are spread randomly in the room. A few initial trials using a regular grid of microphones have not achieved such global precision in the results. This, once again, emphasizes the advantage of the much-recommended randomness that is used in common sparse and low-rank approximation frameworks. It is worth noticing that although there can be small differences when comparing the local sound pressure point by point, the general shapes as well as the separation between areas of high and low sound pressure are nevertheless precisely depicted. Furthermore the reconstruction of sound pressure field is accurate not just at the eigenfrequencies (45.25 Hz and 55.08 Hz) but also for frequencies in between two consecutive modes (e.g., at 38 Hz).

thumbnail Figure 6

Sound field reconstruction (bottom) at different frequencies for a rectangular area inside the room in comparison with the referencing sound fields from numerical simulation (top).

The normalized Pearson correlation coefficient for the amplitude of the frequency responses, calculated as below:(12)can be used to evaluate the overall accuracy of the reconstructed frequency response with respect to the reference response . Processing this coefficient to the regular grid of 11 × 11 × 11 points that samples the inner rectangular volume, yields an average correlation of 99.3% with a standard deviation of 0.8%. It is worth noting that is a good indication of the overall fitting of the reconstructed signals in a bandwidth, but does not provide accurate clues for interpreting the precision frequency-wise. A global error evaluation will be introduced further in the section to address this subject.

So far, the analysis has shown good results for the sound field reconstruction of a lightly damped room with . In order to further assess its robustness in more conventional situations with acoustic treatments, the algorithm is tested with various room absorption condition. To verify this, a uniform absorption coefficient α is considered for the room walls, and set first at 0.1 and then increased to 0.3 to better represent a case of a damped room. Using the same number of microphones, the reconstruction of the sound field for these cases is performed as in the preceding case. Figure 7 shows the comparison of the reconstruction at the same room mode (but slightly different eigenfrequencies due to the resulting change of modal damping) between different values of wall absorption. The reconstruction results for these cases still present a good agreement with the reference ones. Not only that the framework captures correctly the reduction in terms of energy in the room but it also succeeds in rendering the smoothing effect of the spatial distribution as the room becomes more damped. Figure 8 shows a global comparison of the dimensionless normalized errors defined as:(13)which quantifies the error of the reconstruction result at each frequency, normalized by to discard the dependence on the acoustic energy difference in the room between different room absorption conditions. This quantity is then relevant for comparing the performance of the reconstruction as a function of the room damping because it accounts for the acoustic energy not absorbed by the room, which, for the same reference source, decreases as the room gets more damped. The non-normalized error is computed for each recovery point in a 11 × 11 × 11 points grid that spatially samples the aforementioned shoe-box test volume. Averaging the error within the spatial grid will then give the average error at each each frequency . Figure 8 shows a decrease in terms of accuracy as the damping of the walls increases. This is explainable as the orthogonality assumption of the mode shape functions in equation (3) becomes weaker with higher damping in the room. Furthermore, the modal identification on the RFRs is generally more challenging in a room with high damping than in a lightly damped room. Regarding the correlation, the average is still high at 98.3% with 1.8% of standard deviation () and 98.1% in average with 2.1% of standard deviation (). Figure 8 also shows the error on the reconstruction sound field for a plane very close to a wall (maximum distance from the wall is 0.1 m). It can be observed that the sound field reconstruction near the wall induces higher errors, as can be anticipated. This is due to the aforementioned non-orthogonality of the mode shape function as well as the higher errors induced by extrapolation instead of interpolation as the concerned point is mostly outside of the microphones domain. The results from this evaluation are especially meaningful for modal equalization. It shows that this particular reconstruction framework can be used to effectively assess the sound field within a room before and after a given equalization method has been applied, which paves the way for a new tool for assessing the in situ performance of low-frequency room modes treatments [7] space-wise.

thumbnail Figure 7

Sound field reconstruction (bottom) compared to the reference (top) for different cases of wall damping (α = 0.01, 0.10, and 0.30 from left to right) at the eigenfrequency around 35.3 Hz.

thumbnail Figure 8

Comparison of the normalized error between different cases of room absorption for the reconstruction of the rectangular volume and a plane near a wall of the room.

Figure 8 also shows that the reconstruction error generally increases for higher frequencies. As the algorithm does not particularly favor lower-order modes over the higher ones, this indicates that something over the parameters estimation step affects the accuracy level. The first possible answer appears to be the complexity of the mode shape function. For a room with complex geometry, the complexity of the mode shape functions will also increase for higher-order room modes which will generally require a higher number of plane waves to converge. Even when using a least norm method to increase the number of plane waves, the compromise between regularization and instability of the mode shape approximation [30, 31] means that the accuracy still relies heavily on the number of measurements available. Furthermore, as the frequency gets higher, the modal density will increase which means that the average distance (in Hz) between two consecutive modes will be smaller and induce more difficulties for the modal estimation framework.

So far, the number of measurement points (microphones) has not been mentioned. Figure 9 compares the Pearson Correlation criteria (space-wise average and standard deviation) processed for different number of evaluation microphones, and for different absorption coefficients of the walls. For each case, the algorithm is repeated multiple times with the same number of measurement points but each time the locations of the input measurements are chosen randomly from a set of 600 random points. This procedure is chosen so as to eliminate the bias that could emanate from the placement of the microphones, especially in the cases where the number of microphones is considerably low. For each case, the Pearson Correlation is calculated for the 11 × 11 × 11 grid that samples the shoebox-shaped reconstruction region. As can be seen, the correlation value gets higher as the number of microphones increases. The standard deviation value mentioned in this figure specifies the standard deviation of the correlation value between different interpolating points in the rectangular reconstruction region. A high standard deviation value will then indicate a highly uneven reconstruction accuracy in which the correlation of the reconstruction varies significantly depending on the location of the interpolation. Conversely, a low standard deviation indicates that the spatial reconstruction result is stable and can be trusted. Figure 9 shows that the correlation values improve as the number of measurement points increases. Furthermore, the standard deviation also decreases significantly when more measurement points are used for the framework. This shows that while the reconstruction gets more accurate, the estimation accuracy becomes also uniformly higher across all interpolation points in the volume. One of the reason is the more measurement points available, the better the chance to estimate correctly the room modes information. It can also be observed that for the analysis within a fixed bandwidth, the performance typically becomes stable and reliable when a certain number of measurement points is reached. In the case of a lightly damped room, for instance, a grid of size 1331 within a volume of 40  can be reconstructed with a high accuracy of 98.5% using just 30 input measurement points which is an effective result for a practical number of microphones. Furthermore, even with only 20 microphones, the result is still considered stable with a trusted average correlation around 95%.

thumbnail Figure 9

Analysis of the Pearson Correlation of the reconstructed sound field with respect to the number of measurement points.

4.5 Experimental results

The reconstruction framework is now applied to actual measurements inside the reverberation chamber at EPFL, which has the same geometry as the FEM model (Fig. 10). The source, a custom-made subwoofer in a closed wooden cabinet, is located at a corner in the room to excite all room modes at low frequencies. The microphones (PCB 378B02 1/2″ microphones) are spread randomly in the room to replicate the previous numerical analysis. The reference velocity of the source is measured with a laser velocimeter (Polytec OFV 500) placed in front of the loudspeaker diaphragm.

thumbnail Figure 10

Measurement set up in a real reverberation chamber in the laboratory.

Two main methods can be used to evaluate the reconstruction results. One method is to directly compare the reconstructed sound field to the simulated one in FEM. This method can certainly verify the faithfulness of the spatial reproduction results but is not recommended for a point-by-point comparison as it is difficult to accurately match the FEM model with the real one, since it relies on the absorbing properties of the room which are not accurately known. Moreover, the reference used for processing the RFRs can be different between the simulation (volume flow) and the actual case (velocity) and it is difficult to accurately match the source excitation as well as its position. Thus, besides this method, a small part of the measurement points can be reserved to serve as an evaluation set. Combining these two evaluation methods provides a more concrete analysis of the reconstruction framework with the experimental data.

In this experiment, the signals from 25 microphone positions are used as the inputs of the algorithm to reproduce the sound field up to 75 Hz (within which about 20 modes can be observed). The microphones are located randomly in the room but were chosen so that they are practically evenly distributed space-wise to cover the area of the reconstructed rectangular volume. Figure 11 shows the spatial comparison between the reconstructed sound field and the reference one obtained from numerical simulation regarding the same eigenmodes. As the numerical model cannot be perfectly matched with the real room, there is a small difference in terms of the exact frequency of the eigenmodes. Comparing these results, it can be seen that similarly to the numerical results in the previous section, the reconstructed sound fields from real measurements yield highly accurate spatial recovery. The mode shapes are visible and the locations of nodal lines are correctly depicted with high spatial resolution. Once again, small mismatch in a few points is to be expected but the overall spatial representation remains to be faithful.

thumbnail Figure 11

Sound field reconstruction from real measurements (bottom) at two eigenfrequencies (left: near 35 Hz, right: near 51 Hz) as compared to the same eigenmodes from simulations (top).

Using the evaluation set of 30 other microphone signals within the domain of interest, the results also agree with the previous simulation validation. Using SOMP for modal estimation, the average correlation stays at 97.8% with 1.89% of standard deviation. As expected, this result is slightly less accurate than the average correlation obtained with simulations but is still highly reliable. As mentioned earlier, SOMP is particularly robust and can perform well even when a priori information is missing. On the other hand, under the same circumstance, using a semi-supervised RFP curve fitting gives a slightly lower average correlation of 96.9% with a higher standard deviation of 2.3%. This result also agrees with the analysis in Section 4.2 regarding the different nature of the two methods. Three different examples of the reconstructed RFRs by both RFP and SOMP are plotted in Figure 12 and compared to the actual measurements from the evaluation set. Generally, without a detailed supervision and calibration, the RFP method will return a slightly less accurate result than SOMP as can be observed from the figure. However, its processing speed is much faster and hence could allow for re-calibration depending on the situation.

thumbnail Figure 12

RFR reconstruction from 25 measurements for three different evaluation points in the room using RFP and SOMP (correlation ranging between 97% and 99%).

It should be noted that in practice, when the room geometries and wall absorption coefficient are not known, evaluation set like this along with the comparison parameters such as the correlation values are among the few available indications to know whether the reconstruction results are reliable. Therefore, practically, it is advised to always have a reserved evaluation set inside the domain of interest to navigate the adequate number of microphones required for any certain objective.

Regarding the experiment set-up, as the number of microphones is practically small, a blind random placement of microphones might leave out crucial areas of the room. Hence out of all the possible randomization, it is advised to choose an appropriate placement that does not leave out crucial areas of the region of interest. Moreover, placement technique like the one suggested in [30] might also be used to improve the recovery results. Lastly, it should be noticed that the measurement was conducted in a reverberation chamber without removing the reflective diffusing panels (Fig. 10). This shows that the framework is robust enough to perform well even in a practical non-empty room.

5 Conclusion

In this paper, we have investigated a robust sound field reconstruction framework in a room at low frequencies. Through modal decomposition and plane wave approximation of mode shape functions, the framework allows recovering the entire sound pressure distribution of the room at any frequency within the concerned bandwidth, from a limited set of measurements. Within the framework, the performance of two different modal estimation methods in the time and frequency domain, namely SOMP and RFP, are compared. Both methods are shown to allow retrieving the modal parameters of the room. Between the two approaches, SOMP has been proven to be more robust whereas RFP has significant advantages in terms of computational cost.

The space-wise analysis of the reconstruction results confirms the practical applicability of the framework in the field of modal equalization. The reconstruction is performed inside a non-rectangular reverberation chamber using 20–30 microphones, which are proven sufficient to address the bandwidth of interest (containing around 20 modes). The results first show that the reconstruction is highly accurate for a lightly damped room. The framework is further tested by increasing the global absorption of the room walls. For these cases, the reconstruction shows a slight reduction in terms of accuracy, especially for positions close to the walls. This slight drop in accuracy is anticipated as it is generally more challenging to retrieve the modal parameters for highly damped room. Nevertheless, the overall reconstruction results retain a sufficiently high level of reliability. This means that the framework may be used to assess the space-wise performance of existing passive and active modal equalization methods. More importantly, the results of the method can be used as input for on-the-fly reconfiguration of active low frequency absorbers, such as the electroacoustic absorbers developed in [7]. Such in situ reconfigurability of active devices presents interesting potential for optimizing room mode equalization in real rooms, and should be further studied.

This paper tackles the case where a single source is fixed inside of the room. Further work should focus on retrieving the entire sound field for multiple source positions in the room. The microphones placement in this research are spread randomly in the room. However, considering a low number of microphones, the accuracy could benefit from a predefined microphones placement strategy, that should be further studied.

Acknowledgments

This project was supported by the Swiss National Science Foundation (SNSF) under grant agreement 200021_169360.

Conflict of interest

Author declared no conflict of interests.

Appendix

Rational fraction polynomials and global curve-fitting

Assuming that a system is linear and of second dynamical order, its frequency response measurements can be represented as a ratio of two polynomials in the Laplace domain () using the RFP form of:(14)

Furthermore, if the system is resonant, which means that the response of the system is governed by its resonances, the frequency response function can be reformulated using the partial fractional form to highlight the poles of the system:(15)where is the k-th pole of the system and is the corresponding residue.

The curve fitting procedure focuses on minimizing the squared error J between the analytical and measured response computed at each and every frequency bin :(16)in which L is the length of the frequency vector. Now, assuming that multiple different frequency response functions of the system were measured, they will contain the same inherent poles and hence the denominator of each and every measurements should contain the same characteristic polynomial. This is a valid claim since for a resonant system such as the one in room acoustics, the modal frequencies and modal damping are the same regardless of where they are measured within the room. This simplifies the curve fitting procedure and also allows a global curve fitting [22] of the entire set of measurements to recover the parameters in equations (14) or (15).

To further avoid ill-conditioned problems, the frequency response function is reformulated using orthogonal polynomials only in the positive domain of the frequency axis using the Forsythe method [32]. This greatly simplifies the computation of the problem and although the final result is expressed in the orthogonal function expansion form, it is always possible to trace back to the form in equation (15) for information regarding the poles and residues. This method, detailed in [29], is non-iterative and sufficiently fast. Furthermore, by reasonably choosing m and n, it can compensate the effects created by out-of-band modes and hence, reduce the fitting error. As any other curve fitting method in the frequency domain, it suffers from over-fitting as well as from the lack of frequency resolution.

References

  1. H. Kuttruff: Room Acoustics. CRC Press, Boca Raton, FL, 2009. [Google Scholar]
  2. T. Cox, D. Peter: Acoustic Absorbers and Diffusers: Theory, Design and Application, CRC Press, Boca Raton, FL, 2009. [Google Scholar]
  3. Y. Yasuda, A. Ushiyama, S. Sakamoto, H. Tachibana: Experimental and numerical studies on reverberation characteristics in a rectangular room with unevenly distributed absorbers. Acoustical Science and Technology 27 (2006) 366–374. [Google Scholar]
  4. S. Bistafa: Adaptive control of low-frequency acoustic modes in small rooms. The Open Acoustics Journal 5 (2012) 16–22. [CrossRef] [Google Scholar]
  5. P. Dämmig, H. Deicke: Measurement uncertainty in the determination of the sound absorption in a reverberation room at low frequencies. Acta Acustica United With Acustica 33 (1975) 249–256. [Google Scholar]
  6. A. Celestinos, S.B. Nielsen: Controlled acoustic bass system (cabs) a method to achieve uniform sound field distribution at low frequencies in rectangular rooms. Journal of the Audio Engineering Society 56 (2008) 915–931. [Google Scholar]
  7. E. Rivet, S. Karkar, H. Lissek: On the optimisation of multi-degree-of-freedom acoustic impedances of low-frequency electroacoustic absorbers for room modal equalisation. Acta Acustica United With Acustica 103 (2017) 1025–1036. [CrossRef] [Google Scholar]
  8. D. Habault, E. Friot, P. Herzog, C. Pinhede: Active control in an anechoic room: Theory and first simulations. Acta Acustica United With Acustica 103 (2017) 369–378. [CrossRef] [Google Scholar]
  9. T. Ajdler, L. Sbaiz, M. Vetterli: The plenacoustic function and its sampling. IEEE Transactions on Signal Processing 54 (2006) 3790–3804. [CrossRef] [Google Scholar]
  10. Y. Haneda, Y. Kaneda, N. Kitawaki: Common-acoustical-pole and residue model and its application to spatial interpolation and extrapolation of a room transfer function. IEEE Transactions on Speech and Audio Processing 7 (1999) 709–717. [CrossRef] [Google Scholar]
  11. Y. Haneda, S. Makino, Y. Kaneda: Common acoustical pole and zero modeling of room transfer functions. IEEE Transactions on Speech and Audio Processing 2 (1994) 320–328. [CrossRef] [Google Scholar]
  12. J. Mourjopoulos, M. Paraskevas: Pole and zero modeling of room transfer functions. Journal of Sound and Vibration 146 (1991) 281–302. [Google Scholar]
  13. R. Mignot, G. Chardon, L. Daudet: Low frequency interpolation of room impulse responses using compressed sensing. IEEE/ACM Transactions on Audio, Speech, and Language Processing 22 (2014) 205–216. [Google Scholar]
  14. S.A. Verburg, E. Fernandez-Grande: Reconstruction of the sound field in a room using compressive sensing. The Journal of the Acoustical Society of America 143 (2018) 3770–3779. [CrossRef] [PubMed] [Google Scholar]
  15. M. Karjalainen, T. Paatero, J.N. Mourjopoulos, P.D. Hatziantoniou: About Room Response Equalization and Dereverberation. IEEE, New Paltz, NY, 2005, pp. 183–186. [Google Scholar]
  16. M. Karjalainen, A. Makivirta, P. Antsalo, V. Valimaki: Low-frequency modal equalization of loudspeaker-room responses, in AES 111th Convention, 21–24 September, New York, NY. 2001. [Google Scholar]
  17. E. Rivet, S. Karkar, H. Lissek: Broadband low-frequency electroacoustic absorbers through hybrid sensor-/shunt-based impedance control. IEEE Transactions on Control Systems Technology 25 (2017) 63–72. [CrossRef] [Google Scholar]
  18. P.M. Morse, K.U. Ingard: Theoretical Acoustics. Princeton University Press, Princeton, NJ, 1986. [Google Scholar]
  19. A. Moiola, R. Hiptmair, I. Perugia: Plane wave approximation of homogeneous helmholtz solutions. Zeitschrift fur Angewandte Mathematik und Physik 62 (2011) 809–837. [CrossRef] [MathSciNet] [Google Scholar]
  20. J.A. Tropp, A.C. Gilbert, M.J. Strauss: Algorithms for simultaneous sparse approximation. Part i: Greedy pursuit. Signal Processing 86 (2006) 572–588. [Google Scholar]
  21. G. Chardon, L. Daudet: Optimal subsampling of multichannel damped sinusoids, in 2010 IEEE Sensor Array and Multichannel Signal Processing Workshop. 2010. [Google Scholar]
  22. M.H. Richardson, D.L. Formenti: Global curve fitting of frequency response measurements using the rational fraction polynomial method, in International Modal Analysis Conference and Exhibit. 1985. [Google Scholar]
  23. E. Rivet, S. Karkar, H. Lissek, T. Thorsen, V. Adam: Experimental assessment of low-frequency electroacoustic absorbers for room modal equalization in actual listening rooms, in AES 140th Convention, 4–7 June, Paris, France. 2001, p. 9505. [Google Scholar]
  24. P. Leopardi: A partition of the unit sphere into regions of equal area and small diameter. Electronic Transactions on Numerical Analysis 25 (2006) 309–327. [Google Scholar]
  25. A.H. Barnett, T. Betcke: Stability and convergence of the method of fundamental solutions for Helmholtz problems on analytic domains. Journal of Computational Physics 227 (2008) 7003–7026. [Google Scholar]
  26. R. Hiptmair, A. Moiola, I. Perugia: Plane wave discontinuous galerkin methods for the 2D Helmholtz equation: Analysis of the p-version. SIAM Journal on Numerical Analysis 49 (2011) 264–284. [Google Scholar]
  27. W.C. Sabine: Architectural Acoustics, (1900; reprinted by Dover, New York, 1964). [Google Scholar]
  28. L.L. Beranek: Acoustic Measurements. Wiley, New York, NY, 1965. [Google Scholar]
  29. M.H. Richardson, D.L. Formenti: Parameter estimation from frequency response measurements using rational fraction polynomials, in Proceedings of the 1st International Modal Analysis Conference. 1982, pp. 167–181. [Google Scholar]
  30. G. Chardon, A. Cohen, L. Daudet: Sampling and reconstruction of solutions to the Helmholtz equation. Sampling Theory in Signal and Image Processing 13 (2014) 67–90. [Google Scholar]
  31. N. Antonello, E. De Sena, M. Moonen, P.A. Naylor, T. van Waterschoot: Joint acoustic localization and dereverberation through plane wave decomposition and sparse regularization. IEEE/ACM Transactions on Audio, Speech, and Language Processing 27 (2019) 1893–1905. [Google Scholar]
  32. G.E. Forsythe: Generation and use of orthogonal polynomials for data-fitting with a digital computer. Journal of the Society for Industrial and Applied Mathematics 5 (1957) 74–88. [CrossRef] [Google Scholar]

Cite this article as: Pham Vu T & Lissek H. 2020. Low frequency sound field reconstruction in a non-rectangular room using a small number of microphones. Acta Acustica, 4, 5.

All Figures

thumbnail Figure 1

Examples of room modes in a non-rectangular room.

In the text
thumbnail Figure 2

Geometry of the FEM model. The black dots represent the measurement points that are spread randomly in the room.

In the text
thumbnail Figure 3

Modal decay times for the eigenfrequencies of the non-rectangular reverberant room as estimated by the SOMP and RFP methods, in comparison with the FEM analysis (reference).

In the text
thumbnail Figure 4

Reconstruction of the RFR at a point inside the room using SOMP and RFP in comparison with the FEM ground truth reference.

In the text
thumbnail Figure 5

Comparison between the RFRs interpolated with RFP (top) and with SOMP (bottom), when underestimating the number of modes, at a given virtual microphone position. The black curve represents the reference given by FEM simulation.

In the text
thumbnail Figure 6

Sound field reconstruction (bottom) at different frequencies for a rectangular area inside the room in comparison with the referencing sound fields from numerical simulation (top).

In the text
thumbnail Figure 7

Sound field reconstruction (bottom) compared to the reference (top) for different cases of wall damping (α = 0.01, 0.10, and 0.30 from left to right) at the eigenfrequency around 35.3 Hz.

In the text
thumbnail Figure 8

Comparison of the normalized error between different cases of room absorption for the reconstruction of the rectangular volume and a plane near a wall of the room.

In the text
thumbnail Figure 9

Analysis of the Pearson Correlation of the reconstructed sound field with respect to the number of measurement points.

In the text
thumbnail Figure 10

Measurement set up in a real reverberation chamber in the laboratory.

In the text
thumbnail Figure 11

Sound field reconstruction from real measurements (bottom) at two eigenfrequencies (left: near 35 Hz, right: near 51 Hz) as compared to the same eigenmodes from simulations (top).

In the text
thumbnail Figure 12

RFR reconstruction from 25 measurements for three different evaluation points in the room using RFP and SOMP (correlation ranging between 97% and 99%).

In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.