2.2. System Components and Modeling
The in-vehicle audio system comprises several interacting components: a signal generator, an audio analyzer, BNC connectors, a speaker, and a vehicle multimedia system. The signal path starts with the input sound from the signal generator, travels through the BNC connectors to the audio analyzer, and then reaches the speaker. Ultimately, the sound waves produced by the speaker are delivered to users via the in-vehicle multimedia system. These components determine the in-vehicle audio quality and directly influence the auditory experience of drivers and passengers. The final output of the vehicle audio system is the product of the frequency responses of the individual components. Therefore, the frequency response of each component is a critical factor in system performance. The system diagram is shown in Figure 2. Subsequent sections address the modeling of each component in terms of frequency response and examine the interactions between these responses.
A) Input Sound
White noise is frequently preferred in the calibration of sound systems because it has equal power per unit bandwidth across a wide frequency range, such as 20 Hz to 20 kHz. This property allows the system's frequency response to be tested objectively over the entire audio band, enabling a comprehensive analysis of system performance. In the acoustic tuning of sound systems and rooms in particular, this broad-spectrum source is used to assess performance and make the necessary adjustments. Consequently, the responses of sound systems at various frequencies can be objectively measured and optimized.
A similar application is found in in-vehicle sound systems. Thanks to its balanced and comprehensive frequency spectrum, white noise provides an ideal test signal for accurately detecting the frequency response of sound systems across a wide frequency range. A significant advantage is that, during in-vehicle acoustic adjustments using white noise signals, the effects of the applied filters can be examined and adjusted interactively across all frequencies [6].
Acoustic analyzers use frequency weighting curves to simulate the human ear's sensitivity to different frequencies. The A, B, and C weighting scales, displayed in Figure 3, represent the frequency response filters employed in sound measurements. Weighting adjusts the measured decibel (dB) values at specific frequencies to match the sensitivity of the human ear at those frequencies. Because the human ear is less sensitive at low and very high frequencies and most sensitive in the mid range (roughly 2 to 5 kHz), these weighting curves are essential for accurately reflecting perceived sound intensity. By aligning measured sound levels more closely with the natural response of the human ear, they give a better estimate of how loud a sound is actually perceived to be. Therefore, weighting filters are applied to the white noise signal used as the input sound.
A-weighting approximates the sensitivity of the human ear at low sound levels and is the usual choice for ambient noise measurements. It passes frequencies between roughly 500 Hz and 10 kHz with little attenuation while attenuating lower and higher frequencies, making it suitable for everyday environmental sound measurements. B-weighting is designed for medium to high sound levels (70 to 80 phon) and attenuates low frequencies less strongly than A-weighting, making it useful in environments such as cinema and music production. C-weighting is used at high sound levels and provides a flatter response across a wide frequency range, measuring low- and high-frequency sounds at nearly their original levels. This accuracy is especially required in industrial settings and at concerts [11].
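The three weighting curves can be evaluated numerically. The sketch below uses the analog weighting formulas commonly given for sound level meters (pole frequencies 20.6, 107.7, 158.5, 737.9 and 12194 Hz, normalized to about 0 dB at 1 kHz); the function name `weighting_db` is hypothetical, and the source does not state how its own curves in Figure 3 were computed.

```python
import math

def weighting_db(f, curve="B"):
    """Return the A-, B-, or C-weighting gain in dB at frequency f (Hz)."""
    f2 = f * f
    # Pole frequencies (Hz) of the standard analog weighting networks.
    p_low, p_a1, p_a2, p_high, p_b = 20.6, 107.7, 737.9, 12194.0, 158.5
    low = f2 + p_low ** 2
    high = f2 + p_high ** 2
    if curve == "A":
        mid = math.sqrt((f2 + p_a1 ** 2) * (f2 + p_a2 ** 2))
        ratio = (p_high ** 2 * f2 * f2) / (low * mid * high)
        offset = 2.00   # normalizes the curve to about 0 dB at 1 kHz
    elif curve == "B":
        mid = math.sqrt(f2 + p_b ** 2)
        ratio = (p_high ** 2 * f2 * f) / (low * mid * high)
        offset = 0.17
    elif curve == "C":
        ratio = (p_high ** 2 * f2) / (low * high)
        offset = 0.06
    else:
        raise ValueError("curve must be 'A', 'B', or 'C'")
    return 20.0 * math.log10(ratio) + offset

# Example: compare the three curves at a low frequency and at 1 kHz.
for freq in (100.0, 1000.0):
    gains = {c: round(weighting_db(freq, c), 2) for c in "ABC"}
    print(freq, gains)
```

At low frequencies the A curve attenuates most and the C curve least, with B in between, which matches the qualitative description above.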
Vehicle sound systems must be tuned for medium to high sound levels, which is the range B-weighting is designed for. B-weighting was therefore adopted in this study; its frequency response from 20 Hz to 20 kHz is shown in Figure 3. In this context, our study is based on the equal-loudness contour that corresponds to a level of 80 dB at 1000 Hz, as shown in Figure 1. The white noise signal was B-weighted and then used as the input sound. The frequency response of this input sound is depicted in Figure 4.
B) Parametric Equalizer Filters
The parametric equalizer filter is a commonly used tool in audio processing and music production and has become a standard component in automotive multimedia systems. It adjusts the audio signal within targeted frequency ranges and thereby shapes the tonal balance of the sound. The parametric equalizer is set by three primary parameters: center frequency (f0), bandwidth (or Q factor), and gain (G). The gain increases or decreases the amplitude of the signal within the selected frequency band. The bandwidth determines the range of frequencies over which the filter is effective and is usually expressed in octaves. The Q factor is the ratio of the center frequency to the bandwidth (Q = f0/BW) and indicates the sharpness of the filter: a high Q factor means a narrower bandwidth and a sharper filter response. Together, these parameters let users control how narrowly or broadly a specific frequency region is affected, and thus achieve the desired sound characteristics precisely.
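Because the bandwidth is often quoted in octaves while filter design routines expect Q, the conversion between the two is worth making explicit. The following sketch assumes band edges placed geometrically around f0 (so that f1·f2 = f0²); the helper names are hypothetical and not taken from the study.

```python
import math

def q_from_octaves(n_octaves):
    """Q factor for a band that spans n_octaves, centered geometrically on f0."""
    # Edges: f1 = f0 / 2**(n/2), f2 = f0 * 2**(n/2), so Q = f0 / (f2 - f1).
    return 2.0 ** (n_octaves / 2.0) / (2.0 ** n_octaves - 1.0)

def band_edges(f0, q):
    """Lower and upper band edges (Hz) with f2 - f1 = f0 / q and f1 * f2 = f0**2."""
    bw = f0 / q
    f1 = -bw / 2.0 + math.sqrt((bw / 2.0) ** 2 + f0 ** 2)
    return f1, f1 + bw

# Example: a one-octave band centered on 1 kHz.
q = q_from_octaves(1.0)
f1, f2 = band_edges(1000.0, q)
print(round(q, 4), round(f1, 1), round(f2, 1))
```

A one-octave band gives Q ≈ 1.41, and halving the bandwidth in octaves roughly doubles Q, which is the "sharper filter" behavior described above.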
A parametric equalizer can incorporate several filter types, such as peaking (bell), shelving, and notch filters. In vehicle multimedia systems, the peaking filter is particularly favored. Its flexibility in frequency selection, tone control, and focused frequency intervention makes it well suited to adjusting in-vehicle sound systems in accordance with the principle of equal loudness. The peaking filter is designed to either emphasize or attenuate signals within a specific frequency band and plays a critical role in sound processing and equalization [12].
The mathematical model of the peaking filter is represented by a transfer function H(f) that describes its effect on the signal in the frequency domain. This transfer function defines the relationship between the input and output of the signal as a function of frequency and is commonly formulated as Equation (1).
Here, f represents the frequency under study, f0 is the center frequency (the frequency to be emphasized or attenuated), G denotes the gain (in dB), and Q represents the quality factor (which is inversely proportional to the bandwidth of the filter) [5].
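To make the roles of f, f0, G, and Q concrete, the sketch below evaluates one widely used second-order peaking response (the analog prototype from the Audio EQ Cookbook). This is an assumption made for illustration: it is not necessarily the exact form of Equation (1), and the function name `peaking_gain_db` is hypothetical.

```python
import math

def peaking_gain_db(f, f0, gain_db, q):
    """Gain in dB at frequency f of a second-order peaking filter.

    Assumed form: H = (f0^2 - f^2 + j*A*f*f0/Q) / (f0^2 - f^2 + j*f*f0/(A*Q)),
    with A = 10**(gain_db / 40).
    """
    a = 10.0 ** (gain_db / 40.0)
    real = f0 * f0 - f * f
    num = complex(real, a * f * f0 / q)
    den = complex(real, f * f0 / (a * q))
    return 20.0 * math.log10(abs(num / den))

# Example: a +6 dB boost at 1 kHz with Q = 2, sampled at a few frequencies.
for freq in (20.0, 500.0, 1000.0, 2000.0, 20000.0):
    print(freq, round(peaking_gain_db(freq, 1000.0, 6.0, 2.0), 3))
```

With this form the gain equals G exactly at f0 and falls back toward 0 dB away from it; raising Q narrows the affected region.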
According to the principle of equal loudness, different frequency bands must be regulated precisely. Therefore, multiple parametric equalizer filters are generally used. The ISO 226:2003 standard specifies the equal-loudness contours that serve as the target of this correction. To approximate the ideal contour, ten parametric filters are tuned at specified frequencies, with corresponding gain and Q factor values, to provide optimal sound correction. Each of these filters is set according to the frequency, gain, and Q factor values detailed in Table 1.
In this study, ten parametric equalizer filters were used to ensure an ideal sound experience in vehicle multimedia systems. A filter order of 12 was chosen to match the order of the filters in existing in-vehicle multimedia sound systems. The filters were created using the fdesign.parameq function from the DSP System Toolbox of MATLAB R2021b, which designs a parametric equalizer filter from specified parameters.
C) Amplifier
Amplifiers serve as fundamental power-boosting devices in sound systems. In vehicle multimedia systems, amplifiers increase the amplitude of the received audio signal, enabling speakers to produce sound at higher volumes and with higher quality. This improves the signal-to-noise ratio (SNR), enhancing the clarity and detail of the sound while minimizing distortions.
In vehicle entertainment systems, Class-D amplifiers are particularly preferred. These amplifiers are advantageous due to their high energy efficiency and quality sound output. Class-D amplifiers process audio signals in digital format, amplifying them directly without converting them back to analog signals. This process allows them to achieve high sound levels with less energy consumption compared to traditional analog amplifiers. Additionally, these amplifiers provide a balanced and consistent response across a wide frequency range, making them ideal for music and sound effects [13]. In this study, an amplifier suitable for 4-ohm speakers was selected. Figure 5 details the frequency response of the chosen amplifier at 4 ohms.
D) Speaker
Speakers function as the final output component of sound systems; they convert filtered and amplified audio signals into physical sound waves and deliver them to listeners.
The speakers used in vehicle sound systems are specially designed to provide optimal performance across different frequency ranges. They fall into four main types that together cover low, mid, and high frequencies: subwoofer/woofer, mid-range, tweeter, and full-range speakers. Each type is optimized for a specific frequency range; these ranges are detailed in Table 2 [6].
The speaker used in this study is a full-range speaker, which covers a wide frequency range. Figure 6 shows the frequency response of the full-range speaker. The manufacturer states that the effective operating frequency of this speaker is between 85 Hz and 12.5 kHz. This wide frequency range indicates that the speaker can adequately produce both low-frequency bass sounds and high-frequency treble sounds. In this study, the modeling of the speaker was based on this frequency response curve.
2.3. Modeling and Optimization of System Output
In this study, to ensure an ideal audio experience in vehicle multimedia systems, the signal processing, the filtering, and the amplifier and speaker responses were modeled in detail, and the combined effect of these components was optimized. In the system model, a B-weighted white noise signal was used as the input, followed in sequence by the parametric equalizer filters, the amplifier, and the speaker. This integration is based on the cascading method, where the output of each system component serves as the input for the next component. This method allows for the step-by-step processing of the signal and the sequential application of each component's effect.
In this cascading process, convolution, one of the fundamental concepts of signal processing theory, plays a central role. As a mathematical operation, convolution applies a system's response (e.g., the impulse response of an equalizer filter or an amplifier) to an input signal. The operation is essentially performed by "folding" the system response over each point of the input signal and summing the results.
The convolution theorem explains the frequency domain representation of this process: the convolution of two signals (input and system response) in the time domain is equivalent to the multiplication of these signals in the frequency domain. This transformation is shown in Equation (2).
This property allows engineers and designers in the field of signal processing, particularly in filter design and audio processing applications, to perform complex signal processing operations more efficiently. In this study, since the system components were modeled in the frequency domain, the convolution operation was also performed in this domain [14].
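The convolution theorem can be checked numerically on short sequences. The sketch below is illustrative only: it uses a naive discrete Fourier transform and made-up sample values, with both sequences zero-padded so that circular and linear convolution coincide. It confirms that the transform of the convolution, y = x * h, equals the product of the transforms, Y = X·H.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]

def circular_convolve(x, h):
    """Circular convolution of two sequences of equal length."""
    n = len(x)
    return [sum(x[k] * h[(m - k) % n] for k in range(n)) for m in range(n)]

# Hypothetical input signal and impulse response, zero-padded to length 8.
x = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 0.0, 0.0]
h = [0.5, 0.25, 0.125, 0.0, 0.0, 0.0, 0.0, 0.0]

lhs = dft(circular_convolve(x, h))               # transform of the convolution
rhs = [a * b for a, b in zip(dft(x), dft(h))]    # product of the transforms
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_error)  # expected to be at floating-point rounding level
```

This is why a chain of components modeled by their frequency responses can be combined by multiplication (or by adding their gains in dB) rather than by time-domain convolution.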
To model the resulting signal at the system output, the amplitude responses of the input signal, amplifier, and speaker components described in Figure 4, Figure 5 and Figure 6 were first imported into MATLAB. Since the signals were defined in the frequency domain, the frequency responses of the parametric equalizer filters were calculated using the "freqz" function, and their amplitude responses were obtained. Here, a sampling frequency of 96 kHz, which is also used in the actual setup, was employed. The convolution of these signals was performed by multiplication, as shown in Equation (2), and the speaker output was modeled.
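The procedure above can be sketched outside MATLAB as follows. This is a simplified stand-in, not the study's implementation: it uses a second-order digital peaking biquad (Audio EQ Cookbook coefficients) instead of the 12th-order filters designed with fdesign.parameq, evaluates its response on the unit circle in the way freqz does, and multiplies it with hypothetical input, amplifier, and speaker magnitudes at a single frequency.

```python
import cmath
import math

FS = 96_000.0  # sampling frequency stated in the text (Hz)

def peaking_biquad(f0, gain_db, q, fs=FS):
    """Second-order digital peaking filter coefficients (b, a)."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * amp, -2.0 * math.cos(w0), 1.0 - alpha * amp]
    a = [1.0 + alpha / amp, -2.0 * math.cos(w0), 1.0 - alpha / amp]
    return b, a

def magnitude(b, a, f, fs=FS):
    """Magnitude response at frequency f, evaluated on the unit circle."""
    z_inv = cmath.exp(-2j * math.pi * f / fs)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return abs(num / den)

def cascade_magnitude(f, input_mag, filters, amp_mag, speaker_mag):
    """Multiply the component magnitudes at one frequency (linear scale)."""
    total = input_mag * amp_mag * speaker_mag
    for b, a in filters:
        total *= magnitude(b, a, f)
    return total

# Hypothetical single-band example: +6 dB at 1 kHz, Q = 2.
filters = [peaking_biquad(1000.0, 6.0, 2.0)]
out = cascade_magnitude(1000.0, input_mag=1.0, filters=filters,
                        amp_mag=2.0, speaker_mag=0.5)
print(round(20.0 * math.log10(out), 3))
```

In practice the same multiplication would be repeated over a grid of frequencies, using the measured curves of Figures 4 to 6 in place of the placeholder magnitudes.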
In this study, statistical analyses were used to evaluate performance by comparing the results obtained at the speaker output. Pearson correlation analysis and the root mean square error (RMSE) were used for this purpose.
Pearson correlation analysis is a statistical method used to measure the linear relationship between two data sets. It indicates how two variables change together and produces a correlation coefficient (r) between -1 and 1. A coefficient close to 1 indicates a strong positive linear relationship, while a value close to -1 indicates a strong negative linear relationship. A value near 0 suggests no linear relationship between the two data sets. In this study, Pearson correlation analysis was used to examine the linear relationship between the obtained results. The Pearson correlation coefficient is calculated using Equation (3).
Here x and y refer to the data sets, and n refers to the total number of data points.
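As an illustration, the standard sample form of the Pearson coefficient, r = Σ(x_i − x̄)(y_i − ȳ) / sqrt(Σ(x_i − x̄)² · Σ(y_i − ȳ)²), can be computed as below. The function name and the sample values are hypothetical; Equation (3) may be written in an algebraically equivalent form.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical level curves (dB) sampled at the same frequencies.
target = [80.0, 78.5, 77.0, 79.0, 83.0]
model = [79.0, 78.0, 77.5, 79.5, 82.0]
print(round(pearson_r(target, model), 4))
```

Note that r is insensitive to a constant offset or scaling between the two curves, which is why it is paired with an error measure such as RMSE.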
Root mean square error (RMSE) is a method used to measure the magnitude of differences between two data sets. RMSE evaluates the deviations among the obtained results and quantitatively indicates the amount of error. A low RMSE value indicates that the results are close to each other. RMSE is calculated using Equation (4).
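A minimal sketch of the usual RMSE definition, sqrt((1/n) Σ (x_i − y_i)²), is shown below; the function name and the sample values are hypothetical.

```python
import math

def rmse(x, y):
    """Root mean square error between two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical level curves (dB): a constant 1 dB offset gives an RMSE of 1 dB.
target = [80.0, 78.5, 77.0, 79.0, 83.0]
shifted = [value + 1.0 for value in target]
print(rmse(target, shifted))
```

Unlike the correlation coefficient, RMSE is expressed in the units of the data (here dB) and grows with any offset between the curves.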
These statistical analyses played a critical role in evaluating the agreement between the ideal contour and the experimental and simulation results. While Pearson correlation analysis determined the linear relationship between the results, RMSE quantified the magnitude of the deviation between them [15].
To align the signal output from the speaker with the ideal contour given by the equal loudness principle, the optimization of filter parameters was performed. Metaheuristic algorithms were preferred for the optimization. Metaheuristic algorithms are methods capable of conducting effective and efficient searches over large solution spaces and are not specific to a particular problem. In this study, the genetic algorithm (GA), one of the metaheuristic algorithms, was chosen.
The genetic algorithm is an optimization technique based on the principles of biological evolution and operates on candidate solutions called chromosomes. First, an initial population of random solutions is created. Then, the fitness of each solution is evaluated according to a specific objective. Solutions with higher fitness are selected to have greater representation in subsequent generations. Crossover operations are performed among the selected solutions to create new solutions (offspring), and small random changes (mutations) are applied to the offspring. Finally, the population is updated with the new solutions, and the process is repeated until a certain fitness value is achieved or a specified number of iterations is reached [16].
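The loop described above can be sketched in a few lines. The operators chosen here (truncation selection, uniform crossover, Gaussian mutation, two elite individuals) are assumptions made for illustration and are not the operators of MATLAB's ga function; all names and the toy objective are hypothetical.

```python
import random

def genetic_minimize(objective, bounds, pop_size=100, generations=100,
                     mutation_rate=0.1, seed=0):
    """Minimal genetic algorithm that minimizes objective within bounds."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []  # best objective value at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=objective)
        history.append(objective(ranked[0]))
        parents = ranked[:pop_size // 2]          # truncation selection
        children = [list(ind) for ind in ranked[:2]]  # elitism: keep the two best
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [g1 if rng.random() < 0.5 else g2 for g1, g2 in zip(p1, p2)]
            # Gaussian mutation, clipped to the allowed range.
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0.0, 0.1 * (hi - lo))
                    child[i] = min(hi, max(lo, child[i] + step))
            children.append(child)
        population = children
    best = min(population, key=objective)
    return best, objective(best), history

# Toy objective: squared distance from the origin in three dimensions.
def sphere(v):
    return sum(g * g for g in v)

best, best_value, history = genetic_minimize(sphere, [(-5.0, 5.0)] * 3,
                                             pop_size=30, generations=20)
print(best_value <= history[0])
```

Because the best individuals are carried over unchanged, the best objective value never gets worse from one generation to the next.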
Genetic algorithms can operate effectively in large and complex solution spaces. They are also flexible and adaptable: they can be tailored to various optimization problems and to different objectives [17]. Given the numerous frequency and parameter combinations in vehicle audio systems, genetic algorithms are well suited to such problems.
In this study, the “optimoptions” and “ga” functions from MATLAB’s Global Optimization Toolbox were used for the GA. The GA was configured with a population size of 100 individuals. Thus, 100 different solutions were evaluated in each generation, allowing for a more comprehensive search of the solution space. Additionally, the algorithm was allowed to run for a maximum of 100 generations, ensuring that the algorithm had sufficient time to search for a solution while controlling the processing time.
The fitness function was determined by comparing the speaker output obtained in each iteration with the ideal contour. Using statistical methods, the differences between the ideal contour and the obtained results were calculated, and these differences were used as the output of the objective function. The genetic algorithm iteratively updated the filter parameters (f0, Q and G) to minimize these differences and find the optimal solution.
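One possible shape for such an objective is sketched below. It is an assumption made for illustration: the text does not state exactly how the statistical measures were combined, so this version uses the RMSE alone, models each band with a second-order peaking response, and works in dB so that cascading becomes addition. All names and values are hypothetical.

```python
import math

def peaking_gain_db(f, f0, gain_db, q):
    """Gain in dB of an assumed second-order peaking filter at frequency f."""
    a = 10.0 ** (gain_db / 40.0)
    real = f0 * f0 - f * f
    num = complex(real, a * f * f0 / q)
    den = complex(real, f * f0 / (a * q))
    return 20.0 * math.log10(abs(num / den))

def modeled_output_db(freqs, base_db, params):
    """Speaker output in dB: fixed chain response plus every equalizer band.

    params is a flat list [f0, Q, G, f0, Q, G, ...], one triple per band.
    """
    output = []
    for f, base in zip(freqs, base_db):
        total = base
        for i in range(0, len(params), 3):
            f0, q, gain_db = params[i:i + 3]
            total += peaking_gain_db(f, f0, gain_db, q)
        output.append(total)
    return output

def fitness(params, freqs, base_db, target_db):
    """Objective to minimize: RMSE between modeled output and target contour."""
    model = modeled_output_db(freqs, base_db, params)
    squared = sum((m - t) ** 2 for m, t in zip(model, target_db))
    return math.sqrt(squared / len(freqs))

# Hypothetical check: a target generated by known parameters scores zero.
freqs = [100.0, 1000.0, 10000.0]
base_db = [0.0, 0.0, 0.0]
true_params = [1000.0, 2.0, 6.0]
target_db = modeled_output_db(freqs, base_db, true_params)
print(fitness(true_params, freqs, base_db, target_db))
print(fitness([1000.0, 2.0, 0.0], freqs, base_db, target_db))
```

With ten bands the parameter vector has 30 entries, and an optimizer such as a genetic algorithm searches over it within bounds chosen for f0, Q, and G.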