A model-based prediction of the MFR of binary blends is possible by applying simple mixing rules. In the following, the Arrhenius and Cragoe mixing rules are investigated for the binary blends. Afterwards, a fitting approach to match sparse datasets with these mixing rules is described, before more complex symbolic regression models are applied to the data set. Lastly, the application of the mixing rules to ternary blends is investigated.
4.1. Modelling the MFR of Binary Blends with Traditional Mixing Rules
Figure 1 shows the measured MFR values of the binary blends. For all the blends there is a clear trend that the MFR generally decreases with an increased proportion of vH2 in the blend, as would be expected since an increase of the blend partner with lower MFR should result in lower overall mixture MFR values. When the vH2 content reaches 0.1 and below, the MFR shows a significant decrease, especially in the vH2-vB45 and vH2-vH8 blends.
Following an intensive review of different mixing rules for PP blends by Traxler et al., the Arrhenius and Cragoe mixing rules were selected as initial models for the blends produced in this investigation due to their high predictive power [18]. The Arrhenius model (Equation (1)) describes the logarithm of the MFR of a mixture as the sum of the logarithms of the individual blend partners' MFR values, each multiplied by its share in the blend [36]. The Cragoe model (Equation (2)), on the other hand, states that the reciprocal of the logarithm of the mixture MFR is the sum of the reciprocal logarithms of the blend partners' MFR values, each multiplied by its proportion in the blend [37].
The constant L in the Cragoe mixing rule depends mainly on the type of liquid to which the mixing rule is applied and was set to 2000, as suggested by Gao and Li [38].
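The two mixing rules described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, and the Cragoe rule is written in its commonly quoted form with the constant L multiplying each MFR value inside the logarithm, which is an assumption about how Equation (2) is formulated. The base MFR values are the measured values for vH2 and cR14 quoted in this section.

```python
import math

def arrhenius_mfr(x1, mfr1, mfr2):
    """Arrhenius rule: ln(MFR_blend) = x1*ln(MFR1) + x2*ln(MFR2), with x2 = 1 - x1."""
    x2 = 1.0 - x1
    return math.exp(x1 * math.log(mfr1) + x2 * math.log(mfr2))

def cragoe_mfr(x1, mfr1, mfr2, L=2000.0):
    """Cragoe rule (assumed form):
    1/ln(L*MFR_blend) = x1/ln(L*MFR1) + x2/ln(L*MFR2)."""
    x2 = 1.0 - x1
    inverse_log = x1 / math.log(L * mfr1) + x2 / math.log(L * mfr2)
    return math.exp(1.0 / inverse_log) / L

# Measured base MFR values quoted in this section for vH2 and cR14.
MFR_VH2, MFR_CR14 = 3.34, 14.67

for share_vh2 in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(
        share_vh2,
        round(arrhenius_mfr(share_vh2, MFR_VH2, MFR_CR14), 2),
        round(cragoe_mfr(share_vh2, MFR_VH2, MFR_CR14), 2),
    )
```

Both functions return the pure-material MFR at a share of 0 or 1. In this form, the Cragoe prediction never exceeds the Arrhenius prediction, because a share-weighted harmonic mean of positive logarithms is at most the corresponding arithmetic mean.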
Figure 2 shows the application of both models to the vH2-cR14 blend. Although both models roughly match the shape of the measurements, the predicted MFR values for the different blends are consistently overestimated.
A common problem with all mixing rules is that only the MFR values of the base materials are used to calculate the mixtures. Therefore, small deviations in the measurements of the raw materials can lead to large deviations in the calculated blend values. Furthermore, if such mixture models are to be applied to existing data in a company producing different mixtures, the raw material data may not necessarily be available. Therefore, we propose a best-fit approach that determines the MFR values of the two blend partners as free parameters according to Equation (3). Here, the model function may be any viscosity model capable of calculating the MFR of a binary mixture, and the deviation is evaluated over all samples provided.
It must be noted that in this case the quadratic deviation between prediction and ground truth was chosen to optimise the parameters. Depending on the application scenario, different criteria can be chosen (e.g. minimising the percentage deviation between prediction and ground truth).
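The best-fit idea can be illustrated with a small, dependency-free sketch. Several points are assumptions rather than the authors' procedure: the optimiser is a simple compass (pattern) search chosen to avoid external libraries, the function names are hypothetical, and the samples are synthetic values generated from the Arrhenius rule with known base MFR values, not measurements from this study.

```python
import math

def arrhenius_mfr(x1, mfr1, mfr2):
    """Binary Arrhenius rule; x1 is the share of the first blend partner."""
    return math.exp(x1 * math.log(mfr1) + (1.0 - x1) * math.log(mfr2))

def squared_error(model, params, samples):
    """Sum of squared deviations between model prediction and measured blend MFR."""
    return sum((model(x1, params[0], params[1]) - measured) ** 2
               for x1, measured in samples)

def fit_base_mfr(model, samples, start, step=1.0, tol=1e-9):
    """Compass search over the two base MFR values (a swapped-in, simple optimiser)."""
    best = list(start)
    best_err = squared_error(model, best, samples)
    while step > tol:
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                candidate = list(best)
                candidate[i] += delta
                if candidate[i] <= 0.0:  # MFR values must stay positive
                    continue
                err = squared_error(model, candidate, samples)
                if err < best_err:
                    best, best_err, improved = candidate, err, True
        if not improved:
            step /= 2.0  # refine the search once no neighbour is better
    return best, best_err

# Synthetic, noise-free samples generated from known base values (illustration only).
TRUE_MFR = (3.0, 14.0)
samples = [(x1, arrhenius_mfr(x1, *TRUE_MFR)) for x1 in (0.0, 0.25, 0.5, 0.75, 1.0)]

fitted, err = fit_base_mfr(arrhenius_mfr, samples, start=(3.5, 15.0))
print([round(value, 4) for value in fitted], err)
```

Because `model` is passed in as a function, the same routine could be used with any binary mixing rule. Switching to a different criterion, such as the percentage deviation, would only require replacing `squared_error`.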
Applying this fit to the Arrhenius and Cragoe models for the blend vH2-cR14, the fitted MFR of cR14 is 14.32 for Arrhenius and 14.54 for Cragoe, compared to the measured value of 14.67. For vH2, the fitted values are 2.80 and 2.97, compared to the measured value of 3.34. The blend percentages used within the model remain unchanged. With the fitted values, Figure 2 shows that both the Arrhenius and Cragoe models now match the measured data almost perfectly. To quantify the improvement of this approach and to measure the performance of the models, the Mean Absolute Error (MAE) and the Coefficient of Determination (R²) are calculated.
The R² value is a statistical measure that represents the proportion of the variance of a dependent variable that is explained by the independent variable or variables in a regression model. An R² of 1 indicates that the regression predictions fit the data perfectly. Conversely, an R² of 0 indicates that the model does not explain any of the variance. A low R² value does not necessarily mean that the model is inadequate; it may instead reflect a high level of inherent variability in the data. However, comparing the R² values of models applied to the same data set provides valuable information on the model capabilities. To quantify the prediction quality with regard to the prediction error, the MAE is calculated.
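Both metrics follow their standard definitions and can be computed as sketched below. The function names and the example values are hypothetical and serve only to illustrate the calculation.

```python
def mean_absolute_error(measured, predicted):
    """MAE: average absolute deviation between measurement and prediction."""
    return sum(abs(m - p) for m, p in zip(measured, predicted)) / len(measured)

def r_squared(measured, predicted):
    """R² = 1 - SS_res / SS_tot, with SS_tot taken around the mean of the measurements."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical MFR values, not taken from the study.
measured = [3.0, 5.0, 8.0, 12.0]
predicted = [3.2, 4.9, 8.3, 11.8]

print(mean_absolute_error(measured, predicted))
print(r_squared(measured, predicted))
```

A model that always predicts the mean of the measurements yields an R² of 0, while a perfect prediction yields 1.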
By fitting the MFR parameters of both blend partners for the different blends, the MAE can be decreased and the R² increased. All calculated errors and model scores are listed in Table 5. For all blends, an R² greater than 0.992 is obtained. In terms of model performance, the Cragoe model performs best for all blends except vH2-vB45.
4.2. Modelling the MFR of All Blends Utilising Symbolic Regression
Even though the MAE of the fitted Cragoe model is rather small, with 0.370 for the blend vH2-vB45 being the maximum prediction error, a further improvement of the models is necessary. When modelling the viscosity of more complex blends consisting of a multitude of components, the traditional binary models must be applied stepwise. For an exemplary blend of four polymers, the binary models need to be applied three times. According to Gaussian error propagation, the combined error (σ_y) of a model prediction can be calculated by applying Equation (6) [39]:
In this equation, the sum runs over the individual input variables, each weighted by the partial derivative of the output with respect to that input variable. Under the assumption that the error of the binary model is the same in every step, applying it stepwise for the model with the lowest MAE (0.120 for blend vH2-vH8) would lead to an increased MAE of 0.170. For the highest prediction error found (0.370 for blend vH2-vB45), the MAE would increase to 0.641 for a composition of four polymers. Furthermore, when additives or fillers are included, each with an individual model, the error would increase even further. Therefore, even though the prediction accuracy is already relatively high, a further increase is necessary.
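The effect of stepwise application can be reproduced with a short calculation. The sketch assumes independent errors of equal size and unit sensitivities (all partial derivatives equal to one), so that the combined error of n applications is the single-step error multiplied by √n. These simplifications and the function names are assumptions made for illustration.

```python
import math

def combined_error(sigmas, sensitivities=None):
    """Gaussian error propagation: sqrt(sum((dy/dx_i * sigma_i)^2))."""
    if sensitivities is None:
        sensitivities = [1.0] * len(sigmas)  # assumption: unit partial derivatives
    return math.sqrt(sum((d * s) ** 2 for d, s in zip(sensitivities, sigmas)))

def stepwise_error(single_step_error, applications):
    """Combined error when the same binary model error occurs in every step."""
    return combined_error([single_step_error] * applications)

print(round(stepwise_error(0.370, 3), 3))  # three applications, four polymers
print(round(stepwise_error(0.120, 2), 3))  # two applications
print(round(stepwise_error(0.120, 3), 3))  # three applications
```

Under these assumptions, three applications with an error of 0.370 give approximately 0.641, the four-polymer value quoted above. The value of 0.170 corresponds to two applications with an error of 0.120; three applications would give approximately 0.208.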
Symbolic regression (SR) is a type of machine learning that aims to discover human-interpretable symbolic models from data. Unlike traditional regression techniques, which fit parameters within a predetermined model structure, SR explores a large space of possible mathematical expressions to identify the best-fitting model [30,33,34,40]. The Python-based PySR framework, which uses the SymbolicRegression.jl backend, facilitates this process through a multi-population evolutionary algorithm [40]. This algorithm involves several key steps, illustrated in Figure 3.
First, a population of random mathematical expressions is created. From this population, the fittest individuals are selected based on a fitness function. By applying genetic operators such as mutation, crossover and simplification, the individuals evolve towards a population that provides better solutions. The evolutionary loop is further enhanced by simulated annealing, age-regulated evolution, and a unique evolve-simplify-optimise cycle that iteratively refines both the structure and constants of the expressions [40,41].
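The basic principle of the evolutionary loop can be illustrated with a deliberately simplified toy example. It is not PySR's algorithm: it uses a single population, mutation and elitist selection only, and omits crossover, simplification, constant optimisation, simulated annealing and age-regulated evolution. All names, settings and the synthetic target function are assumptions chosen for illustration.

```python
import random

OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}

def random_expr(rng, depth):
    """Random expression tree: a leaf is "x" or a constant, a node is (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2.0, 2.0), 2)
    op = rng.choice(sorted(OPS))
    return (op, random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if expr == "x" else expr  # variable or numeric constant

def size(expr):
    """Number of nodes, used as a simple complexity measure."""
    return 1 + size(expr[1]) + size(expr[2]) if isinstance(expr, tuple) else 1

def mutate(expr, rng):
    """Replace a randomly chosen subtree with a fresh random one."""
    if not isinstance(expr, tuple) or rng.random() < 0.3:
        return random_expr(rng, 2)
    op, left, right = expr
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))

def loss(expr, data):
    """Fitness function: mean squared error on the data."""
    return sum((evaluate(expr, x) - y) ** 2 for x, y in data) / len(data)

def evolve(data, generations=40, pop_size=60, max_size=25, seed=0):
    rng = random.Random(seed)
    population = [random_expr(rng, 3) for _ in range(pop_size)]
    history = []  # best loss at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda e: loss(e, data))
        history.append(loss(population[0], data))
        survivors = population[: pop_size // 4]  # selection; the best is always kept
        children = []
        while len(survivors) + len(children) < pop_size:
            child = mutate(rng.choice(survivors), rng)
            if size(child) <= max_size:  # reject overly complex expressions
                children.append(child)
        population = survivors + children
    best = min(population, key=lambda e: loss(e, data))
    return best, history

# Synthetic target y = 2x + 1 (illustration only).
data = [(i / 10.0, 2.0 * i / 10.0 + 1.0) for i in range(11)]
best, history = evolve(data)
print(best, loss(best, data))
```

Because the best individual always survives, the best loss cannot increase from one generation to the next. The size limit plays a role loosely comparable to a complexity constraint in a full symbolic regression framework.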
For modelling the binary blends with and without additives, it was found that the Arrhenius and Cragoe models already achieve high model scores (R²). Building on this finding, the PySR framework can be used to derive improved mixing rules for polymer blends by focusing on sub-components of these formulas alongside other fundamental operations such as linear, exponential, and logarithmic functions. To reduce the risk of overfitting, which occurs when a model learns the noise and random fluctuations in the training data rather than the underlying pattern, the dataset was expanded fivefold by introducing an empirically chosen Gaussian noise of 0.1% and by swapping the input variables of the two blend partners (their shares and MFR values) to ensure the robustness and bidirectional applicability of the derived mixing rules. This approach helps ensure that the final symbolic models generalise well and reflect the underlying physical principles of polymer blend behaviour. After training with the binary blend data and choosing the equation at the minimum complexity setting of PySR (the fewest mathematical operations in the formula), Equation (7) was found.
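The augmentation step described above can be sketched as follows. The exact scheme used in the study is not specified, so several details are assumptions: the noise is applied multiplicatively to the blend MFR only, each original row yields four noisy copies, and every second copy has the two blend partners swapped. The function name and the sample row are hypothetical.

```python
import random

def augment(samples, factor=5, rel_noise=0.001, seed=0):
    """Expand (x1, x2, mfr1, mfr2, mfr_blend) rows by noisy and swapped copies.

    Assumed scheme: keep the original row, then add factor - 1 copies whose
    blend MFR carries Gaussian noise with a relative standard deviation of
    rel_noise (0.001 = 0.1%); every second copy swaps the blend partners.
    """
    rng = random.Random(seed)
    augmented = []
    for x1, x2, mfr1, mfr2, mfr_blend in samples:
        augmented.append((x1, x2, mfr1, mfr2, mfr_blend))
        for copy in range(factor - 1):
            noisy = mfr_blend * (1.0 + rng.gauss(0.0, rel_noise))
            if copy % 2 == 0:
                augmented.append((x2, x1, mfr2, mfr1, noisy))  # partners swapped
            else:
                augmented.append((x1, x2, mfr1, mfr2, noisy))
    return augmented

# Hypothetical row: shares, base MFR values and a placeholder blend MFR.
rows = [(0.3, 0.7, 3.34, 14.67, 9.4)]
expanded = augment(rows)
print(len(expanded))
```

Swapping the partners leaves the blend MFR unchanged, so a model trained on the expanded data is encouraged to give the same result regardless of which component is listed first.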
The symbolic regression model for calculating the MFR of a binary mixture is very similar to the Arrhenius model, except for the additional fitting coefficients that adapt the model to the given data set (-1.213, -1.18 and +1.25). Compared to the symbolic regression model found by Traxler et al., the model in Equation (7) is much simpler and applicable in both blending directions (blending the higher-MFR component into the lower-MFR component as well as vice versa) [18]. The model is applied to the two blend datasets vH2-cR14 and vH2-vB23 in Figure 4. The symbolic regression model fits the data better than the fitted Arrhenius model, with an MAE of 0.141 instead of 0.187 for vH2-cR14 and 0.172 instead of 0.214 for vH2-vB23.
To evaluate the overall performance of the newly found model and to compare it with both the Arrhenius and Cragoe models, the MAE values of the applied models were calculated on all blends. The MFR parameters of both blend partners were determined for all models according to the optimisation approach described in Section 4.1. In addition, to benchmark the approach presented in this paper, a further evaluation was carried out on the full dataset used by Traxler et al. [18]. The results of this evaluation are shown in Table 6.
For the binary blends of this investigation, the symbolic regression model gives the lowest MAE of 0.217, compared to the Arrhenius model (0.280) and the Cragoe model (0.233). However, the symbolic regression model can be expected to fit the dataset on which it was trained better than the standard Arrhenius and Cragoe models. This becomes apparent when all models are applied to the dataset studied by Traxler et al.: here, the symbolic regression model (MAE of 0.266) performs worse than Arrhenius (0.195) and Cragoe (0.165) [18]. Nevertheless, when the model error is compared with the measurement error of the MFR itself (0.168) for the binary blend dataset it was trained on, there is not much room for further improvement. A difference of 0.1 in the MFR value between prediction and measurement should be sufficient for most applications.
4.3. Application of the Mixing Rules on Ternary Blends
Compounds for industrial applications contain a variety of different materials. In addition to pure virgin compounds, different additives may be present in the form of masterbatches, which consist of minimal amounts of chemically active ingredients mixed into a polymer to make dosing in the final compounds easier. Therefore, the adaptation of the mixing rules to blends with a multitude of blend partners is important. One possibility for applying mixing rules such as Arrhenius to those blends is the stepwise application discussed in Section 4.2. Another possibility, which avoids the individual model errors adding up with each application, is to extend the Arrhenius model so that it is applicable to blends with an arbitrary number of components in one calculation. To apply such a generalised model to the data of Table 4, the Arrhenius model for binary blends was extended to the generalised Equation (8), in which each blend partner enters with its share and its MFR value, with the requirement that all shares sum to one.
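Following the verbal description above, the generalised Arrhenius rule states that the logarithm of the blend MFR is the share-weighted sum of the logarithms of the individual MFR values. A minimal sketch is given below; the function name and the example values are hypothetical and not taken from Table 4.

```python
import math

def arrhenius_blend(shares, mfrs):
    """Generalised Arrhenius rule: ln(MFR_blend) = sum_i x_i * ln(MFR_i), sum_i x_i = 1."""
    if len(shares) != len(mfrs):
        raise ValueError("shares and mfrs must have the same length")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to one")
    return math.exp(sum(x * math.log(m) for x, m in zip(shares, mfrs)))

# Hypothetical ternary blend with equal shares.
print(arrhenius_blend([1 / 3, 1 / 3, 1 / 3], [2.0, 8.0, 32.0]))

# With two components the function reduces to the binary Arrhenius rule.
print(arrhenius_blend([0.5, 0.5], [4.0, 9.0]))
```

The result is the share-weighted geometric mean of the individual MFR values, so the blend is evaluated in a single calculation and no intermediate binary steps are needed.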
Applying Equation (8) directly to the data of the ternary blends, an R² of 0.991 and an MAE of 0.923 are obtained. With more than two blend partners, the problems described in Section 4.1, namely imprecise measurements of the pure materials' MFR values or missing measurements in the data available in a company, remain and may be greater due to the addition of individual errors. Therefore, Equation (3) can be adapted to the generalised Arrhenius formula, where k is the number of blend partners:
By applying Equation (9), the R² of the Arrhenius model can be increased from 0.991 (unfitted) to 0.999 and the MAE can be reduced from 0.923 to 0.355. Since the blend partners used in this investigation covered MFR values from 2 to 45 and the polymers represented block copolymers and homopolymers as well as a commonly used recyclate, the applicability of the Arrhenius model to multiple-partner blends can be concluded for PP. The model prediction of Equation (8) with the optimised MFR parameters is shown in Figure 7. The black dots indicate the experiments performed and the measurements taken for the optimisation. The dashed black lines mark isolines of equal MFR in steps of 5. The logarithmic relationship of the MFR values can be seen from the distance between the isolines, which decreases as the MFR increases.
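The narrowing of the isolines can be checked with a short calculation along a binary edge of the composition space. The sketch assumes the Arrhenius rule and uses the limits of the MFR range stated above (2 and 45) as illustrative base values; the function name is hypothetical.

```python
import math

def share_for_target(target, mfr_low, mfr_high):
    """Share of the high-MFR partner at which a binary Arrhenius blend reaches `target`."""
    return (math.log(target) - math.log(mfr_low)) / (math.log(mfr_high) - math.log(mfr_low))

levels = list(range(5, 50, 5))  # isolines at MFR = 5, 10, ..., 45
positions = [share_for_target(level, 2.0, 45.0) for level in levels]
spacings = [b - a for a, b in zip(positions, positions[1:])]

for level, spacing in zip(levels[1:], spacings):
    print(level, round(spacing, 3))
```

Since the composition is linear in the logarithm of the MFR, equal steps in MFR correspond to ever smaller steps in composition, which is why neighbouring isolines move closer together at higher MFR values.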
4.4. Summary of the Application of Mixing Rules for Predicting the MFR
The experimental data showed that the traditional Arrhenius and Cragoe models achieved high predictive accuracy without any adjustment, with R² values exceeding 0.99 for some binary blends. By applying the proposed fitting method, both models could be significantly improved. For the Arrhenius model, the prediction MAE was reduced from 0.467 to 0.280, and for the Cragoe model from 0.309 to 0.233.
Furthermore, the application of the PySR framework allowed for further improvements. The symbolic regression model developed in this study yielded an R² value of 0.999 and an MAE of 0.217 on the binary blends of this investigation, outperforming the traditional models on this dataset.
For ternary blends, the Arrhenius model was extended to accommodate multiple blend partners. The generalised Arrhenius model with fitted MFR parameters achieved an R² value of 0.999 and an MAE of 0.355, indicating high predictive accuracy and supporting the hypothesis that these models can be applied to more complex blends. This extension to ternary blends suggests the robustness and applicability of these models for predicting the MFR of blends with more than two components.