Lou, W.; Brito, L.F.; Zhao, X.; Bonfatti, V.; Li, J.; Wang, Y. Selection of the Most Informative Wavenumbers to Improve Prediction Accuracy of Milk Fatty Acid Profile Based on Milk Mid‐infrared Spectra Data. Animal Research and One Health 2024, doi:10.1002/aro2.72.
Abstract
Milk mid-infrared (MIR) spectra have been shown to provide valuable information on a wide range of traits used in dairy cattle breeding programs. Selecting the most informative variables from complex data can improve prediction accuracy and model robustness and, consequently, the interpretability of MIR spectra. We therefore investigated the prediction performance of feature selection methods applied to MIR spectra data, using the milk fatty acid (FA) profile as an example to illustrate the evaluated procedure. MIR spectra, milk test-day records, and reference FA concentrations from 155 first-parity Holstein cows were used in the analyses. Four models comprising different explanatory variables and five feature selection methods were evaluated. The results indicated that the Competitive Adaptive Reweighted Sampling (CARS) method can effectively select the most informative variables from the MIR spectra, resulting in higher prediction accuracies than the other variable selection approaches. The model including selected MIR spectra and cow information variables [days in milk at the test day, age at the test day, pregnancy stage (in days), number of days open, number of inseminations, and somatic cell count] yielded the best FA profile predictions based on Partial Least Squares regression. In particular, ten FAs (C8:0, C10:0, C14:1, C17:0 isomers, C18:1, C18:1 isomer, medium-chain FA, unsaturated FA, monounsaturated FA, and polyunsaturated FA) presented accuracies, measured by the coefficient of determination (R²cv), ranging from 0.66 to 0.85 in internal validation and from 0.65 to 0.84 in external validation. By running CARS 1,000 times in internal validations, we obtained the selection frequency of each milk MIR wavenumber for 35 FAs. The wavenumbers most strongly related to FAs were found within 1,003 to 1,145 cm⁻¹, while other discrete regions lay between 1,651 and 1,797 cm⁻¹ and between 2,834 and 2,954 cm⁻¹. These biomarkers may give insights into the relationship between MIR spectra and FA phenotypes.
In conclusion, using CARS and cow information improved predictions of FAs based on MIR spectra in Chinese Holstein dairy cows. Additional validation studies should be conducted as larger datasets become available.
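The abstract reports two quantities that are easy to state concretely: how often each wavenumber is selected across repeated CARS runs, and the coefficient of determination between reference and predicted FA values. The following is a minimal sketch of both, not the authors' implementation. It assumes each run has already produced a list of selected wavenumbers (CARS itself is not implemented here), and it assumes the 1 − SS_res/SS_tot definition of the coefficient of determination, which may differ from the definition used in the study. The function names and the example values are hypothetical.

```python
from collections import Counter


def selection_frequency(runs):
    """Return {wavenumber: fraction of runs in which it was selected}.

    `runs` is a list of iterables, one per repeated selection run
    (hypothetical input format; the selection step itself is not shown).
    """
    if not runs:
        raise ValueError("need at least one run")
    counts = Counter()
    for selected in runs:
        counts.update(set(selected))  # count each wavenumber once per run
    n_runs = len(runs)
    return {wn: counts[wn] / n_runs for wn in sorted(counts)}


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Made-up illustration: three runs, wavenumbers in cm^-1 (not study data).
runs = [[1003, 1145, 1651], [1003, 1145], [1003, 2834]]
freq = selection_frequency(runs)
for wavenumber, fraction in freq.items():
    print(wavenumber, round(fraction, 2))
```

With 1,000 runs, as described in the abstract, the same tally would give a per-wavenumber selection frequency from which frequently selected spectral regions can be read off.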
Biology and Life Sciences, Animal Science, Veterinary Science and Zoology
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.