Guilhaumon, C.; Hascoët, N.; Chinesta, F.; Lavarde, M.; Daim, F. Data Augmentation for Regression Machine Learning Problems in High Dimensions. Computation2024, 12, 24.
Guilhaumon, C.; Hascoët, N.; Chinesta, F.; Lavarde, M.; Daim, F. Data Augmentation for Regression Machine Learning Problems in High Dimensions. Computation 2024, 12, 24.
Guilhaumon, C.; Hascoët, N.; Chinesta, F.; Lavarde, M.; Daim, F. Data Augmentation for Regression Machine Learning Problems in High Dimensions. Computation2024, 12, 24.
Guilhaumon, C.; Hascoët, N.; Chinesta, F.; Lavarde, M.; Daim, F. Data Augmentation for Regression Machine Learning Problems in High Dimensions. Computation 2024, 12, 24.
Abstract
Machine learning approaches are currently used to understand or model complex physical systems. In general, a substantial number of samples must be collected to create a model with reliable results. However, collecting numerous data is often relatively time-consuming or expensive. Moreover, the problems of industrial interest tend to be more and more complex and depending on a high number of parameters. High dimensional problems intrinsically involve the need of large data amount through the curse of dimensionality. That is why, new approaches based on smart sampling techniques are investigated to minimize the number of samples to be given to train the model, such as Active Learning methods. Here, we propose a technique based on a combination of Fisher information matrix and of Sparse Proper Generalized Decomposition that enables the definition of a new Active Learning informativeness criterion in high dimensions. We provide examples proving the performances of this technique on a theoretical 5D polynomial function and on an industrial crash simulation application. The results prove that the proposed strategy over-perform the usual ones.
Keywords
Active Learning; Design of experiments; Regression; s-PGD
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.