Article
Version 1
Preserved in Portico. This version is not peer-reviewed.
Cross-Feature Transfer Learning For Efficient Tensor Program Generation
Received: 30 December 2023 / Approved: 3 January 2024 / Online: 3 January 2024 (08:38:03 CET)
A peer-reviewed article of this Preprint also exists.
Verma, G.; Raskar, S.; Emani, M.; Chapman, B. Cross-Feature Transfer Learning for Efficient Tensor Program Generation. Appl. Sci. 2024, 14, 513.
Abstract
Tuning tensor program generation involves navigating a vast search space to find optimal program transformations and measurements for a program on target hardware. The number of possible transformation combinations grows exponentially, which makes the process even more complex, especially in heterogeneous environments. This research addresses these challenges by introducing an approach that learns the joint feature space of neural networks and hardware, facilitating knowledge transfer to new, unseen target hardware. We analyze the existing state-of-the-art dataset, TenSet, examine test split strategies, and propose methodologies for dataset pruning. Leveraging an attention-inspired technique, we tailor the tuning of tensor programs to embed both neural network and hardware-specific features. Our approach reduces the dataset size by up to 53% compared to the baseline without compromising Pairwise Comparison Accuracy (PCA). Furthermore, our proposed methodology demonstrates competitive or improved mean inference times with only 25% to 40% of the baseline tuning time across various networks and target hardware. The attention-based tuner can effectively utilize schedules learned from previous hardware program measurements to optimize tensor program tuning on previously unseen hardware, achieving a top-5 accuracy exceeding 90%. Together, these results advance auto-tuning for tensor program generation in heterogeneous environments in terms of both efficiency and accuracy.
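The abstract reports Pairwise Comparison Accuracy (PCA) without defining it. As a rough illustration only, the sketch below assumes the common reading of the metric: the fraction of candidate-program pairs whose relative order under the cost model's predicted scores matches their order under measured performance. The function name, the higher-is-better score convention, and the tie handling are assumptions made for this example and may differ from the definition used in the paper or in TenSet.

```python
from itertools import combinations


def pairwise_comparison_accuracy(predicted, measured):
    """Fraction of program pairs ranked in the same order by both score lists.

    Assumption: both lists use the same convention (higher is better, e.g.
    throughput). Pairs whose measured values tie are skipped, which is a
    choice made for this sketch rather than a documented rule.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")

    correct = 0
    total = 0
    for i, j in combinations(range(len(measured)), 2):
        if measured[i] == measured[j]:
            continue  # no ground-truth ordering for this pair
        total += 1
        # The pair counts as correct when both orderings agree.
        if (predicted[i] > predicted[j]) == (measured[i] > measured[j]):
            correct += 1
    return correct / total if total else 0.0


# Hypothetical scores for three candidate programs.
measured_throughput = [3.0, 1.0, 2.0]
perfect_model = [0.9, 0.1, 0.5]
imperfect_model = [0.9, 0.5, 0.1]

pca_perfect = pairwise_comparison_accuracy(perfect_model, measured_throughput)
pca_imperfect = pairwise_comparison_accuracy(imperfect_model, measured_throughput)
```

Under this reading, a cost model only needs to rank candidates correctly, not predict absolute run times, which is why a pruned dataset can preserve PCA as long as the relative orderings are still learned.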
Keywords
auto-tuning; deep learning compilers; heterogeneous transfer learning; tensor program generation
Subject
Computer Science and Mathematics, Computer Science
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.