Submitted: 02 April 2025. Posted: 03 April 2025.
Abstract
Keywords:
1. Introduction
2. Fundamentals of Tensor Decomposition
2.1. Definition and Notation
2.2. Basic Tensor Operations
- Outer Product: Given vectors $\mathbf{a} \in \mathbb{R}^{I}$, $\mathbf{b} \in \mathbb{R}^{J}$, and $\mathbf{c} \in \mathbb{R}^{K}$, their outer product forms a rank-one tensor $\mathcal{X} = \mathbf{a} \circ \mathbf{b} \circ \mathbf{c} \in \mathbb{R}^{I \times J \times K}$, where each element is given by $x_{ijk} = a_i b_j c_k$.
- Mode-n Product: The multiplication of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ with a matrix $\mathbf{U} \in \mathbb{R}^{J \times I_n}$ along mode $n$ is defined as $\mathcal{Y} = \mathcal{X} \times_n \mathbf{U}$, with entries $y_{i_1 \cdots i_{n-1}\, j\, i_{n+1} \cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 \cdots i_N}\, u_{j i_n}$, where the resulting tensor has dimensions $I_1 \times \cdots \times I_{n-1} \times J \times I_{n+1} \times \cdots \times I_N$ [14].
- Frobenius Norm: The Frobenius norm of a tensor $\mathcal{X}$ is given by $\|\mathcal{X}\|_F = \sqrt{\sum_{i_1=1}^{I_1} \cdots \sum_{i_N=1}^{I_N} x_{i_1 \cdots i_N}^2}$ [15].
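For concreteness, the following minimal numpy sketch illustrates the three operations just defined; the helper `mode_n_product` and the small random tensors are illustrative choices rather than part of any particular library.

```python
import numpy as np

def mode_n_product(X, U, n):
    """Multiply tensor X by matrix U (shape J x X.shape[n]) along mode n."""
    Xn = np.moveaxis(X, n, 0).reshape(X.shape[n], -1)   # mode-n unfolding
    Yn = U @ Xn                                          # J x (product of the other dims)
    new_shape = (U.shape[0],) + tuple(np.delete(X.shape, n))
    return np.moveaxis(Yn.reshape(new_shape), 0, n)

# Rank-one tensor from an outer product: x_{ijk} = a_i * b_j * c_k
a, b, c = np.random.randn(3), np.random.randn(4), np.random.randn(5)
X = np.einsum('i,j,k->ijk', a, b, c)

# Mode-1 product with a 2 x 4 matrix maps the second dimension from 4 down to 2
U = np.random.randn(2, 4)
Y = mode_n_product(X, U, n=1)        # shape (3, 2, 5)

# Frobenius norm: square root of the sum of squared entries
fro = np.sqrt((X ** 2).sum())
print(Y.shape, fro)
```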
2.3. Low-Rank Representation and Importance in Applications
- Signal Processing: Source separation, blind deconvolution, and multi-way filtering [17].
- Machine Learning: Dimensionality reduction, data fusion, and knowledge discovery [18].
- Computer Vision: Image compression, multi-view learning, and feature extraction [19].
- Biomedical Engineering: Brain imaging, genomic analysis, and medical signal processing.
3. Tensor Decomposition Methods
3.1. CANDECOMP/PARAFAC (CP) Decomposition
3.1.1. Properties and Uniqueness
3.1.2. Algorithms for CP Decomposition
- Alternating Least Squares (ALS): The most commonly used method; it iteratively updates one factor matrix at a time while keeping the others fixed (a minimal sketch follows this list).
- Gradient-Based Methods: These include stochastic gradient descent (SGD) and conjugate gradient techniques to improve convergence [29].
- Randomized and Approximate Methods: Tensor sketching and randomized SVD are used to accelerate CP decomposition for large-scale data [30].
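As a concrete reference point, here is a bare-bones ALS loop for a third-order CP model in numpy, with no regularization, line search, or convergence check; the unfolding and Khatri-Rao helpers follow one common indexing convention and are illustrative rather than canonical.

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def khatri_rao(B, C):
    """Column-wise Kronecker product, shape (J*K, R)."""
    J, R = B.shape
    K = C.shape[0]
    return (B[:, None, :] * C[None, :, :]).reshape(J * K, R)

def cp_als(X, rank, n_iter=100, seed=0):
    """Plain ALS for X ≈ sum_r a_r ∘ b_r ∘ c_r (third-order tensors only)."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    for _ in range(n_iter):
        # Each update is a linear least-squares problem with the other factors fixed.
        A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(X, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Fit a synthetic rank-3 tensor and check the reconstruction error
A0, B0, C0 = np.random.randn(6, 3), np.random.randn(7, 3), np.random.randn(8, 3)
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_als(X, rank=3)
X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```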
3.1.3. Applications
- Signal Processing: Blind source separation, multi-sensor data fusion, and channel estimation [31].
- Machine Learning: Topic modeling, recommendation systems, and deep learning compression.
- Neuroscience: EEG and fMRI data analysis for identifying brain activity patterns.
3.2. Tucker Decomposition
3.2.1. Advantages Over CP Decomposition
- Better interpretability in applications like dimensionality reduction.
- Improved compression capabilities in image and video processing.
- Efficient representation of large-scale tensors with controlled rank [33].
3.2.2. Computation and Algorithms
- Higher-Order SVD (HOSVD): An extension of the matrix SVD in which a truncated SVD is applied to each mode-n unfolding in turn [34] (a minimal sketch follows this list).
- Higher-Order Orthogonal Iteration (HOOI): An iterative refinement approach that improves the factorization accuracy.
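A compact truncated HOSVD in numpy: the factor matrices come from mode-wise SVDs and the core from projection onto those factors; HOOI would then refine the factors by alternating SVDs on the projected tensor. Helper names and ranks are illustrative.

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def mode_n_product(X, U, n):
    Yn = U @ unfold(X, n)
    shape = (U.shape[0],) + tuple(np.delete(X.shape, n))
    return np.moveaxis(Yn.reshape(shape), 0, n)

def hosvd(X, ranks):
    """Truncated HOSVD: leading left singular vectors per mode, then a projected core."""
    factors = [np.linalg.svd(unfold(X, n), full_matrices=False)[0][:, :r]
               for n, r in enumerate(ranks)]
    core = X
    for n, U in enumerate(factors):
        core = mode_n_product(core, U.T, n)
    return core, factors

X = np.random.randn(10, 12, 8)
core, factors = hosvd(X, ranks=(4, 4, 3))

# Reconstruct from the Tucker model: X ≈ core ×_1 U1 ×_2 U2 ×_3 U3
X_hat = core
for n, U in enumerate(factors):
    X_hat = mode_n_product(X_hat, U, n)
print(core.shape, np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```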
3.2.3. Applications
3.3. Tensor Train Decomposition
3.3.1. Advantages
- Scalability: TT storage grows only linearly with the tensor order, roughly O(dnr²) for a d-way tensor with mode sizes n and TT-ranks r, rather than exponentially like the dense tensor, making it suitable for extremely high-dimensional data.
- Efficient Computation: Operations such as tensor contraction and matrix-vector multiplication become computationally feasible in the TT format (a TT-SVD sketch follows this list).
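The standard construction is TT-SVD: sweep through the modes and split off one three-way core per mode with a truncated SVD. The numpy sketch below uses a single global rank cap, an illustrative simplification of the usual per-mode tolerance-based truncation.

```python
import numpy as np

def tt_svd(X, max_rank):
    """TT-SVD: returns cores G_k of shape (r_{k-1}, n_k, r_k) with r_0 = r_d = 1."""
    dims, d = X.shape, X.ndim
    cores, r_prev = [], 1
    C = X.reshape(r_prev * dims[0], -1)
    for k in range(d - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(s))                       # truncate the TT-rank
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        C = (np.diag(s[:r]) @ Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

def tt_to_full(cores):
    out = cores[0]
    for G in cores[1:]:
        out = np.tensordot(out, G, axes=([-1], [0]))    # contract the shared rank index
    return out[0, ..., 0]

X = np.random.randn(4, 5, 6, 7)
cores = tt_svd(X, max_rank=8)
print([G.shape for G in cores])
print('parameters:', X.size, 'vs', sum(G.size for G in cores))
print('relative error:', np.linalg.norm(tt_to_full(cores) - X) / np.linalg.norm(X))
```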
3.3.2. Applications
3.4. Other Advanced Decomposition Methods
3.4.1. Tensor Ring Decomposition
3.4.2. Sparse and Nonnegative Tensor Decompositions
3.5. Comparison of Tensor Decomposition Techniques
| Method | Uniqueness | Scalability | Compression |
|---|---|---|---|
| CP | High | Moderate | Low |
| Tucker | Moderate | Moderate | High |
| TT | Low | High | Very High |
| TR | Low | High | Very High |
4. Theoretical Advancements and Open Problems in Tensor Decomposition
4.1. Uniqueness and Identifiability of Tensor Decomposition
4.1.1. Uniqueness of CP Decomposition
4.1.2. Uniqueness of Tucker Decomposition
4.2. Tensor Rank Estimation and Low-Rank Approximation
4.2.1. Challenges in Determining Tensor Rank
4.2.2. Low-Rank Tensor Approximation
4.3. Computational Hardness and Approximation Guarantees
- Computing the rank of a given tensor is NP-hard for tensors of order three or higher.
- Finding the best rank-R CP decomposition is NP-hard in general [54].
- Tucker decomposition involves solving large-scale SVD problems, which can be computationally prohibitive.
4.4. Robustness and Stability in Noisy Environments
4.4.1. Tensor Decomposition with Missing Data
- Low-Rank Tensor Completion: Uses nuclear norm minimization, or related low-rank surrogates, to estimate missing entries (a simple imputation heuristic is sketched after this list).
- Bayesian Tensor Factorization: Incorporates probabilistic priors to model uncertainty.
- Graph-Based Completion: Leverages relational structures in the data to infer missing values [55].
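The convex nuclear-norm formulations need more machinery than fits here, so the sketch below uses a simpler hard-imputation heuristic: alternate between truncating each mode-n unfolding to low rank and re-imputing the missing entries. It conveys the same idea of exploiting low multilinear rank; the function names, ranks, and observation rate are illustrative.

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def fold(Xn, n, shape):
    full = (shape[n],) + tuple(np.delete(shape, n))
    return np.moveaxis(Xn.reshape(full), 0, n)

def hard_impute(X, mask, ranks, n_iter=50):
    """Fill missing entries, truncate each mode-n unfolding to rank r_n, re-impute."""
    Y = np.where(mask, X, 0.0)
    for _ in range(n_iter):
        Z = Y
        for n, r in enumerate(ranks):
            U, s, Vt = np.linalg.svd(unfold(Z, n), full_matrices=False)
            Z = fold(U[:, :r] @ np.diag(s[:r]) @ Vt[:r], n, Z.shape)
        Y = np.where(mask, X, Z)        # keep observed entries, impute the rest
    return Y

# Synthetic low-rank tensor with roughly 40% of the entries missing
A, B, C = (np.random.randn(15, 2) for _ in range(3))
X_true = np.einsum('ir,jr,kr->ijk', A, B, C)
mask = np.random.rand(*X_true.shape) < 0.6
X_hat = hard_impute(X_true, mask, ranks=(2, 2, 2))
print(np.linalg.norm((X_hat - X_true)[~mask]) / np.linalg.norm(X_true[~mask]))
```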
4.4.2. Robust Tensor Decomposition in Noisy Settings
- Sparse and Robust CP Decomposition: Incorporates sparsity-promoting norm regularization (typically an ℓ1 penalty) to suppress outliers [19].
- Total Variation (TV) Regularization: Enforces smoothness in tensor factorization for denoising applications.
- Bayesian Nonparametric Models: Uses hierarchical priors to adaptively model noise distributions.
4.4.3. Adversarial Robustness in Machine Learning Applications
- Defending Against Tensor-Based Adversarial Attacks: Developing regularization techniques to enhance model security.
- Certifiable Robustness of Tensor Factorization: Establishing theoretical guarantees on robustness under perturbations [56].
- Adversarial Training for Tensor Networks: Enhancing resilience of tensor models to adversarial manipulations.
4.5. Theoretical Connections Between Tensor Methods and Other Fields
4.5.1. Tensors and Algebraic Geometry
- Tensor Rank Bounds: Using algebraic varieties to establish rank constraints [59].
- Secant Varieties and Decomposability: Studying geometric conditions for unique decomposability.
- Homotopy Methods for Tensor Factorization: Leveraging algebraic topology for efficient decomposition algorithms.
4.5.2. Tensors and Quantum Information Theory
- Tensor Network Representations of Quantum States: Efficiently encoding quantum many-body systems [60].
- Entanglement Entropy and Tensor Ranks: Analyzing the complexity of quantum entanglement using tensor factorizations [61].
- Quantum Algorithms for Tensor Factorization: Exploring quantum-inspired methods for high-dimensional tensor decomposition.
4.6. Conclusion
5. Applications of Tensor Decomposition in Signal Processing and Machine Learning
5.1. Applications in Signal Processing
5.1.1. Blind Source Separation
- CP Decomposition for BSS: The CP decomposition is particularly useful in scenarios where the observed signal is modeled as a sum of rank-one components, each corresponding to an independent source [66].
- Tucker Decomposition for Multimodal Data: When multiple sources are recorded through different modalities (e.g., EEG and fMRI in neuroscience), Tucker decomposition enables joint analysis of multimodal data [67].
- EEG and fMRI Analysis: Identifying independent brain activity sources.
- Speech Processing: Separating overlapping speech signals in audio recordings.
- Wireless Communications: Decoupling multiple transmitted signals in MIMO systems [68].
5.1.2. Array Signal Processing
- Direction of Arrival (DOA) Estimation: Tensor-based subspace methods improve the accuracy of DOA estimation in multi-antenna systems.
- Beamforming: Tensor decompositions help design optimal beamforming weights for interference suppression [70].
- Channel Estimation: In MIMO communication, tensor methods improve the estimation of channel state information (CSI) [71].
5.1.3. Compressed Sensing and Sparse Signal Recovery
5.2. Applications in Machine Learning
5.2.1. Dimensionality Reduction and Feature Extraction
- Tucker Decomposition for Feature Selection: By extracting low-dimensional representations of data, Tucker decomposition improves classification and clustering performance.
- CP Decomposition in Natural Language Processing (NLP): Tensor factorization helps in word embedding models and topic modeling.
- Face Recognition: Tensor-based feature extraction improves accuracy in facial recognition systems.
- Text Mining: Tensor-based topic models reveal latent structures in large text corpora.
- Recommender Systems: Tensor decomposition enhances collaborative filtering by capturing higher-order user-item interactions.
5.2.2. Deep Learning Compression and Acceleration
- Tensor Train Decomposition for Model Compression: TT decomposition reduces the number of parameters in fully connected layers, making deep learning models more efficient [77].
- CP and Tucker Decompositions for Convolutional Neural Networks (CNNs): These methods decompose convolutional filters, reducing computational complexity in CNNs [78] (a channel-mode compression sketch follows this list).
- Edge and Mobile AI: Deploying efficient neural networks on resource-constrained devices.
- Autonomous Vehicles: Optimizing deep learning models for real-time object detection.
- Medical Diagnosis: Improving efficiency in AI-driven medical image analysis [79].
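One widely used variant keeps the spatial filter mode intact and compresses only the input and output channel modes, a Tucker-2 style factorization. The sketch below shows just the factorization step on a random kernel; shapes, ranks, and helper names are illustrative, and in practice the factorized layers are fine-tuned afterwards.

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def tucker2_compress_conv(K, r_out, r_in):
    """Tucker-2 compression of a conv kernel K (C_out, C_in, kh, kw):
    K ≈ core ×_0 U_out ×_1 U_in, i.e. a 1x1 conv, a small kxk conv, and a 1x1 conv."""
    U_out = np.linalg.svd(unfold(K, 0), full_matrices=False)[0][:, :r_out]
    U_in = np.linalg.svd(unfold(K, 1), full_matrices=False)[0][:, :r_in]
    core = np.einsum('oihw,or,is->rshw', K, U_out, U_in)
    return core, U_out, U_in

C_out, C_in, k = 64, 64, 3
K = np.random.randn(C_out, C_in, k, k)
core, U_out, U_in = tucker2_compress_conv(K, r_out=16, r_in=16)
K_hat = np.einsum('rshw,or,is->oihw', core, U_out, U_in)

# Random kernels have no low-rank structure, so the error below is large;
# trained kernels are typically much closer to low rank.
print('parameters:', K.size, 'vs', core.size + U_out.size + U_in.size)
print('relative error:', np.linalg.norm(K - K_hat) / np.linalg.norm(K))
```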
5.2.3. Knowledge Graph Completion and Graph Learning
- Link Prediction: Identifying missing relationships in knowledge graphs (a toy factorization-and-ranking sketch follows this list).
- Graph Embedding: Representing nodes in a low-dimensional space for improved clustering and classification.
- Biomedical Research: Discovering new drug interactions and gene-disease associations.
- Social Network Analysis: Detecting hidden patterns in social media interactions.
- Recommendation Systems: Enhancing personalized recommendations based on multi-relational data.
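To make the link-prediction idea concrete, the toy sketch below fits a third-order CP model to a tiny binary (subject, relation, object) tensor by full-batch gradient descent and then ranks candidate objects for an unseen query. The entities, relations, rank, learning rate, and iteration count are all made up for illustration and would need tuning on real knowledge graphs.

```python
import numpy as np

# Toy knowledge graph: X[s, p, o] = 1 if the triple (subject s, relation p, object o) is known.
triples = [(0, 0, 1), (1, 0, 2), (2, 0, 3), (0, 1, 3), (1, 1, 3)]
n_ent, n_rel, rank = 4, 2, 3
X = np.zeros((n_ent, n_rel, n_ent))
for s, p, o in triples:
    X[s, p, o] = 1.0

rng = np.random.default_rng(0)
A = 0.1 * rng.standard_normal((n_ent, rank))   # subject embeddings
R = 0.1 * rng.standard_normal((n_rel, rank))   # relation embeddings
B = 0.1 * rng.standard_normal((n_ent, rank))   # object embeddings

lr = 0.05                                      # small step size for stability on this toy problem
for _ in range(5000):                          # full-batch gradient descent on squared error
    err = np.einsum('sk,pk,ok->spo', A, R, B) - X
    gA = np.einsum('spo,pk,ok->sk', err, R, B)
    gR = np.einsum('spo,sk,ok->pk', err, A, B)
    gB = np.einsum('spo,sk,pk->ok', err, A, R)
    A, R, B = A - lr * gA, R - lr * gR, B - lr * gB

# Score every object for the unseen query (subject 2, relation 1, ?) and rank the candidates.
scores = np.einsum('k,k,ok->o', A[2], R[1], B)
print(np.argsort(-scores))
```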
5.3. Comparison of Tensor Decomposition Applications
| Tensor Decomposition | Application | Domain |
|---|---|---|
| CP Decomposition | Blind Source Separation | Signal Processing |
| Tucker Decomposition | Feature Extraction | Machine Learning |
| Tensor Train | Neural Network Compression | Deep Learning |
| Tensor Ring | Large-Scale Data Compression | High-Performance Computing |
| Sparse Tensor Factorization | Recommender Systems | Machine Learning |
5.4. Challenges and Future Directions
- Scalability: Handling large-scale tensors efficiently remains a challenge [80].
- Computational Complexity: Many tensor decomposition algorithms are iterative and require careful optimization.
- Interpretability: While tensor methods provide compact representations, interpreting the results in real-world applications can be complex.
- Efficient Parallel and Distributed Algorithms: Leveraging GPU and cloud computing for large-scale tensor computations [81].
- Hybrid Models: Integrating tensor decomposition with deep learning for enhanced performance [82].
- Robustness and Generalization: Developing tensor methods that are robust to noise and missing data [83].
6. Computational Challenges and Optimization Techniques for Tensor Decomposition
6.1. Computational Complexity of Tensor Decomposition
6.1.1. CP Decomposition Complexity
6.1.2. Tucker Decomposition Complexity
6.1.3. Tensor Train and Tensor Ring Complexity
6.2. Memory Constraints and Storage Optimization
- Sparse Tensor Storage: Instead of storing all elements, sparse representations store only nonzero values and their indices, significantly reducing memory usage [93] (see the COO sketch after this list).
- Compressed Formats: Using quantization and low-bit representations to store tensor elements efficiently.
- Distributed Storage: Splitting tensors across multiple processing units in distributed computing environments.
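A minimal illustration of coordinate (COO) storage for a sparse tensor in plain numpy; the array shape and the sparsity threshold are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 200, 300))
X[np.abs(X) < 3.0] = 0.0                    # keep only the few largest-magnitude entries

# COO format: one array of values and one (nnz, 3) array of integer coordinates
idx = np.nonzero(X)
vals = X[idx]
coords = np.stack(idx, axis=1)

print('nonzeros:', vals.size)
print('dense MiB:', X.nbytes / 2**20, 'sparse MiB:', (vals.nbytes + coords.nbytes) / 2**20)

# The dense tensor can be rebuilt on demand from the sparse representation
X_back = np.zeros(X.shape)
X_back[idx] = vals
assert np.array_equal(X, X_back)
```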
6.3. Optimization Techniques for Tensor Decomposition
6.3.1. Alternating Least Squares (ALS) and Variants
- Regularized ALS: Adds L2 regularization to prevent overfitting and improve generalization.
- Stochastic ALS: Uses stochastic gradient updates to accelerate convergence.
- Randomized ALS: Incorporates randomization techniques to reduce computational cost [95].
6.3.2. Gradient-Based Optimization
- SGD for CP and Tucker: Updates factor matrices using mini-batches to improve scalability (an entry-sampled SGD sketch follows this list).
- Second-Order Optimization: Uses Newton’s method and conjugate gradient techniques to accelerate convergence.
- Momentum-Based Optimization: Incorporates momentum to avoid oscillations in the optimization landscape [97].
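A sketch of entry-sampled stochastic gradient descent for a third-order CP model: each step draws a mini-batch of tensor entries and touches only the factor rows those entries involve, which is what makes the approach attractive for large or sparse tensors. Batch size, step size, and iteration count below are illustrative and would need tuning in practice.

```python
import numpy as np

def cp_sgd(X, rank, lr=0.02, batch=256, n_steps=20000, seed=0):
    """Mini-batch SGD on squared error over randomly sampled tensor entries."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = 0.1 * rng.standard_normal((I, rank))
    B = 0.1 * rng.standard_normal((J, rank))
    C = 0.1 * rng.standard_normal((K, rank))
    for _ in range(n_steps):
        i = rng.integers(0, I, batch)
        j = rng.integers(0, J, batch)
        k = rng.integers(0, K, batch)
        err = (np.sum(A[i] * B[j] * C[k], axis=1) - X[i, j, k])[:, None]
        gA, gB, gC = err * B[j] * C[k], err * A[i] * C[k], err * A[i] * B[j]
        np.add.at(A, i, -lr * gA)       # scatter-add handles repeated row indices
        np.add.at(B, j, -lr * gB)
        np.add.at(C, k, -lr * gC)
    return A, B, C

A0, B0, C0 = np.random.randn(40, 4), np.random.randn(50, 4), np.random.randn(60, 4)
X = np.einsum('ir,jr,kr->ijk', A0, B0, C0)
A, B, C = cp_sgd(X, rank=4)
print(np.linalg.norm(np.einsum('ir,jr,kr->ijk', A, B, C) - X) / np.linalg.norm(X))
```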
6.3.3. Randomized and Approximate Methods
- Randomized SVD for Tucker: Computes low-rank approximations using random projections (a sketch follows this list).
- Sketching Methods: Uses tensor sketching techniques to reduce dimensionality before applying decomposition [98].
- Probabilistic Tensor Factorization: Applies Bayesian inference to estimate tensor components in an approximate manner.
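A sketch of how a randomized range finder can stand in for the exact SVDs used by HOSVD/HOOI: each mode-n unfolding is multiplied by a Gaussian test matrix, and the factor is taken from the resulting low-dimensional subspace. Oversampling, ranks, and sizes are illustrative.

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def randomized_left_factor(Xn, rank, oversample=10, seed=0):
    """Approximate leading left singular vectors of Xn via a Gaussian sketch."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((Xn.shape[1], rank + oversample))
    Q, _ = np.linalg.qr(Xn @ Omega)                     # orthonormal basis for the sketched range
    U_small, _, _ = np.linalg.svd(Q.T @ Xn, full_matrices=False)
    return (Q @ U_small)[:, :rank]

# Synthetic tensor with multilinear rank (5, 5, 5) plus a little noise
G = np.random.randn(5, 5, 5)
F0, F1, F2 = np.random.randn(60, 5), np.random.randn(70, 5), np.random.randn(80, 5)
X = np.einsum('abc,ia,jb,kc->ijk', G, F0, F1, F2) + 0.01 * np.random.randn(60, 70, 80)

ranks = (5, 5, 5)
U0, U1, U2 = (randomized_left_factor(unfold(X, n), r) for n, r in enumerate(ranks))

core = np.einsum('ijk,ia,jb,kc->abc', X, U0, U1, U2)     # project onto the factor subspaces
X_hat = np.einsum('abc,ia,jb,kc->ijk', core, U0, U1, U2)
print(core.shape, np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```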
6.3.4. Parallel and Distributed Computing
- GPU Acceleration: Uses CUDA-based tensor operations to parallelize matrix multiplications.
- Multi-Core Processing: Splits tensor factorization tasks across multiple CPU cores.
- Distributed Tensor Decomposition: Implements factorization on large-scale clusters using frameworks like TensorFlow and Spark.
6.4. Tensor Decomposition in Large-Scale and Streaming Data
6.4.1. Incremental and Online Tensor Decomposition
- Online CP Decomposition: Maintains a low-rank tensor approximation that evolves with streaming data [101].
- Incremental Tucker Decomposition: Updates core tensors and factor matrices without recomputing the entire decomposition.
6.4.2. Tensor Decomposition in High-Performance Computing
6.5. Future Directions in Tensor Computation Optimization
- Scalable Algorithms: Developing more efficient algorithms that can handle petabyte-scale tensors.
- Adaptive Rank Selection: Automating the selection of optimal tensor ranks for decomposition.
- Robustness in Noisy Environments: Enhancing tensor methods to handle missing and corrupted data.
- Quantum Tensor Computation: Exploring quantum algorithms for tensor decomposition to achieve exponential speedups [105].
7. Emerging Applications and Future Directions of Tensor Decomposition
7.1. Tensor Decomposition in Artificial Intelligence and Machine Learning
7.1.1. Tensor-Based Deep Learning Architectures
- Tensor Compression in Transformers: Large language models (LLMs) such as GPT and BERT can be compressed using tensor train (TT) and tensor ring (TR) decomposition, reducing the number of parameters while preserving accuracy.
- Factorized Convolutional Layers: CP and Tucker decomposition can replace standard convolutional filters in CNNs, leading to faster inference and reduced model size.
- Low-Rank Attention Mechanisms: Tensor-based attention models improve efficiency in vision transformers (ViTs) and self-attention networks [109].
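The simplest version of this idea factorizes a single d_model x d_model projection (for example, a query projection) into two thin matrices, cutting its parameter count. The synthetic weight below is built to be exactly low rank; trained projections are only approximately so and typically need fine-tuning after truncation. All sizes are illustrative.

```python
import numpy as np

d_model, r = 768, 64
rng = np.random.default_rng(0)

# Synthetic projection with exactly low-rank structure (stand-in for a trained weight).
W = rng.standard_normal((d_model, r)) @ rng.standard_normal((r, d_model)) / d_model

# Truncated SVD gives the factorized layer: y = (x @ A) @ B instead of y = x @ W.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * s[:r]                                # absorb singular values into the first factor
B = Vt[:r]

x = rng.standard_normal((8, d_model))               # a small batch of token embeddings
y_full, y_low = x @ W, (x @ A) @ B

print('parameters:', W.size, 'vs', A.size + B.size)
print('relative output error:', np.linalg.norm(y_full - y_low) / np.linalg.norm(y_full))
```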
7.1.2. Federated Learning and Distributed AI
- Efficient Model Aggregation: Tensor factorization reduces the dimensionality of model updates, leading to faster communication in federated learning settings [111].
- Privacy-Preserving AI: Tensor-based representations enable secure and compressed data exchanges in privacy-sensitive applications such as healthcare and finance.
7.1.3. Explainable AI (XAI)
- Interpretable Neural Networks: Tensor factorization helps visualize and analyze neural network activations to understand model decisions [112].
- Bias Detection in AI: Tensor analysis can uncover hidden biases in machine learning models by analyzing multi-modal data distributions.
7.2. Tensor Decomposition in Neuroscience and Biomedical Engineering
7.2.1. Neuroimaging and Brain Signal Analysis
- Identifying Brain Networks: CP and Tucker decomposition are used to extract latent brain connectivity patterns from fMRI data.
- EEG Signal Classification: Tensor-based models improve classification accuracy in brain-computer interfaces (BCIs).
- Neurological Disease Diagnosis: Tensor factorization aids in detecting early markers of neurodegenerative diseases such as Alzheimer’s and Parkinson’s.
7.2.2. Personalized Medicine and Genomic Data Analysis
- Drug Discovery: Factorizing drug-response tensors helps in predicting personalized treatment outcomes [115].
- Multi-Omics Integration: Tensor methods combine genetic, transcriptomic, and proteomic data for a comprehensive understanding of diseases.
- Cancer Biomarker Identification: Decomposing patient gene expression tensors aids in identifying biomarkers for precision oncology.
7.3. Tensor Methods in Scientific Computing and Engineering
7.3.1. Computational Chemistry and Quantum Physics
- Quantum State Representation: Tensor network models provide efficient representations of quantum many-body states.
- Density Matrix Factorization: Low-rank tensor approximations help reduce computational complexity in quantum chemistry simulations.
- Quantum Machine Learning: Tensor-based learning methods optimize quantum circuit designs [117].
7.3.2. Climate Science and Geospatial Data Analysis
- Weather Prediction: Tensor factorization improves climate models by identifying temporal-spatial patterns.
- Remote Sensing: Decomposing hyperspectral satellite images aids in land cover classification and environmental monitoring.
- Disaster Forecasting: Tensor-based anomaly detection identifies patterns in extreme weather events such as hurricanes and wildfires [118].
7.4. Challenges and Future Directions in Tensor Decomposition Research
7.4.1. Scalability and Computational Efficiency
- Developing faster, scalable algorithms that can handle massive tensors efficiently [119].
- Integrating tensor decomposition with distributed computing frameworks for cloud-based implementations [120].
- Exploring hardware acceleration (e.g., GPU, TPU, and quantum computing) to improve computational speed [121].
7.4.2. Automated and Adaptive Tensor Factorization
- Adaptive rank estimation techniques to dynamically adjust tensor decomposition models.
- Bayesian and probabilistic tensor methods for uncertainty quantification in factorization.
- Learning-based approaches that use neural networks to optimize tensor decomposition.
7.4.3. Robustness and Generalization in Real-World Data
- Enhancing robustness to noise and missing data in tensor decomposition applications.
- Developing generalizable tensor models that work across different domains and datasets.
- Investigating adversarial robustness of tensor-based models in security-sensitive applications.
7.4.4. Integration with Emerging AI Paradigms
7.5. Conclusions
8. Conclusions
8.1. Key Takeaways
- Mathematical Formulations: Different tensor factorization techniques, including CP, Tucker, TT, and TR decompositions, provide versatile frameworks for analyzing multi-way data [126].
- Computational Trade-offs: While tensor methods offer powerful insights, computational complexity remains a major hurdle, necessitating efficient approximation algorithms.
- Emerging Applications: Tensors play a critical role in modern AI, deep learning, biomedical engineering, quantum computing, and geospatial data analysis [127].
- Theoretical Advancements: Ongoing research in uniqueness conditions, rank estimation, and robustness to noise continues to refine the mathematical understanding of tensors.
8.2. Challenges and Future Directions
- Scalability and Efficiency: Handling large-scale tensors requires advanced parallel computing techniques and optimized hardware implementations.
- Automated Model Selection: Determining optimal tensor rank and structure remains an open problem, necessitating adaptive and probabilistic approaches.
- Robustness and Interpretability: Ensuring the reliability of tensor-based models in noisy and adversarial settings is crucial for real-world deployment.
- Interdisciplinary Integration: Bridging the gap between tensor methods and fields such as quantum computing, algebraic geometry, and neuroscience presents exciting research opportunities [128].
8.3. Final Remarks
References
- Sedighin, F.; Cichocki, A. Image completion in embedded space using multistage tensor ring decomposition. Frontiers in Artificial Intelligence 2021, 4, 687176. [Google Scholar] [CrossRef] [PubMed]
- Davidson, I.; Gilpin, S.; Carmichael, O.; Walker, P. Network discovery via constrained tensor analysis of fmri data. In Proceedings of the ACM SIGKDD; 2013. [Google Scholar]
- Phan, A.H.; Cichocki, A. Tensor decompositions for feature extraction and classification of high dimensional datasets. Nonlinear Theory and its Applications, IEICE 2010, 1, 37–68. [Google Scholar] [CrossRef]
- Lao, N.; Mitchell, T.; Cohen, W.W. Random walk inference and learning in a large scale knowledge base. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, 27–31 July 2011; pp. 529–539. [Google Scholar]
- Huckle, T.; Waldherr, K.; Schulte-Herbrüggen, T. Computations in quantum tensor networks. Linear Algebra and its Applications 2013, 438, 750–781. [Google Scholar] [CrossRef]
- Bondarenko, D.; Feldmann, P. Quantum autoencoders to denoise quantum data. Physical Review Letters 2020, 124, 130502. [Google Scholar] [CrossRef]
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015; pp. 1–9.
- Suchanek, F.M.; Kasneci, G.; Weikum, G. Yago: A Core of Semantic Knowledge. In Proceedings of the Proc. WWW; 2007. [Google Scholar]
- Kim, Y.; Park, E.; Yoo, S.; Choi, T.; Yang, L.; Shin, D. Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications. In Proceedings of the ICLR; Bengio, Y.; LeCun, Y., Eds. 2016. [Google Scholar]
- Yang, X.; Gao, M.; Pu, J.; Nayak, A.; Liu, Q.; Emberton Bell, S.; Setter, J.O.; Cao, K.; Ha, H.; Kozyrakis, C.; et al. DNN Dataflow Choice Is Overrated. arXiv e-prints 2018.
- Weston, J.; Bordes, A.; Yakhnenko, O.; Usunier, N. Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction. In Proceedings of the Conference on Empirical Methods in Natural Language Processing; 2013; pp. 1366–1371. [Google Scholar]
- Park, C.; Lu, Y.; Saha, S.; Xue, T.; Guo, J.; Mojumder, S.; Apley, D.W.; Wagner, G.J.; Liu, W.K. Convolution hierarchical deep-learning neural network (c-hidenn) with graphics processing unit (gpu) acceleration. Computational Mechanics 2023, 72, 383–409. [Google Scholar] [CrossRef]
- Bordes, A.; Usunier, N.; García-Durán, A.; Weston, J.; Yakhnenko, O. Translating Embeddings for Modeling Multi-relational Data. In Proceedings of the NeurIPS; 2013; pp. 2787–2795. [Google Scholar]
- Vidal, G. Entanglement Renormalization: an introduction. In Understanding Quantum Phase Transitions; Carr, L.D., Ed.; Taylor & Francis, Boca Raton, 2010.
- Beylkin, G.; Mohlenkamp, M.J. Numerical operator calculus in higher dimensions. Proceedings of the National Academy of Sciences 2002, 99, 10246–10251. [Google Scholar] [CrossRef]
- Vervliet, N.; Debals, O.; De Lathauwer, L. Tensorlab 3.0—Numerical optimization strategies for large-scale constrained and coupled matrix/tensor factorization. In Proceedings of the 2016 50th Asilomar Conference on Signals, Systems and Computers; IEEE, 2016; pp. 1733–1738. [Google Scholar]
- Yang, K.; Wang, S.; Zhou, J.; Yoshimura, T. Energy-efficient scheduling method with cross-loop model for resource-limited CNN accelerator designs. In Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS), May 2017; pp. 1–4. [CrossRef]
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
- Lebedev, V.; Ganin, Y.; Rakhuba, M.; Oseledets, I.; Lempitsky, V. Speeding-up convolutional neural networks using fine-tuned CP-decomposition. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015-Conference Track Proceedings 2015. [Google Scholar]
- Zhou, Y.; Lentz, E.; Michelson, H.; Kim, C.; Baylis, K. Machine learning for food security: Principles for transparency and usability. Applied Economic Perspectives and Policy 2022, 44, 893–910. [Google Scholar] [CrossRef]
- Tung, F.; Mori, G. Similarity-Preserving Knowledge Distillation. In Proceedings of the ICCV. IEEE; 2019; pp. 1365–1374. [Google Scholar]
- Phan, A.H.; Sobolev, K.; Ermilov, D.; Vorona, I.; Kozyrskiy, N.; Tichavsky, P.; Cichocki, A. How to Train Unstable Looped Tensor Network. arXiv 2022, arXiv:2203.02617.
- Carroll, J.D.; Chang, J.J. Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart-Young” decomposition. Psychometrika 1970, 35, 283–319. [Google Scholar] [CrossRef]
- Harshman, R. Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multimodal factor analysis. UCLA Working Papers in Phonetics 1970, 16, 1–84. [Google Scholar]
- Denil, M.; Shakibi, B.; Dinh, L.; Ranzato, M.; De Freitas, N. Predicting parameters in deep learning. In Proceedings of the Advances in neural information processing systems; 2013; pp. 2148–2156. [Google Scholar]
- Oseledets, I. DMRG approach to fast linear algebra in the TT-format. Computational Methods in Applied Mathematics 2011, 11, 382–393. [Google Scholar] [CrossRef]
- Mizutani, E.; Dreyfus, S.E.; Nishio, K. On derivation of MLP backpropagation from the Kelley-Bryson optimal-control gradient formula and its application. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000); IEEE, 2000; Vol. 2, pp. 167–172.
- LeCun, Y.; Chopra, S.; Hadsell, R.; Ranzato, M.; Huang, F. A tutorial on energy-based learning. Predicting Structured Data 2006, 1. [Google Scholar]
- Cheng, S.; Wang, L.; Xiang, T.; Zhang, P. Tree tensor networks for generative modeling. Physical Review B 2019, 99, 155131. [Google Scholar] [CrossRef]
- Yunpeng, C.; Xiaojie, J.; Bingyi, K.; Jiashi, F.; Shuicheng, Y. Sharing Residual Units Through Collective Tensor Factorization in Deep Neural Networks. arXiv 2017, arXiv:1703.02180.
- Jose, C.; Cissé, M.; Fleuret, F. Kronecker Recurrent Units. In Proceedings of the ICML; 2018. [Google Scholar]
- Tucker, L.R. Some mathematical notes on three-mode factor analysis. Psychometrika 1966, 31, 279–311. [Google Scholar] [CrossRef]
- Deoras, A.; Kombrink, S.; et al. Empirical evaluation and combination of advanced language modeling techniques 2011.
- Deng, J.; Berg, A.; Satheesh, S.; Su, H.; Khosla, A.; Fei-Fei, L. ImageNet large scale visual recognition competition 2012 (ilsvrc2012), 2012.
- Pan, Y.; Xu, J.; Wang, M.; Ye, J.; Wang, F.; Bai, K.; Xu, Z. Compressing Recurrent Neural Networks with Tensor Ring for Action Recognition. In Proceedings of the AAAI; 2019. [Google Scholar]
- Kiers, H.A. Towards a standardized notation and terminology in multiway analysis. Journal of Chemometrics: A Journal of the Chemometrics Society 2000, 14, 105–122. [Google Scholar] [CrossRef]
- Antol, S.; Agrawal, A.; Lu, J.; Mitchell, M.; Batra, D.; Zitnick, C.L.; Parikh, D. Vqa: Visual question answering. In Proceedings of the ICCV; 2015. [Google Scholar]
- Oseledets, I.V. Tensor-Train decomposition. SIAM Journal on Scientific Computing 2011, 33, 2295–2317. [Google Scholar] [CrossRef]
- Baez, J.; Stay, M. Physics, topology, logic and computation: a Rosetta Stone. In New Structures for Physics; Springer, 2010; pp. 95–172. [Google Scholar]
- Moreau, T.; Chen, T.; Vega, L.; Roesch, J.; Yan, E.; Zheng, L.; Fromm, J.; Jiang, Z.; Ceze, L.; Guestrin, C.; et al. A Hardware–Software Blueprint for Flexible Deep Learning Specialization. IEEE Micro 2019, 39, 8–16. [Google Scholar] [CrossRef]
- Suzuki, M. Generalized Trotter’s formula and systematic approximants of exponential operators and inner derivations with applications to many-body problems. Communications in Mathematical Physics 1976, 51, 183–190. [Google Scholar] [CrossRef]
- Zhao, Q.; Zhou, G.; Xie, S.; Zhang, L.; Cichocki, A. Tensor ring decomposition. arXiv 2016, arXiv:1606.05535.
- Liu, D.; Yang, L.T.; Wang, P.; Zhao, R.; Zhang, Q. TT-TSVD: A Multi-modal Tensor Train Decomposition with Its Application in Convolutional Neural Networks for Smart Healthcare. TOMM 2022, 18, 1–17. [Google Scholar] [CrossRef]
- Ma, X.; Zhang, P.; Zhang, S.; Duan, N.; Hou, Y.; Zhou, M.; Song, D. A tensorized transformer for language modeling. Advances in neural information processing systems 2019, 32. [Google Scholar]
- De Lathauwer, L.; Castaing, J.; Cardoso, J.F. Fourth-order cumulant-based blind identification of underdetermined mixtures. IEEE Transactions on Signal Processing 2007, 55, 2965–2973. [Google Scholar] [CrossRef]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Proceedings of the Advances in neural information processing systems; 2012; pp. 1097–1105. [Google Scholar]
- Lloyd, S.; Schuld, M.; Ijaz, A.; Izaac, J.; Killoran, N. Quantum embeddings for machine learning. arXiv 2020, arXiv:2001.03622.
- Hou, M.; Tang, J.; Zhang, J.; Kong, W.; Zhao, Q. Deep multimodal multilinear fusion with high-order polynomial pooling. NeurIPS 2019. [Google Scholar]
- Comon, P. Tensors: a brief introduction. IEEE Signal Processing Magazine 2014, 31, 44–53. [Google Scholar] [CrossRef]
- Zhu, J.; Jiang, J.; Chen, X.; Tsui, C. SparseNN: An energy-efficient neural network accelerator exploiting input and output sparsity. In Proceedings of the Design, Automation & Test in Europe Conference & Exhibition (DATE), March 2018; pp. 241–244. [CrossRef]
- Zhe, S.; Qi, Y.; Park, Y.; Xu, Z.; Molloy, I.; Chari, S. DinTucker: Scaling Up Gaussian Process Models on Large Multidimensional Arrays. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 2386–2392.
- Wang, B.; Ren, Y.; Shang, L.; Jiang, X.; Liu, Q. Exploring Extreme Parameter Compression for Pre-Trained Language Models. In Proceedings of the ICLR; 2022. [Google Scholar]
- Dahl, G.E.; Sainath, T.N.; Hinton, G.E. Improving deep neural networks for LVCSR using rectified linear units and dropout. In Proceedings of the ICASSP; 2013. [Google Scholar]
- Auer, S.; Bizer, C.; Kobilarov, G.; Lehmann, J.; Ives, Z. DBpedia: A Nucleus for a Web of Open Data. In Proceedings of the Proc. ISWC; 2007; pp. 11–15. [Google Scholar]
- Kolda, T.G.; Bader, B.W. Tensor decompositions and applications. SIAM review 2009, 51, 455–500. [Google Scholar] [CrossRef]
- Socher, R.; Chen, D.; Manning, C.D.; Ng, A.Y. Reasoning With Neural Tensor Networks for Knowledge Base Completion. In Proceedings of the NeurIPS; 2013; pp. 926–934. [Google Scholar]
- Zhu, J. Max-margin nonparametric latent feature models for link prediction. In Proceedings of the 29th International Conference on Machine Learning; Omnipress, 2012; pp. 1179–1186.
- Cover, T.M. Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition. IEEE Transactions on Electronic Computers, 1965, EC-14, 326–334. [CrossRef]
- Lao, N.; Cohen, W.W. Relational Retrieval Using a Combination of Path-constrained Random Walks. Machine Learning 2010, 81, 53–67. [Google Scholar] [CrossRef]
- Liu, Y.; Ng, M.K. Deep neural network compression by Tucker decomposition with nonlinear response. Knowledge-Based Systems 2022.
- Wang, Q.F.; Cambria, E.; Liu, C.L.; Hussain, A. Common sense knowledge for handwritten chinese text recognition. Cognitive Computation 2013, 5, 234–242. [Google Scholar] [CrossRef]
- Hrinchuk, O.; Khrulkov, V.; Mirvakhabova, L.; Orlova, E.; Oseledets, I. Tensorized embedding layers for efficient model compression. arXiv 2019, arXiv:1901.10787.
- Werbos, P.J. Backpropagation through time: what it does and how to do it. Proceedings of the IEEE 1990, 78, 1550–1560. [Google Scholar] [CrossRef]
- Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the CVPR; 2017. [Google Scholar]
- Cichocki, A. Tensor Networks for Dimensionality Reduction, Big Data and Deep Learning. In Advances in Data Analysis with Computational Intelligence Methods; Springer, 2018; pp. 3–49.
- Novikov, A.; Podoprikhin, D.; Osokin, A.; Vetrov, D.P. Tensorizing neural networks. In Proceedings of the NeurIPS; 2015. [Google Scholar]
- Chen, S.; Zhou, J.; Sun, W.; Huang, L. Joint Matrix Decomposition for Deep Convolutional Neural Networks Compression. arXiv 2021, arXiv:2107.04386.
- Sobolev, K.; Ermilov, D.; Phan, A.H.; Cichocki, A. PARS: Proxy-Based Automatic Rank Selection for Neural Network Compression via Low-Rank Weight Approximation. Mathematics 2022, 10, 3801. [Google Scholar] [CrossRef]
- Viebke, A.; Memeti, S.; Pllana, S.; Abraham, A. Chaos: a parallelization scheme for training convolutional neural networks on intel xeon phi. The Journal of Supercomputing 2017, pp. 1–31.
- Xu, Z.; Yan, F.; Qi, Y.A. Infinite Tucker Decomposition: Nonparametric Bayesian Models for Multiway Data Analysis. In Proceedings of the 29th International Conference on Machine Learning (ICML 2012), Edinburgh, Scotland, UK, 26 June–1 July 2012.
- Tu, F.; Yin, S.; Ouyang, P.; Tang, S.; Liu, L.; Wei, S. Deep Convolutional Neural Network Architecture With Reconfigurable Computation Patterns. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2017, 25, 2220–2233. [Google Scholar] [CrossRef]
- Bengio, Y.; Boulanger-Lewandowski, N.; Pascanu, R. Advances in optimizing recurrent networks. In Proceedings of the ICASSP. IEEE; 2013. [Google Scholar]
- Lu, L.; Jin, P.; Pang, G.; Zhang, Z.; Karniadakis, G.E. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature machine intelligence 2021, 3, 218–229. [Google Scholar] [CrossRef]
- Schuch, N.; Wolf, M.M.; Verstraete, F.; Cirac, J.I. Computational complexity of projected entangled pair states. Physical Review Letters 2007, 98, 140506. [Google Scholar] [CrossRef] [PubMed]
- Chen, J.; Cheng, S.; Xie, H.; Wang, L.; Xiang, T. Equivalence of restricted Boltzmann machines and tensor network states. Physical Review B 2018, 97, 085104. [Google Scholar] [CrossRef]
- You, Y.; Buluç, A.; Demmel, J. Scaling deep learning on GPU and knights landing clusters. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2017.
- Do, T.; Do, T.T.; Tran, H.; Tjiputra, E.; Tran, Q.D. Compact trilinear interaction for visual question answering. In Proceedings of the ICCV; 2019. [Google Scholar]
- Huang, S.; Xu, Z.; Lv, J. Adaptive local structure learning for document co-clustering. Knowl.-Based Syst. 2018, 148, 74–84. [Google Scholar] [CrossRef]
- Kossaifi, J.; Bulat, A.; Tzimiropoulos, G.; Pantic, M. T-Net: Parametrizing Fully Convolutional Nets With a Single High-Order Tensor. In Proceedings of the CVPR; 2019. [Google Scholar]
- Li, S.; Pan, F.; Zhou, P.; Zhang, P. Boltzmann machines as two-dimensional tensor networks. Physical Review B 2021, 104, 075154. [Google Scholar] [CrossRef]
- Zhang, P.; Su, Z.; Zhang, L.; Wang, B.; Song, D. A quantum many-body wave function inspired language modeling approach. In Proceedings of the CIKM; 2018. [Google Scholar]
- Cao, X.; Wei, X.; Han, Y.; Lin, D. Robust face clustering via tensor decomposition. IEEE transactions on cybernetics 2014, 45, 2546–2557. [Google Scholar] [CrossRef]
- OpenAI. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774.
- Zniyed, Y.; Nguyen, T.P.; et al. Efficient tensor decomposition-based filter pruning. Neural Networks 2024, 178, 106393. [Google Scholar]
- Håstad, J. Tensor rank is NP-complete. Journal of Algorithms 1990, 11, 644–654. [Google Scholar] [CrossRef]
- Denton, E.L.; Zaremba, W.; Bruna, J.; LeCun, Y.; Fergus, R. Exploiting linear structure within convolutional networks for efficient evaluation. In Proceedings of the Advances in neural information processing systems; 2014; pp. 1269–1277. [Google Scholar]
- Wang, D.; Wu, B.; Zhao, G.S.; Chen, H.; Deng, L.; Yan, T.; Li, G. Kronecker CP Decomposition with Fast Multiplication for Compressing RNNs. IEEE Trans. NNLS 2021, PP. [Google Scholar] [CrossRef]
- Nakajima, S.; Tomioka, R.; Sugiyama, M.; Babacan, S.D. Perfect Dimensionality Recovery by Variational Bayesian PCA. In Proceedings of the NeurIPS; 2012. [Google Scholar]
- Molchanov, P.; Mallya, A.; Tyree, S.; Frosio, I.; Kautz, J. Importance Estimation for Neural Network Pruning. In Proceedings of the CVPR; 2019. [Google Scholar]
- Liu, J.; Luo, J.; Shah, M. Recognizing realistic actions from videos “in the wild”. In Proceedings of the 2009 IEEE conference on computer vision and pattern recognition. IEEE; 2009; pp. 1996–2003. [Google Scholar]
- Qi, G.; Sun, Y.; Gao, J.; Hu, Y.; Li, J. Matrix variate restricted Boltzmann machine. In Proceedings of the IJCNN; 2016. [Google Scholar]
- Torlai, G.; Timar, B.; Van Nieuwenburg, E.P.; Levine, H.; Omran, A.; Keesling, A.; Bernien, H.; Greiner, M.; Vuletić, V.; Lukin, M.D.; et al. Integrating neural networks with a quantum simulator for state reconstruction. Physical Review Letters 2019, 123, 230504. [Google Scholar] [CrossRef]
- Rodrigues, C.F.; Riley, G.; Luján, M. Exploration of Task-based Scheduling for Convolutional Neural Networks Accelerators Under Memory Constraints. In Proceedings of the ACM International Conference on Computing Frontiers (CF ’19), New York, NY, USA, 2019; pp. 366–372. [CrossRef]
- ten Berge, J.M. The typical rank of tall three-way arrays. Psychometrika 2000, 65, 525–532. [Google Scholar] [CrossRef]
- Chen, S.; Lyu, M.R.; King, I.; Xu, Z. Exact and stable recovery of pairwise interaction tensors. In Proceedings of the NeurIPS; 2013; pp. 1691–1699. [Google Scholar]
- Eisert, J. Entanglement and tensor network states. Modelling and Simulation 2013, 3. [Google Scholar]
- Sedighin, F.; Cichocki, A.; Yokota, T.; Shi, Q. Matrix and tensor completion in multiway delay embedded space using tensor train, with application to signal reconstruction. IEEE Signal Processing Letters 2020, 27, 810–814. [Google Scholar] [CrossRef]
- Chen, Y.; Yang, T.; Emer, J.; Sze, V. Eyeriss v2: A Flexible Accelerator for Emerging Deep Neural Networks on Mobile Devices. IEEE Journal on Emerging and Selected Topics in Circuits and Systems 2019, 9, 292–308. [Google Scholar] [CrossRef]
- Rendle, S. Factorization machines. In Proceedings of the ICDM; 2010. [Google Scholar]
- Kak, S.C. Quantum neural computing. Advances in Imaging and Electron Physics 1995, 94, 259–313. [Google Scholar]
- Meurice, Y.; Osborn, J.C.; Sakai, R.; Unmuth-Yockey, J.; Catterall, S.; Somma, R.D. Tensor networks for High Energy Physics: contribution to Snowmass 2021. arXiv 2022, arXiv:2203.04902.
- Yin, M.; Liao, S.; Liu, X.; Wang, X.; Yuan, B. Towards Extremely Compact RNNs for Video Recognition With Fully Decomposed Hierarchical Tucker Structure. In Proceedings of the CVPR; 2021. [Google Scholar]
- Zhang, Z.; Allen, G.I.; Zhu, H.; Dunson, D. Tensor network factorizations: Relationships between brain structural connectomes and traits. Neuroimage 2019, 197, 330–343. [Google Scholar] [CrossRef]
- Jouppi, N.P.; Young, C.; Patil, N.; Patterson, D.; Agrawal, G.; Bajwa, R.; Bates, S.; Bhatia, S.; Boden, N.; Borchers, A.; et al. In-datacenter performance analysis of a tensor processing unit. In Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA), June 2017; pp. 1–12. [CrossRef]
- Liu, X.; Su, J.; Huang, F. Tuformer: Data-Driven Design of Expressive Transformer by Tucker Tensor Representation. In Proceedings of the ICLR; 2022. [Google Scholar]
- Chen, Y.; Emer, J.; Sze, V. Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks. In Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA), June 2016; pp. 367–379. [CrossRef]
- Han, S.; Liu, X.; Mao, H.; Pu, J.; Pedram, A.; Horowitz, M.A.; Dally, W.J. EIE: Efficient Inference Engine on Compressed Deep Neural Network. In Proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA), June 2016; pp. 243–254. [CrossRef]
- Parhi, K.K.; Unnikrishnan, N.K. Brain-inspired computing: Models and architectures. IEEE Open Journal of Circuits and Systems 2020, 1, 185–204. [Google Scholar] [CrossRef]
- Carrasquilla, J.; Torlai, G.; Melko, R.G.; Aolita, L. Reconstructing quantum states with generative models. Nature Machine Intelligence 2019, 1, 155–161. [Google Scholar] [CrossRef]
- Huggins, W.; Patil, P.; Mitchell, B.; Whaley, K.B.; Stoudenmire, E.M. Towards quantum machine learning with tensor networks. Quantum Science and technology 2019, 4, 024001. [Google Scholar] [CrossRef]
- Shashua, A.; Levin, A. Linear image coding for regression and classification using the tensor-rank principle. In Proceedings of the CVPR; IEEE, 2001; Vol. 1, pp. I-42. [Google Scholar]
- Liu, H.; Singh, P. ConceptNet – A Practical Commonsense Reasoning Tool-Kit. BT Technology Journal 2004, 22, 211–226. [Google Scholar] [CrossRef]
- Li, Q.; Wang, B.; Melucci, M. CNM: An Interpretable Complex-valued Network for Matching. In Proceedings of the NAACL; 2019. [Google Scholar]
- Valdez, F.; Melin, P. A review on quantum computing and deep learning algorithms and their applications. Soft Computing 2022, pp. 1–20.
- Han, S.; Mao, H.; Dally, W.J. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. arXiv 2015, arXiv:1510.00149.
- Tjandra, A.; Sakti, S.; Nakamura, S. Recurrent Neural Network Compression Based on Low-Rank Tensor Representation. IEICE Trans. Inf. Syst. 2020, 103-D, 435–449.
- Hubara, I.; Courbariaux, M.; Soudry, D.; El-Yaniv, R.; Bengio, Y. Binarized neural networks. In Proceedings of the NeurIPS; 2016; pp. 4107–4115. [Google Scholar]
- Novikov, A.; Podoprikhin, D.; Osokin, A.; Vetrov, D.P. Tensorizing neural networks. Advances in Neural Information Processing Systems 2015, 28. [Google Scholar]
- Chen, Y.; Luo, T.; Liu, S.; Zhang, S.; He, L.; Wang, J.; Li, L.; Chen, T.; Xu, Z.; Sun, N.; et al. DaDianNao: A Machine-Learning Supercomputer. In Proceedings of the IEEE/ACM International Symposium on Microarchitecture (MICRO), Dec 2014; pp. 609–622. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. In Proceedings of the ICCV; 2015. [Google Scholar]
- Arute, F.; Arya, K.; Babbush, R.; Bacon, D.; Bardin, J.C.; Barends, R.; Biswas, R.; Boixo, S.; Brandao, F.G.; Buell, D.A.; et al. Quantum supremacy using a programmable superconducting processor. Nature 2019, 574, 505–510. [Google Scholar] [CrossRef] [PubMed]
- Zhang, G.; Gheorghe, M.; Li, Y. A membrane algorithm with quantum-inspired subalgorithms and its application to image processing. Natural Computing 2012, 11, 701–717. [Google Scholar] [CrossRef]
- Xin, M.; Wang, Y. Research on image classification model based on deep convolution neural network. EURASIP J. Image Video Process. 2019, 2019, 40. [Google Scholar] [CrossRef]
- Zniyed, Y.; Nguyen, T.P.; et al. Enhanced network compression through tensor decompositions and pruning. IEEE Transactions on Neural Networks and Learning Systems 2024. [Google Scholar]
- Xu, Z.; Yan, F.; Qi, Y. Infinite Tucker decomposition: nonparametric Bayesian models for multiway data analysis. In Proceedings of the ICML; 2012. [Google Scholar]
- Kressner, D.; Tobler, C. htucker—A MATLAB toolbox for tensors in hierarchical Tucker format. Mathicse, EPF Lausanne 2012.
- Gao, H.; Sun, L.; Wang, J.X. PhyGeoNet: Physics-informed geometry-adaptive convolutional neural networks for solving parameterized steady-state PDEs on irregular domain. Journal of Computational Physics 2021, 428, 110079. [Google Scholar] [CrossRef]
- Fukui, A.; Park, D.H.; Yang, D.; Rohrbach, A.; Darrell, T.; Rohrbach, M. Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding. In Proceedings of the EMNLP; 2016. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
