A Collaborative Multi-Compression Acceleration Mechanism for Neural Networks in Keyword Spotting

Junbang Jiang; Rui Pu; Jin Li; Man Zhu

doi:10.20944/preprints202605.1063.v1

Submitted: 15 May 2026

Posted: 15 May 2026


Abstract

Keyword spotting models deployed on edge platforms face large model sizes, high computational cost, and limited deployment resources. To address these constraints, this study proposes a collaborative multi-compression acceleration framework for lightweight deployment. Built on an end-to-end convolutional neural network for keyword spotting, the framework integrates adaptive structured pruning, hardware-friendly mixed-precision dynamic quantization, and quantization-aware multi-stage knowledge distillation into a unified compression pipeline. To eliminate the influence of inconsistent training budgets and data partitions across compression branches, the results of quantization, pruning, distillation, and joint compression are re-evaluated under a unified protocol with multi-seed mean ± std reporting. Under this protocol, the retrained baseline reaches 97.13% ± 0.85. In the quantization branch, MPDQ achieves 95.78% ± 1.69 with a compression ratio of 9.56×, the most favorable balance between accuracy and storage efficiency in that branch. In the pruning branch, AIASP reaches 95.63% ± 0.67 at 30% sparsity with a compression ratio of 1.43×, a balanced compromise between accuracy retention and stability. In the distillation branch, PMKD, Multi-Teacher KD, and Fixed-T KD achieve 96.81% ± 0.69, 95.99% ± 1.18, and 96.70% ± 0.74, respectively, showing that the student model maintains strong recognition performance under approximately 4× structural compression. The final joint compression scheme reaches 96.16% ± 0.53 with a trade-off score of 4.26 at a compression ratio of 9.89×. These results indicate that the main advantage of collaborative multi-compression lies in a more balanced optimization among accuracy, model size, and compression efficiency under stringent deployment constraints.
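
The multi-seed mean ± std reporting and the compression ratio used throughout the abstract can be illustrated with a minimal sketch. The function names, the per-seed accuracies, and the byte counts below are hypothetical and are not taken from the paper. The sketch also assumes the sample standard deviation and a size-based compression ratio, since the abstract does not specify either definition.

```python
from statistics import mean, stdev


def summarize(accuracies):
    """Return (mean, sample standard deviation) of per-seed accuracies in percent.

    Assumption: sample std (n - 1 denominator); the paper may use another convention.
    """
    return mean(accuracies), stdev(accuracies)


def compression_ratio(baseline_bytes, compressed_bytes):
    """Assumed definition: baseline model size divided by compressed model size."""
    return baseline_bytes / compressed_bytes


# Hypothetical accuracies from three training runs with different seeds.
seed_accuracies = [96.0, 97.0, 98.0]
m, s = summarize(seed_accuracies)
report = f"{m:.2f}% ± {s:.2f}"

# Hypothetical model sizes in bytes.
ratio = compression_ratio(1000, 100)
```

Reporting every branch with the same seeds and the same summary function is what makes figures such as 97.13% ± 0.85 and 96.16% ± 0.53 comparable across compression methods.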

Keywords: keyword spotting; neural network; model compression; mixed-precision quantization; structured pruning; knowledge distillation

Subject: Computer Science and Mathematics - Artificial Intelligence and Machine Learning

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
