Journal of Xidian University ›› 2025, Vol. 52 ›› Issue (3): 232-241. doi: 10.19665/j.issn1001-2400.20250103

• The 27th Annual Meeting of the China Association for Science and Technology: Network Technology Innovation in the AI Era •

Convolutional neural network model compression via the integrated multimethod approach

GUO Kaitai, LI Yuzhe, FU Donghao, ZHENG Yang, REN Shenghan, HU Haihong, LIANG Jimin

  1. School of Electronic Engineering, Xidian University, Xi’an 710071, China
  • Received: 2024-06-12  Online: 2025-06-20  Published: 2025-01-14

Abstract:

Convolutional neural networks demand substantial computational and storage resources in practical applications, which makes model compression essential for deploying them. However, a single compression technique often leads to degraded performance, reduced generalization ability, or increased computational complexity. To address these issues, this paper proposes a compression framework that integrates model pruning, knowledge distillation, and model quantization. The framework first prunes the model through sparsity training to remove redundant channels. It then uses the original model as the teacher network and applies knowledge distillation to guide the pruned student network, improving the performance of the compressed model. Finally, it applies model quantization to further optimize the compressed network and improve its applicability. The proposed method was tested on convolutional classification and object detection models. Experimental results show that the framework effectively reduces storage and computation requirements: across multiple test models, model size is reduced by more than 90% and inference speed increases by a factor of 3 to 4, while accuracy loss is kept within 2%. The proposed multi-method compression framework preserves the performance of convolutional neural networks while reducing model size and improving inference speed, making it suitable for efficient deployment in resource-constrained environments.

Key words: model compression, convolutional neural networks, model pruning, knowledge distillation, model quantization
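
The three stages named in the abstract (pruning, distillation, quantization) can be illustrated with a toy sketch. The abstract does not state the pruning criterion, the distillation loss, or the quantization scheme, so everything below is an assumption made for illustration: channels are ranked by the magnitude of a per-channel scaling factor, distillation uses a temperature-softened KL divergence, and quantization is symmetric per-tensor int8. All function names are hypothetical and are not taken from the paper.

```python
import math

def select_channels(scales, prune_ratio):
    """Assumed criterion: keep the channels with the largest |scaling factor|.

    Returns the sorted indices of the channels that survive pruning.
    """
    n_prune = int(len(scales) * prune_ratio)
    order = sorted(range(len(scales)), key=lambda i: abs(scales[i]))
    return sorted(order[n_prune:])

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Assumed loss: KL(teacher || student) on softened outputs, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

def quantize_int8(weights):
    """Assumed scheme: symmetric per-tensor int8. Returns (codes, scale)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Map int8 codes back to approximate floating-point weights."""
    return [c * scale for c in codes]

# Toy usage with made-up numbers (not data from the paper).
keep = select_channels([0.9, 0.01, 0.5, 0.002, 0.7, 0.03], prune_ratio=0.5)
print(keep)  # [0, 2, 4]
print(distillation_loss([2.5, 0.3, -1.0], [3.0, 0.5, -1.2]))
codes, scale = quantize_int8([1.0, -0.4, 0.03, 0.0])
print(codes, dequantize(codes, scale))
```

In this sketch the stages are independent functions; in the framework described above they run in sequence, with the unpruned model acting as the teacher for the pruned student before quantization is applied.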

CLC Number:

  • TP391