TY - JOUR
T1 - A min–max optimization framework for sparse multi-task deep neural network
AU - Guo, Jiacheng
AU - Li, Lei
AU - Sun, Huiming
AU - Qin, Minghai
AU - Yu, Hongkai
AU - Zhang, Tianyun
PY - 2025/10/14
Y1 - 2025/10/14
N2 - Multi-task learning is a subfield of machine learning in which a single shared model is trained to solve multiple tasks simultaneously. Instead of training multiple models, we only need to train one model with shared parameters to solve different tasks. By sharing parameters, multi-task learning significantly decreases the number of parameters and reduces computational and storage requirements. However, when applying multi-task learning to deep neural networks, model size remains a challenge, particularly for edge platforms. Compressing multi-task models while maintaining performance across all tasks is another significant challenge. To address these issues, we propose a min–max optimization framework for highly compressed multi-task deep neural network models, combined with weight pruning or dynamic sparse training strategies to improve training efficiency by reducing model parameters. Specifically, weight pruning leverages the reweighted l1 pruning method, enabling high pruning rates while preserving performance across all tasks. Dynamic sparse training, on the other hand, initializes the sparse masks of the network and updates them dynamically during training while maintaining the same number of nonzero weights, which encourages sparsity in the weight matrices and reduces the memory footprint and computational requirements. Our proposed min–max optimization framework automatically adjusts learnable weighting factors between tasks, ensuring that the worst-performing task is optimized. Experimental results on the NYUv2 and CIFAR-100 datasets demonstrate that the model incurs only minor performance degradation after pruning with the min–max framework. Further analyses indicate that the min–max framework performs reliably and that its difference from prior methods is statistically significant. The proposed dynamic sparse multi-task framework achieves an overall precision improvement of around 2% using min–max optimization compared with prior methods when the models are equally sparse.
AB - Multi-task learning is a subfield of machine learning in which a single shared model is trained to solve multiple tasks simultaneously. Instead of training multiple models, we only need to train one model with shared parameters to solve different tasks. By sharing parameters, multi-task learning significantly decreases the number of parameters and reduces computational and storage requirements. However, when applying multi-task learning to deep neural networks, model size remains a challenge, particularly for edge platforms. Compressing multi-task models while maintaining performance across all tasks is another significant challenge. To address these issues, we propose a min–max optimization framework for highly compressed multi-task deep neural network models, combined with weight pruning or dynamic sparse training strategies to improve training efficiency by reducing model parameters. Specifically, weight pruning leverages the reweighted l1 pruning method, enabling high pruning rates while preserving performance across all tasks. Dynamic sparse training, on the other hand, initializes the sparse masks of the network and updates them dynamically during training while maintaining the same number of nonzero weights, which encourages sparsity in the weight matrices and reduces the memory footprint and computational requirements. Our proposed min–max optimization framework automatically adjusts learnable weighting factors between tasks, ensuring that the worst-performing task is optimized. Experimental results on the NYUv2 and CIFAR-100 datasets demonstrate that the model incurs only minor performance degradation after pruning with the min–max framework. Further analyses indicate that the min–max framework performs reliably and that its difference from prior methods is statistically significant. The proposed dynamic sparse multi-task framework achieves an overall precision improvement of around 2% using min–max optimization compared with prior methods when the models are equally sparse.
KW - Dynamic sparse training
KW - Model compression
KW - Multi-task learning
KW - Weight pruning
UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105009879465&origin=inward
UR - https://www.scopus.com/inward/citedby.uri?partnerID=HzOxMe3b&scp=105009879465&origin=inward
U2 - 10.1016/j.neucom.2025.130865
DO - 10.1016/j.neucom.2025.130865
M3 - Article
SN - 0925-2312
VL - 650
JO - Neurocomputing
JF - Neurocomputing
M1 - 130865
ER -