Haiquan Lu (卢海泉)
Biography
Hi! I am a final-year Computer Science undergraduate student at Nankai University, where I am fortunate to be advised by Prof. Ming-Ming Cheng and Prof. Xialei Liu in the Media Computing Lab (VCIP). I am working closely with Prof. Yaoqing Yang at Dartmouth College as a research intern, and with Prof. Michael Mahoney and Prof. Kurt Keutzer at UC Berkeley and ICSI.
My research focuses on Machine Learning and its real-world applications. I am currently working on improving the efficiency, transparency, and robustness of machine learning models, using high-dimensional diagnostics such as loss landscapes and weight matrix analysis. This line of work underlies AlphaPruning and SharpBalance.
Selected Publications
- AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models
Haiquan Lu, Yefan Zhou, Shiwei Liu, Zhangyang Wang, Michael W. Mahoney, Yaoqing Yang
AlphaPruning proposes an approach that analyzes the quality of individual model layers and automatically allocates compression budgets (such as sparsity levels or quantization precision) across transformer layers; it is applicable to LLaMA, OPT, Mistral, ViT, and ConvNeXt. Pruning LLaMA-7B to 80% sparsity yields a 3.06× end-to-end speedup on CPUs, and the method also improves structured/semi-structured pruning and mixed-precision quantization. A minimal sketch of the layer-wise allocation idea appears after the publication list.
- Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance
Haiquan Lu, Xiaotian Liu, Yefan Zhou, Qunli Li, Kurt Keutzer, Michael W. Mahoney, Yujun Yan, Huanrui Yang, Yaoqing Yang
We identify a sharpness-diversity trade-off: minimizing sharpness in the loss landscape tends to reduce the diversity of ensemble members, which in turn limits the improvement the ensemble gains over its individual models. We introduce SharpBalance, a new ensemble training method that addresses this trade-off. A toy illustration of the two quantities involved appears after the publication list.
- Early Preparation Pays Off: New Classifier Pre-tuning for Class Incremental Semantic Segmentation
Zhengyuan Xie, Haiquan Lu, Jia-wen Xiao, Enguang Wang, Le Zhang, Xialei Liu
Class incremental semantic segmentation aims to preserve old knowledge while learning new tasks, but it is impeded by catastrophic forgetting and background shift. Prior work highlights the importance of initializing new classifiers, focusing mainly on transferring knowledge from the background classifier or preparing classifiers for future classes, while neglecting the alignment and variance of the new classifiers. We propose a new classifier pre-tuning (NeST) method applied before the formal training process, which learns a transformation from the old classifiers to generate new classifiers for initialization, rather than directly tuning the parameters of the new classifiers. A toy sketch of this initialization idea is also given below.
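For AlphaPruning, the core allocation idea can be sketched as follows: score each layer by a heavy-tail exponent of its weight spectrum and map the scores to per-layer sparsity levels. This is a minimal sketch on a toy MLP; the simple Hill-style estimator, the `spread` parameter, and the linear alpha-to-sparsity mapping are illustrative assumptions, not the paper's exact fitting or allocation procedure.

```python
# Minimal sketch of layer-wise sparsity allocation from weight-spectrum statistics.
# Illustrative only: the alpha estimator (a Hill estimator on the top eigenvalues)
# and the linear alpha -> sparsity mapping are simplifying assumptions, not the
# exact procedure used in AlphaPruning.
import torch
import torch.nn as nn

def hill_alpha(weight: torch.Tensor, k: int = 20) -> float:
    """Rough power-law exponent of the weight matrix's eigenvalue tail."""
    W = weight.detach().float().reshape(weight.shape[0], -1)
    eigs = torch.linalg.svdvals(W) ** 2           # eigenvalues of W W^T
    eigs, _ = torch.sort(eigs, descending=True)
    k = min(k, eigs.numel() - 1)
    top = eigs[:k]
    # Hill estimator: alpha = 1 + k / sum(log(lambda_i / lambda_k))
    return 1.0 + k / torch.log(top / eigs[k]).sum().item()

def allocate_sparsity(model: nn.Module, target: float = 0.7,
                      spread: float = 0.2) -> dict:
    """Map each layer's alpha to a sparsity level around `target`.
    In this toy mapping, layers with heavier tails (smaller alpha) are pruned less."""
    alphas = {name: hill_alpha(m.weight)
              for name, m in model.named_modules() if isinstance(m, nn.Linear)}
    lo, hi = min(alphas.values()), max(alphas.values())
    return {name: target + spread * (2 * (a - lo) / (hi - lo + 1e-8) - 1)
            for name, a in alphas.items()}

model = nn.Sequential(nn.Linear(256, 512), nn.ReLU(),
                      nn.Linear(512, 512), nn.ReLU(),
                      nn.Linear(512, 64))
print(allocate_sparsity(model, target=0.7))
```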
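For SharpBalance, the sketch below only illustrates the two quantities in the trade-off: sharpness, measured here as the loss increase under a small random weight perturbation, and diversity, measured as prediction disagreement between two ensemble members. Both proxies (and the perturbation radius) are assumptions chosen for illustration, not the measures or the training procedure used in the paper.

```python
# Toy illustration of the two quantities in the sharpness-diversity trade-off.
# Both measures are simple proxies for illustration, not SharpBalance itself.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def sharpness(model: nn.Module, x, y, radius: float = 0.05) -> float:
    """Loss increase when weights are nudged by small random noise."""
    base = F.cross_entropy(model(x), y).item()
    perturbed = copy.deepcopy(model)
    with torch.no_grad():
        for p in perturbed.parameters():
            p.add_(radius * torch.randn_like(p) * p.norm() / (p.numel() ** 0.5))
    return F.cross_entropy(perturbed(x), y).item() - base

def diversity(models, x) -> float:
    """Fraction of inputs on which two ensemble members disagree."""
    preds = [m(x).argmax(dim=1) for m in models]
    return (preds[0] != preds[1]).float().mean().item()

torch.manual_seed(0)
x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))
ensemble = [nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
            for _ in range(2)]
print("sharpness:", [round(sharpness(m, x, y), 4) for m in ensemble])
print("diversity:", round(diversity(ensemble, x), 4))
```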
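For NeST, the sketch below conveys the initialization idea only: classifiers for newly added classes are generated by passing the old classifiers through a learned transformation instead of being randomly initialized. The shared linear map, the averaging step, and the symmetry-breaking noise are illustrative assumptions, not the actual NeST design.

```python
# Toy sketch of classifier pre-tuning: generate new-class classifiers from the
# old ones via a learned transformation, rather than random initialization.
# The shared linear map and the aggregation below are illustrative assumptions.
import torch
import torch.nn as nn

feat_dim, num_old, num_new = 256, 10, 3

old_cls = nn.Linear(feat_dim, num_old)        # frozen classifiers of old classes
transform = nn.Linear(feat_dim, feat_dim)     # learned weight-to-weight map

def generate_new_classifiers() -> torch.Tensor:
    """Map each old classifier vector through the transformation and
    aggregate to produce initial weights for the new classes."""
    mapped = transform(old_cls.weight)                    # (num_old, feat_dim)
    base = mapped.mean(dim=0, keepdim=True)               # aggregate old knowledge
    # Small noise breaks symmetry between the new classes.
    return base.repeat(num_new, 1) + 0.01 * torch.randn(num_new, feat_dim)

# Pre-tuning stage (sketch): optimize only `transform` on new-class data,
# then copy the generated weights into the incremental model's new classifier.
new_cls = nn.Linear(feat_dim, num_new)
with torch.no_grad():
    new_cls.weight.copy_(generate_new_classifiers())
```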