Song Wang

I am a Ph.D. student in the College of Computer Science and Technology at Zhejiang University, advised by Prof. Jianke Zhu. Before that, I obtained my B.Eng. at Zhejiang University, supervised by Prof. Zhiwei Xu and Prof. Hangfang Zhao. I am currently also a visiting Ph.D. student at xML-Lab, National University of Singapore, co-advised by Prof. Xinchao Wang.

I have broad research interests in computer vision and machine learning. My current research focuses on:

Post-training for multi-modal large language models (MLLMs), along with efficient and agentic reasoning.

Pre-training and parameter-efficient fine-tuning (PEFT) for multi-modal perception models.

Vision-centric autonomous driving, including occupancy networks and semantic map construction.

Various forms of academic collaboration and discussion are welcome. Feel free to reach out!

Email / Google Scholar / GitHub

Publications

* indicates equal contribution

PixelThink: Towards Efficient Chain-of-Pixel Reasoning
Song Wang, Gongfan Fang, Lingdong Kong, Xiangtai Li, Jianyun Xu, Sheng Yang, Qiang Li, Jianke Zhu, Xinchao Wang
arXiv, 2025
[arXiv] [Code] [Project Page]

Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps
Sicheng Feng*, Song Wang*, Shuyi Ouyang, Lingdong Kong, Zikai Song, Jianke Zhu, Huan Wang, Xinchao Wang
arXiv, 2025
[arXiv] [Code] [Project Page] [Chinese Explainer]

SAM4D: Segment Anything in Camera and LiDAR Streams
Jianyun Xu*, Song Wang*, Ziqian Ni*, Chunyong Hu, Sheng Yang, Jianke Zhu, Qiang Li
IEEE/CVF International Conference on Computer Vision (ICCV), 2025
[arXiv] [Code] [Project Page]

PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning
Song Wang, Xiaolu Liu, Lingdong Kong, Jianyun Xu, Chunyong Hu, Gongfan Fang, Wentong Li, Jianke Zhu, Xinchao Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[arXiv] [Code]

Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning
Hanxun Yu*, Wentong Li*, Song Wang, Junbo Chen, Jianke Zhu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR Highlight), 2025
[arXiv] [Code]

Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction
Xiaolu Liu*, Ruizi Yang*, Song Wang, Wentong Li, Junbo Chen, Jianke Zhu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[arXiv] [Code]

Reliable and Calibrated Semantic Occupancy Prediction by Hybrid Uncertainty Learning
Song Wang, Zhongdao Wang, Jiawei Yu, Wentong Li, Bailan Feng, Junbo Chen, Jianke Zhu
International Joint Conference on Artificial Intelligence (IJCAI), 2025
[arXiv] [Code]

PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance
Qijun Gan, Song Wang, Shengtao Wu, Jianke Zhu
International Conference on Learning Representations (ICLR Spotlight), 2025
[arXiv] [Code]

TokenPacker: Efficient Visual Projector for Multimodal LLM
Wentong Li*, Yuqian Yuan*, Jian Liu, Dongqi Tang, Song Wang, Jie Qin, Jianke Zhu, Lei Zhang
International Journal of Computer Vision (IJCV), 2025
[arXiv] [Code]

Offboard Occupancy Refinement with Hybrid Propagation for Autonomous Driving
Hao Shi*, Song Wang*, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang
IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2025
[arXiv] [Code]

Label-efficient Semantic Scene Completion With Scribble Annotations
Song Wang, Jiawei Yu, Wentong Li, Hao Shi, Kailun Yang, Junbo Chen, Jianke Zhu
International Joint Conference on Artificial Intelligence (IJCAI), 2024
[arXiv] [Code]

Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation
Song Wang, Jiawei Yu, Wentong Li, Wenyu Liu, Xiaolu Liu, Junbo Chen, Jianke Zhu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[arXiv] [Code]

MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction
Xiaolu Liu, Song Wang, Wentong Li, Ruizi Yang, Junbo Chen, Jianke Zhu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[arXiv] [Code]

Domain Adaptation Transformer for Unsupervised Driving-Scene Segmentation in Adverse Conditions
Wenyu Liu, Song Wang, Jianke Zhu, Xuansong Xie, Lei Zhang
IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2024
[Paper] [Code]

DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction
Siyu Li, Jiacheng Lin, Hao Shi, Jiaming Zhang, Song Wang, You Yao, Zhiyong Li, Kailun Yang
IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2024
[arXiv] [Code]

Label-efficient Segmentation via Affinity Propagation
Wentong Li*, Yuqian Yuan*, Song Wang, Wenyu Liu, Dongqi Tang, Jian Liu, Jianke Zhu, Lei Zhang
Conference on Neural Information Processing Systems (NeurIPS), 2023
[arXiv] [Code] [Project Page]

Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport
Wentong Li, Yuqian Yuan, Song Wang, Jianke Zhu, Jianshu Li, Jian Liu, Lei Zhang
IEEE/CVF International Conference on Computer Vision (ICCV), 2023
[arXiv] [Code]

LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation
Song Wang, Wentong Li, Wenyu Liu, Xiaolu Liu, Jianke Zhu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
[arXiv] [Code]

Meta-RangeSeg: LiDAR Sequence Semantic Segmentation Using Multiple Feature Aggregation
Song Wang, Jianke Zhu, Ruixiang Zhang
IEEE Robotics and Automation Letters (RA-L with IROS, IF: 5.2), 2022
[arXiv] [Code]

Honors and Awards

Chen Tianzhou Scholarship, Zhejiang University

China's Optics Valley Scholarship, Donghu New Technology Development Zone

Graduate with Merit A Performance, Zhejiang University

Third Prize in China Graduate AI Innovation Competition, Ministry of Education

Award of Honor for Graduate, Zhejiang University

Outstanding Academic Scholarship, Zhejiang University

Outstanding Undergraduate Award, Zhejiang University

Zhejiang Provincial Government Scholarship, Zhejiang Province

Zhongtian Technology First-Class Scholarship, ZTT Group

Zhejiang University Scholarship - Second Prize, Zhejiang University

First Prize in Advanced Mathematics Competition, Zhejiang Province

Academic Services

Conference Reviewer: ICML 2025, IJCAI 2025, CVPR 2025, ICLR 2025, IROS 2025, NeurIPS D&B Track 2024-2025, NeurIPS 2024, ECCV 2024, ACM MM 2023-2024, ICRA 2024-2025, SynData4CV@CVPR 2024

Journal Reviewer: IEEE T-IP, IEEE T-ITS, IEEE T-CSVT, IEEE RA-L, IEEE T-IV
