Song Wang

I am a Ph.D. student in the College of Computer Science and Technology at Zhejiang University, advised by Prof. Jianke Zhu. Before that, I obtained my B.Eng. from Zhejiang University, supervised by Prof. Zhiwei Xu and Prof. Hangfang Zhao. I am currently also a visiting Ph.D. student at xML-Lab, National University of Singapore, co-advised by Prof. Xinchao Wang.

I have broad research interests in computer vision and machine learning. My current research focuses on:

  • Post-training for multi-modal large language models (MLLMs), along with efficient and agentic reasoning.
  • Pre-training and parameter-efficient fine-tuning (PEFT) for multi-modal perception models.
  • Vision-centric autonomous driving, including occupancy networks and semantic map construction.

Various forms of academic collaboration and discussion are welcome. Feel free to reach out!

    Email  /  Google Scholar  /  GitHub

    Publications

    * indicates equal contribution

    PixelThink: Towards Efficient Chain-of-Pixel Reasoning
    Song Wang, Gongfan Fang, Lingdong Kong, Xiangtai Li, Jianyun Xu, Sheng Yang, Qiang Li, Jianke Zhu, Xinchao Wang
    arXiv, 2025
    [arXiv] [Code] [Project Page]
    Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps
    Sicheng Feng*, Song Wang*, Shuyi Ouyang, Lingdong Kong, Zikai Song, Jianke Zhu, Huan Wang, Xinchao Wang
    arXiv, 2025
    [arXiv] [Code] [Project Page] [中文解读]
    SAM4D: Segment Anything in Camera and LiDAR Streams
    Jianyun Xu*, Song Wang*, Ziqian Ni*, Chunyong Hu, Sheng Yang, Jianke Zhu, Qiang Li
    IEEE/CVF International Conference on Computer Vision (ICCV), 2025
    [arXiv] [Code] [Project Page]
    PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning
    Song Wang, Xiaolu Liu, Lingdong Kong, Jianyun Xu, Chunyong Hu, Gongfan Fang, Wentong Li, Jianke Zhu, Xinchao Wang
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
    [arXiv] [Code]
    Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning
    Hanxun Yu*, Wentong Li*, Song Wang, Junbo Chen, Jianke Zhu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR Highlight), 2025
    [arXiv] [Code]
    Uncertainty-Instructed Structure Injection for Generalizable HD Map Construction
    Xiaolu Liu*, Ruizi Yang*, Song Wang, Wentong Li, Junbo Chen, Jianke Zhu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
    [arXiv] [Code]
    Reliable and Calibrated Semantic Occupancy Prediction by Hybrid Uncertainty Learning
    Song Wang, Zhongdao Wang, Jiawei Yu, Wentong Li, Bailan Feng, Junbo Chen, Jianke Zhu
    International Joint Conference on Artificial Intelligence (IJCAI), 2025
    [arXiv] [Code]
    PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance
    Qijun Gan, Song Wang, Shengtao Wu, Jianke Zhu
    International Conference on Learning Representations (ICLR Spotlight), 2025
    [arXiv] [Code]
    TokenPacker: Efficient Visual Projector for Multimodal LLM
    Wentong Li*, Yuqian Yuan*, Jian Liu, Dongqi Tang, Song Wang, Jie Qin, Jianke Zhu, Lei Zhang
    International Journal of Computer Vision (IJCV), 2025
    [arXiv] [Code]
    Offboard Occupancy Refinement with Hybrid Propagation for Autonomous Driving
    Hao Shi*, Song Wang*, Jiaming Zhang, Xiaoting Yin, Zhongdao Wang, Zhijian Zhao, Guangming Wang, Jianke Zhu, Kailun Yang, Kaiwei Wang
    IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2025
    [arXiv] [Code]
    Label-efficient Semantic Scene Completion With Scribble Annotations
    Song Wang, Jiawei Yu, Wentong Li, Hao Shi, Kailun Yang, Junbo Chen, Jianke Zhu
    International Joint Conference on Artificial Intelligence (IJCAI), 2024
    [arXiv] [Code]
    Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation
    Song Wang, Jiawei Yu, Wentong Li, Wenyu Liu, Xiaolu Liu, Junbo Chen, Jianke Zhu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [arXiv] [Code]
    MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction
    Xiaolu Liu, Song Wang, Wentong Li, Ruizi Yang, Junbo Chen, Jianke Zhu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
    [arXiv] [Code]
    Domain Adaptation Transformer for Unsupervised Driving-Scene Segmentation in Adverse Conditions
    Wenyu Liu, Song Wang, Jianke Zhu, Xuansong Xie, Lei Zhang
    IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2024
    [Paper] [Code]
    DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction
    Siyu Li, Jiacheng Lin, Hao Shi, Jiaming Zhang, Song Wang, You Yao, Zhiyong Li, Kailun Yang
    IEEE Transactions on Intelligent Transportation Systems (T-ITS), 2024
    [arXiv] [Code]
    Label-efficient Segmentation via Affinity Propagation
    Wentong Li*, Yuqian Yuan*, Song Wang, Wenyu Liu, Dongqi Tang, Jian Liu, Jianke Zhu, Lei Zhang
    Conference on Neural Information Processing Systems (NeurIPS), 2023
    [arXiv] [Code] [Project Page]
    Point2Mask: Point-supervised Panoptic Segmentation via Optimal Transport
    Wentong Li, Yuqian Yuan, Song Wang, Jianke Zhu, Jianshu Li, Jian Liu, Lei Zhang
    IEEE/CVF International Conference on Computer Vision (ICCV), 2023
    [arXiv] [Code]
    LiDAR2Map: In Defense of LiDAR-Based Semantic Map Construction Using Online Camera Distillation
    Song Wang, Wentong Li, Wenyu Liu, Xiaolu Liu, Jianke Zhu
    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
    [arXiv] [Code]
    Meta-RangeSeg: LiDAR Sequence Semantic Segmentation Using Multiple Feature Aggregation
    Song Wang, Jianke Zhu, Ruixiang Zhang
    IEEE Robotics and Automation Letters (RA-L with IROS, IF: 5.2), 2022
    [arXiv] [Code]
    Honors and Awards

  • Chen Tianzhou Scholarship, Zhejiang University
  • China's Optics Valley Scholarship, Donghu New Technology Development Zone
  • Graduate with Merit A Performance, Zhejiang University
  • Third Prize in China Graduate AI Innovation Competition, Ministry of Education
  • Award of Honor for Graduate, Zhejiang University
  • Outstanding Academic Scholarship, Zhejiang University
  • Outstanding Undergraduate Award, Zhejiang University
  • Zhejiang Provincial Government Scholarship, Zhejiang Province
  • Zhongtian Technology First-Class Scholarship, ZTT Group
  • Zhejiang University Scholarship - Second Prize, Zhejiang University
  • First Prize in Advanced Mathematics Competition, Zhejiang Province

    Academic Services

  • Conference Reviewer: ICML 2025, IJCAI 2025, CVPR 2025, ICLR 2025, IROS 2025, NeurIPS D&B Track 2024-2025, NeurIPS 2024, ECCV 2024, ACM MM 2023-2024, ICRA 2024-2025, SynData4CV@CVPR 2024
  • Journal Reviewer: IEEE T-IP, IEEE T-ITS, IEEE T-CSVT, IEEE RA-L, IEEE T-IV


    © Song Wang | Last updated: June 28, 2025