Keze Wang

Associate Professor

Email: kezewang@gmail.com

Address: D301A

Links: https://kezewang.com/

Brief Description

Dr. Keze Wang received his B.S. and Ph.D. degrees from Sun Yat-sen University in 2012 and 2017, respectively. He then pursued postdoctoral research at the University of California, Los Angeles (UCLA). In 2019, he was awarded a Doctor of Philosophy (Ph.D.) degree from The Hong Kong Polytechnic University.

Since his doctoral studies, Dr. Wang has been exploring how to reduce the dependence of deep learning on annotated training data and how to mine valuable information from massive unlabeled data. He proposed foundational paradigms such as “Guided–Self-paced–Collaborative Long-term Autonomous Learning” and introduced pseudo-labeling mechanisms. His research centers on deep representation learning that integrates domain knowledge and semantic information, advancing from perception to cognition, from supervised to self-supervised, and ultimately to autonomous learning. Through causal reasoning technologies inspired and guided by cognition, he has progressively established a theoretical and methodological system of visual computing and reasoning tailored for multimodal foundation models.

He has published over 50 papers in leading international journals and conferences, including Cell sub-journals, IEEE TPAMI, TNNLS, IJCV, TIP, TMM, and TCSVT, as well as CVPR and ICCV; more than 30 of these appeared in top-tier conferences. His publications have been cited more than 3,300 times on Google Scholar, with the most-cited single paper reaching 924 citations. Four of his papers have been recognized as ESI Highly Cited Papers. He is the recipient of the Wu Wenjun Artificial Intelligence Natural Science Award (Second Prize), the CAAI Outstanding Doctoral Dissertation Award, and the AI 2000 Most Influential Scholar Nominee Award from a renowned international academic evaluation institution.

 

Objective

Recently, AI technologies centered on foundation models have witnessed rapid development, achieving remarkable breakthroughs in language understanding, image recognition, and multimodal interaction. However, current foundation models still face significant limitations in generalization, interpretability, and reliability. These shortcomings not only hinder their deployment in broader real-world applications but also pose unprecedented challenges for the long-term advancement of artificial intelligence.

To address these challenges, our research is driven by both theoretical innovation and practical application. We are committed to developing novel learning paradigms and reasoning strategies for multimodal foundation models, aiming to achieve disruptive improvements. Our ultimate goal is to substantially enhance the universality and robustness of large models, enabling them not only to perform efficient cross-modal understanding and expression but also to acquire advanced human-like cognitive abilities such as causal discovery, logical reasoning, and analogical thinking. This will lay a solid foundation for AI to move toward a new stage of autonomous learning and self-evolution.

We firmly believe this is not only a scientific task but also a technological mission for the future: to propel AI from being a “powerful tool” to becoming a “wise partner,” ultimately benefiting humanity.

Selected Honors & Awards

2022 AI 2000 Most Influential Scholar Nominee Award

2020 Volunteer highlight of IEEE Transactions on Pattern Analysis and Machine Intelligence 

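2019 CAAI (Chinese Association for Artificial Intelligence) Outstanding Doctoral Dissertation Award (at most 10 awarded each year)

2018 Wu Wenjun Artificial Intelligence Natural Science Award (ranked second)

2016 National Scholarship for Doctoral Students (top 1%)

2015 National Scholarship for Doctoral Students (top 1%)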

Academic Services

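Area Chair for the top-tier international conference International Conference on Learning Representations (ICLR)

Area Chair for the top-tier international conference Association for Computational Linguistics (ACL)

Executive Editor of the well-known international journal Image and Vision Computing

Associate Editor of the well-known international journal The Visual Computer

Associate Editor of the well-known international journal The Journal of Visual Communication and Image Representation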

Reviewer for

– IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

– IEEE Transactions on Neural Networks and Learning Systems (TNNLS)

– Applied Soft Computing Journal

– IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

– IEEE Transactions on Image Processing (TIP)

– IEEE Transactions on Multimedia (TMM)

– Pattern Recognition (PR)

– Neural Networks

– Neurocomputing

– IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

– European Conference on Computer Vision (ECCV) 

– Neural Information Processing Systems (NeurIPS) 

– AAAI Conference on Artificial Intelligence (AAAI) 

– IEEE International Conference on Computer Vision (ICCV) 

– International Joint Conference on Artificial Intelligence (IJCAI) 

– IEEE International Conference on Robotics and Automation (ICRA) 

– Winter Conference on Applications of Computer Vision (WACV) 

– Asian Conference on Computer Vision (ACCV) 

– International Conference on Pattern Recognition (ICPR) 

Selected Publications

(*) denotes the corresponding author; (+) denotes equal contribution

[1] Jusheng Zhang, Kaitong Cai, Yijia Fan, Jian Wang, Keze Wang*. CF-VLM: CounterFactual Vision-Language Fine-tuning. To appear in NeurIPS 2025.

[2] Jusheng Zhang, Kaitong Cai, Yijia Fan, Ningyuan Liu, Keze Wang*. MAT-Agent: Learning to Dynamically Optimize Multi-Label Image Classification Training via Multi-Agent Collaboration. To appear in NeurIPS 2025.

[3] Jusheng Zhang, Yijia Fan, Zimo Wen, Jian Wang, Keze Wang*. Tri-MARF: A Tri-Modal Multi-Agent Responsive Framework for Comprehensive 3D Object Annotation. To appear in NeurIPS 2025.

[4] Jusheng Zhang, Yijia Fan, Wenjun Lin, Ruiqi Chen, Haoyi Jiang, Wenhao Chai, Jian Wang, Keze Wang*. GAM-Agent: Game-Theoretic and Uncertainty-Aware Collaboration for Complex Visual Reasoning. To appear in NeurIPS 2025.

[5] Jusheng Zhang, Yijia Fan, Kaitong Cai, Zimeng Huang, Xiaofei Sun, Jian Wang, Chengpei Tang, Keze Wang*. DrDiff: Dynamic Routing Diffusion with Hierarchical Attention for Breaking the Efficiency-Quality Trade-off. To appear in EMNLP 2025.

[6] Yijia Fan, Jusheng Zhang, Keze Wang*. Towards More Efficient Post-training via Fourier Domain Adapter Framework. To appear in EMNLP 2025 Findings.

[7] Yijia Fan, Jusheng Zhang, Kaitong Cai, Jing Yang, Keze Wang*. CCG: Rare-Label Prediction via Neural SEM–Driven Causal Game. To appear in EMNLP 2025 Findings.

[8] Jusheng Zhang, Yijia Fan, Kaitong Cai, Xiaofei Sun, Keze Wang*. OSC: Cognitive Orchestration through Dynamic Knowledge Alignment in Multi-Agent LLM Collaboration. To appear in EMNLP 2025 Findings.

[9] Ziyi Tang, Zechuan Chen, Jiarui Yang, Jiayao Mai, Yongsen Zheng, Keze Wang*, Jinrui Chen, Liang Lin. AlphaAgent: LLM-Driven Alpha Mining with Regularized Exploration to Counteract Alpha Decay. In KDD 2025.

[10] Jusheng Zhang, Zimeng Huang, Yijia Fan, Ningyuan Liu, Mingyan Li, Zhuojie Yang, Jiawei Yao, Jian Wang, Keze Wang*. KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems. In ICML 2025.

[11] Zeqing Wang, Qingyang Ma, Wentao Wan, Haojie Li, Keze Wang*, Yonghong Tian. Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body. In CVPR 2025.

[12] Wentao Wan, Zhuojie Yang, Yongcan Chen, Chenglin Luo, Ruilin Wang, Kehao Cai, Nan Kang, Liang Lin, Keze Wang*. SR-FoT: A Syllogistic-Reasoning Framework of Thought for Large Language Models Tackling Knowledge-based Reasoning Tasks. In AAAI 2025.

[13] Hefeng Wu, Hao Jiang, Keze Wang*, Ziyi Tang, Xianghuan He, Liang Lin. Improving Network Interpretability via Explanation Consistency Evaluation. In IEEE Transactions on Multimedia, 2024.

[14] Qingyi Liu, Jinhui Qin, Wenxuan Ye, Hao Mou, Yuxuan He, Keze Wang*. Adaptive Prompt Routing for Arbitrary Text Style Transfer with Pre-trained Language Models. In AAAI 2024.

[15] Linsheng Chen, Guangrun Wang, Liuchun Yuan, Keze Wang*, Ken Deng, Philip H.S. Torr. NeRF-VPT: Learning Novel View Representations with Neural Radiance Fields via View Prompt Tuning. In AAAI 2024.

[16] Hui Fu, Zeqing Wang, Ke Gong, Keze Wang, Tianshui Chen, Haojie Li, Haifeng Zeng, Wenxiong Kang. Mimic: Speaking Style Disentanglement for Speech-Driven 3D Facial Animation. In AAAI 2024.

[17] Lingling Li, Weicong Li, Qiyuan Ding, Chengpei Tang, Keze Wang*. Gesture Generation via Diffusion Model with Attention Mechanism. In ICASSP 2024.

[18] Junfan Lin, Keze Wang*, Ziliang Chen, Xiaodan Liang, Liang Lin. Towards Causality-Aware Inferring: A Sequential Discriminative Approach for Medical Diagnosis. IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), 2023.

[19] Xipeng Chen, Junzheng Zhang, Keze Wang, Pengxu Wei, Liang Lin. Multi-Person 3D Pose Estimation With Occlusion Reasoning. In IEEE Transactions on Multimedia, 2022.

[20] Yang Liu, Keze Wang*, Lingbo Liu, Haoyuan Lan, Liang Lin. TCGL: Temporal Contrastive Graph for Self-supervised Video Representation Learning. In IEEE Transactions on Image Processing (T-IP), 2022.

[21] Guangrun Wang, Keze Wang*, Guangcong Wang, Phillip HS Torr, Liang Lin. Solving Inefficiency of Self-supervised Representation Learning. In ICCV 2021.

[22] Arjun Akula+, Keze Wang+, Changsong Liu, Sari Saba-Sadiya, Hongjing Lu, Sinisa Todorovic, Joyce Chai, Song-Chun Zhu*. F-ToM: Explaining with Theory-of-Mind via Fault-Lines for Enhancing Human Trust in Image Recognition Models. In iScience, 2021.

[23] Yang Liu, Keze Wang*, Guanbin Li, Liang Lin. Semantics-aware Adaptive Knowledge Distillation for Sensor-to-Vision Action Recognition. To appear in IEEE Transactions on Image Processing (T-IP), 2021.

[24] Keze Wang, Liang Lin, Chenhan Jiang, Chen Qian, and Pengxu Wei. 3D Human Pose Machines with Self-supervised Learning. In IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), vol. 42, no. 5, pp. 1069–1082, 2020.

[25] Junfan Lin, Zhongzhan Huang, Keze Wang*, Xiaodan Liang, Weiwei Chen, Liang Lin. Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp. In Proc. of International Conference on Robotics and Automation (ICRA), 2021.

[26] Qingxing Cao, Bailin Li, Xiaodan Liang, Keze Wang, Liang Lin*. Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding. To appear in IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021.

[27] Guangrun Wang, Guangcong Wang, Keze Wang, Xiaodan Liang, Liang Lin. Grammatically Recognizing Images with Tree Convolution. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2020.

[28] Guangrun Wang, Keze Wang, Liang Lin. Adaptively Connected Neural Networks. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.

[29] Keze Wang, Liang Lin, Xiaopeng Yan, Ziliang Chen, Dongyu Zhang, Lei Zhang. Self-supervised Sample Mining with Switchable Selection Criteria for Object Detection. In IEEE Transactions on Neural Networks and Learning Systems (T-NNLS), vol. 30, no. 3, pp. 834–850, 2019.

[30] Yukai Shi, Guanbin Li, Qingxing Cao, Keze Wang, Liang Lin. Face hallucination by attentive sequence optimization with reinforcement learning. In IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), vol. 42, no. 11, pp. 2809–2824, 2019.

[31] Hefeng Wu, Yafei Hu, Keze Wang, Hanhui Li, Lin Nie, Hui Cheng. Instance-aware representation learning and association for online multiperson tracking. In Pattern Recognition, vol. 94, pp. 25–34, 2019.

[32] Keze Wang, Liang Lin, Chuangjie Ren, Wei Zhang, Wenxiu Sun. Convolutional Memory Blocks for Depth Data Representation Learning. In Proceedings of the 27th International Joint Conference on Artificial Intelligence and the 23rd European Conference on Artificial Intelligence (IJCAI), 2018.

[33] Keze Wang, Xiaopeng Yan, Dongyu Zhang, Lei Zhang, Liang Lin. Towards Human-Machine Cooperation: Self-supervised Sample Mining for Object Detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[34] Guanbin Li, Yuan Xie, Tianhao Wei, Keze Wang, Liang Lin. Flow Guided Recurrent Neural Encoder for Video Salient Object Detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.

[35] Liang Lin, Keze Wang, Deyu Meng, Wangmeng Zuo, and Lei Zhang. Active Self-Paced Learning for Cost-Effective and Progressive Face Identification. In IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), vol. 40, no. 1, pp. 7–19, 2018.

[36] Hui Cheng, Zhuoqi Zheng, Jinhao He, Chongyu Chen, Keze Wang, Liang Lin. Embedding Temporally Consistent Depth Recovery for Real-time Dense Mapping in Visual-inertial Odometry. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 693–698, 2018.

[37] Keze Wang, Dongyu Zhang, Liang Lin, Ya Li, and Ruimao Zhang. Cost-Effective Active Learning for Deep Image Classification. In IEEE Transactions on Circuits and Systems for Video Technology (T-CSVT), vol. 27, no. 12, pp. 2591–2600, 2017.

[38] Yukai Shi, Keze Wang, Chongyu Chen, Li Xu and Liang Lin. Structure-Preserving Image Superresolution via Contextualized Multi-task Learning. In IEEE Transactions on Multimedia (TMM), vol. 19, no. 12, pp. 2804–2815, 2017.

[39] Ziliang Chen, Keze Wang, Xiao Wang, Pai Peng, and Liang Lin. Deep Co-Space: Sample Mining Across Feature Transformation for Semi-supervised Learning. In IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2017.

[40] Mude Lin, Liang Lin, Xiaodan Liang, Keze Wang, and Hui Cheng. Recurrent 3D Pose Sequence Machines. In Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. (oral)

[41] Liang Lin, Keze Wang, Wangmeng Zuo, Meng Wang, Jiebo Luo, Lei Zhang. A Deep Structured Model with Radius–Margin Bound for 3D Human Activity Recognition. In International Journal of Computer Vision (IJCV), 118(2), 256-273, 2016.

[42] Keze Wang, Liang Lin, Jiangbo Lu, Chenglong Li, Keyang Shi. PISA: Pixelwise Image Saliency by Aggregating Complementary Appearance Contrast Measures with Edge-Preserving Coherence. In IEEE Transactions on Image Processing (TIP), 24(10), 3019-3033, 2015.

[43] Keze Wang, Shengfu Zhai, Hui Cheng, Xiaodan Liang, Liang Lin. Human Pose Estimation from Still Depth Image via Inference Embedded Multi-task Learning. In Proceedings of the ACM International Conference on Multimedia (ACM MM), 2016. (oral, full paper)

[44] Keze Wang, Liang Lin, Wangmeng Zuo, Shuhang Gu, Lei Zhang. Dictionary Pair Classifier Driven Convolutional Neural Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

[45] Keze Wang, Xiaolong Wang, Liang Lin, Meng Wang, Wangmeng Zuo. 3D human activity recognition with reconfigurable convolutional neural networks. In Proceedings of the ACM International Conference on Multimedia (ACM MM), pp. 97-106, 2014. (oral, full paper)

[46] Yukai Shi, Keze Wang, Li Xu, Liang Lin. Local and Holistic-Structure Preserving Image Super Resolution via Deep Joint Component Learning. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), 2016. (oral)

[47] Linnan Zhu, Keze Wang, Liang Lin, Lei Zhang. Learning a Lightweight Deep Convolutional Network for Joint Age and Gender Recognition. In Proceedings of the IEEE International Conference on Pattern Recognition (ICPR), 2016. (oral)

[48] Keyang Shi, Keze Wang, Jiangbo Lu, Liang Lin. Pisa: Pixelwise image saliency by aggregating complementary appearance contrast measures with spatial priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2115-2122, 2013.