Hu Jianfang

Associate Professor

Email: hujf5@mail.sysu.edu.cn

Links: https://isee-ai.cn/~hujianfang/

Biography:

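Associate Professor and doctoral supervisor at Sun Yat-sen University (SYSU). He has published more than 60 papers in top international conferences (CVPR, ICCV, NeurIPS, etc.) and top-tier journals (IEEE TPAMI, etc.). His honors include the Guangdong Province Distinguished Young Scholars Fund, the SYSU Outstanding Supervisor award, the Second Prize of the Guangdong Province Science and Technology Award (Natural Science), the Outstanding Doctoral Dissertation Award of the China Society of Image and Graphics (only 4 awarded nationwide), and the Best Paper Award at the CVPR Workshop 2021 Person In Context Challenge. He also won first place in the cross-modal video-language grounding competition at the ACM MM 2022 Workshop and in the CVPR 2018 large-scale activity recognition challenge. His student Tan Chaolei, who graduated in 2024, received the SYSU Outstanding Master's Thesis Award (thesis title: Video Content Grounding Based on Natural Language Descriptions), one of only 4 recipients in the school. Most graduates have joined major companies or key state-owned enterprises such as Tencent, Alimama, iFLYTEK, China Southern Power Grid, China Mobile, and ByteDance, or have gone on to PhD study (at well-known universities abroad or within this group).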

 

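Research directions open to master's students entering in 2026: multimodal video understanding and grounding, controllable video generation/editing, 3D motion video generation, large-model fine-tuning, model attacks and security, continual learning (machine-learning oriented), vision-language-action (VLA) models, and other self-chosen directions. The VLA direction is pursued in collaboration with Professor Wei-Shi Zheng's robotics team.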

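Research directions open to PhD students entering in 2026: physics-guided video generation, and multimodal video understanding and grounding.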

 

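Requirements for master's students: strong programming skills, a good foundation in English (CET-4 and CET-6 passed), diligence, and self-motivation.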

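Requirements for PhD students: a solid research foundation and research ability, strong self-motivation, and a passion for research.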

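Supervision: joint supervision by the advisor, a senior group member from an overseas PhD program, and senior students in the group, with regular discussions 1-2 times per week. There are opportunities for research internships at leading companies and research teams.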

        

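The lab has ample computing resources. Students who will enroll in September 2026 (summer camp applicants, recommended-admission students, and prospective PhD students) are welcome to get in touch: send your CV to hujf5@mail.sysu.edu.cn and cc 2296495708@qq.com.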

 

 

Research Areas:

Computer vision, video content understanding, multimodal large models, smart ocean, etc.

Education:

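- September 2010 to December 2016: combined master's and PhD program, School of Mathematics, Sun Yat-sen University

- September 2015 to September 2016: visiting student, Nanyang Technological University, Singapore

- September 2006 to June 2010: undergraduate study in Mathematics and Applied Mathematics, Sun Yat-sen University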

Work Experience:

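- November 2019 to present: Associate Professor, School of Computer Science and Engineering, Sun Yat-sen University

- January 2017 to October 2019: Associate Research Fellow, School of Data and Computer Science, Sun Yat-sen University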

Awards and Honors:

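- 2022: Champion, cross-modal video-language grounding competition, ACM MM Workshop (top international conference)

- 2020: Second Prize, Guangdong Province Science and Technology Award (Natural Science)

- 2017: Outstanding Doctoral Dissertation Award, China Society of Image and Graphics

- 2017: Microsoft Research Asia StarTrack Program for young scholars

- 2018: Champion, MiniTrack, CVPR large-scale activity recognition challenge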

Research Projects:

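- National Natural Science Foundation of China (NSFC) General Program: Research on Cross-Modal Video-Text Matching and Generation, Principal Investigator

- Guangdong Basic and Applied Basic Research Foundation, Distinguished Young Scholars project: Research on Video Parsing for Complex Dynamic Scenes, Principal Investigator

- NSFC General Program: Research on Structured Feature Extraction from Multi-Source Video and Its Application to Behavior Analysis, Principal Investigator

- NSFC Young Scientists Fund: Research on Early Action Prediction Based on Heterogeneous Feature Learning from Multimodal Time Series, Principal Investigator

- CCF-Tencent Rhino-Bird Research Fund: Video Action Recognition and Intention Prediction Based on Multimodal Dynamic Fusion, Principal Investigator

- Industry-funded project (in collaboration with WeChat): Video Action Recognition Based on Dynamic Fusion Learning of Multi-Level Features, Principal Investigator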

Courses Taught:

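Undergraduate courses: Mathematical Analysis, Numerical Optimization

Graduate courses: Pattern Recognition, Optimization Theory and Methods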

Selected Publications:

(*) denotes the corresponding author on papers published with supervised students.

- Jianwei Tang (supervised master's student), Hong Yang, Tengyue Chen, and Jian-Fang Hu*, "Stochastic Human Motion Prediction with Memory of Action Transition and Action Characteristic", IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2025. [CCF A, top computer vision conference]

- Fuxing Liu (supervised master's student), Chaolei Tan, Xiaotong Lin, Yonggang Qi, Jinxuan Li, Jian-Fang Hu*, "SAUGE: Taming SAM for Uncertainty-Aligned Multi-Granularity Edge Detection", AAAI Conference on Artificial Intelligence (AAAI), 2025. [CCF A, top artificial intelligence conference]

- Tianming Liang (supervised PhD student), Linhui Li, Jian-Fang Hu*, Xiangyang Yu, Wei-Shi Zheng, and Jianhuang Lai, "Rethinking Temporal Context in Video-QA: A Comprehensive Study of Single-frame Static Bias", IEEE Transactions on Multimedia (TMM), 2024.

- Linhui Li (supervised master's student), Xiaotong Lin, Yejia Huang, Zizhen Zhang*, and Jian-Fang Hu*, "Beyond Minimum-of-N: Rethinking the Evaluation and Methods of Pedestrian Trajectory Prediction", IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT), 2024.

- Chaolei Tan (supervised master's student), Zihang Lin, Junfu Pu, Zhongang Qi, Wei-Yi Pei, Zhi Qu, Yexin Wang, Ying Shan, Wei-Shi Zheng, Jian-Fang Hu*, "SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses", ACM Multimedia (ACM MM), 2024. [CCF A, top multimedia conference]

- Xiaotong Lin (supervised master's student), Tianming Liang, Jianhuang Lai, and Jian-Fang Hu*, "Progressive Pretext Task Learning for Human Trajectory Prediction", European Conference on Computer Vision (ECCV), 2024. [CCF B, top computer vision conference]

- Tianming Liang (supervised PhD student), Chaolei Tan, Beihao Xia, Wei-Shi Zheng, Jian-Fang Hu*, "Ranking Distillation for Open-Ended Video Question Answering with Insufficient Labels", IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2024. [CCF A, top computer vision conference]

- Chaolei Tan (supervised master's student), Jianhuang Lai, Wei-Shi Zheng, Jian-Fang Hu*, "Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding", IEEE/CVF Computer Vision and Pattern Recognition (CVPR), 2024. [CCF A, top computer vision conference]

- Jianwei Tang (supervised master's student), Jiangxin Sun, Xiaotong Lin, Lifang Zhang, Wei-Shi Zheng, Jian-Fang Hu*, "Temporal Continual Learning with Prior Compensation for Human Motion Prediction", Conference on Neural Information Processing Systems (NeurIPS), 2023. [CCF A, top artificial intelligence conference]

- Zihang Lin (supervised master's student), Chaolei Tan, Jian-Fang Hu*, Zhi Jin, Tiancai Ye, and Wei-Shi Zheng, "Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023. [CCF A, top computer vision conference]

- Chaolei Tan (supervised master's student), Zihang Lin, Jian-Fang Hu*, Wei-Shi Zheng, and Jianhuang Lai, "Hierarchical Semantic Correspondence Networks for Video Paragraph Grounding", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023. [CCF A, top computer vision conference]

- Jiangxin Sun (supervised master's student), Chunyu Wang, Huang Hu, Hanjiang Lai, Zhi Jin, Jian-Fang Hu*, "You Never Stop Dancing: Non-freezing Dance Generation via Bank-constrained Manifold Projection", Conference and Workshop on Neural Information Processing Systems (NeurIPS), 2022. [CCF A, top artificial intelligence conference]

- Jiangxin Sun (supervised master's student), Zihang Lin, Xintong Han, Jian-Fang Hu*, Jia Xu, Wei-Shi Zheng, "Action-guided 3D Human Motion Prediction", Conference and Workshop on Neural Information Processing Systems (NeurIPS), 2021.

- Zihang Lin# (supervised master's student), Jiangxin Sun# (supervised first-year master's student), Jian-Fang Hu (*), Qizhi Yu, Wei-Shi Zheng, Jianhuang Lai, "Predictive Feature Learning for Future Segmentation Prediction", International Conference on Computer Vision (ICCV), 2021. [CCF A, one of the three top computer vision conferences]

- Xionghui Wang (supervised master's student), Jian-Fang Hu (*), Jianhuang Lai, Jianguo Zhang, and Wei-Shi Zheng, "Progressive Teacher-student Learning for Early Action Prediction", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019. [CCF A, top computer vision conference]

- Guoliang Pang (supervised master's student), Xionghui Wang (supervised master's student), Jian-Fang Hu (*), Qing Zhang, and Wei-Shi Zheng, "DBDNet: Learning Bi-directional Dynamics for Early Action Prediction", International Joint Conference on Artificial Intelligence (IJCAI), 2019. [CCF A, top artificial intelligence conference]

- Jiangxin Sun (supervised undergraduate student), Jiafeng Xie (supervised master's student), Jian-Fang Hu (*), Zihang Lin (supervised undergraduate student), Wei-Shi Zheng, Jianhuang Lai, and Wenjun Zeng, "Predicting Future Instance Segmentation with Contextual Pyramid ConvLSTMs", ACM Multimedia, 2019. [CCF A conference]

Jian-Fang Hu#, Jiangxin Sun#, Zihang Lin, Jianhuang Lai, Wenjun Zeng, and Wei-Shi Zheng(*), "APANet: Auto-Path Aggregation for Future Instance Segmentation Prediction", IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 2021. [CCF A, IF=17.73, the top journal in artificial intelligence and computer vision]

Jian-Fang Hu, Wei-Shi Zheng(*), Jianhuang Lai, and Jianguo Zhang, "Jointly Learning Heterogeneous Features for RGB-D Activity Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), 39(11): 2186-2200 (2017). [CCF A, IF=17.73, the top journal in artificial intelligence and computer vision]

Jian-Fang Hu, Wei-Shi Zheng(*), Lianyang Ma, Gang Wang, Jianhuang Lai, and Jianguo Zhang, "Early Action Prediction by Soft Regression", IEEE Transactions on Pattern Analysis and Machine Intelligence (IEEE TPAMI), August 3, 2018; DOI: 10.1109/TPAMI.2018.2863279. [CCF A, IF=17.73, the top journal in artificial intelligence and computer vision]

Jian-Fang Hu, Wei-Shi Zheng(*), Xiaohua Xie, and Jianhuang Lai, "Sparse Transfer for Facial Shape-from-Shading", Pattern Recognition (PR), 68(8): 272-285 (2017). [IF=3.965, CCF B journal paper]

Jian-Fang Hu, Wei-Shi Zheng(*), Jianhuang Lai, Shaogang Gong, and Tao Xiang, "Exemplar-based Recognition of Human-Object Interactions", IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 26(4): 647-660 (2016). [IF=3.599, CCF B journal paper]

Jian-Fang Hu, Wei-Shi Zheng(*), Jiahui Pan, Jianhuang Lai, and Jianguo Zhang, "Deep Bilinear Learning for RGB-D Action Recognition", European Conference on Computer Vision (ECCV), 2018. [one of the three top computer vision conferences]

Jian-Fang Hu, Wei-Shi Zheng(*), Jianhuang Lai, and Jianguo Zhang, "Jointly Learning Heterogeneous Features for RGB-D Activity Recognition", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 5344-5352 (2015). [CCF A, top computer vision conference]

Jian-Fang Hu, Wei-Shi Zheng(*), Jianhuang Lai, Shaogang Gong, and Tao Xiang, "Recognising Human-Object Interaction via Exemplar based Modelling", International Conference on Computer Vision (ICCV), 3144-3151 (2013). [CCF A, one of the three top computer vision conferences]

Jian-Fang Hu, Wei-Shi Zheng(*), Lianyang Ma, Gang Wang, and Jianhuang Lai, "Real-time RGB-D Activity Prediction by Soft Regression", European Conference on Computer Vision (ECCV), 280-296 (2016). [one of the three top computer vision conferences]