I'm a Ph.D. student at the New Laboratory of Pattern Recognition (NLPR), University of Chinese Academy of Sciences, advised by Prof. Yan Huang. I strongly believe in the power of interdisciplinary collaboration and its potential to drive impactful research. If you are interested in partnering on research projects or offering internship opportunities or exchange programs, I would be glad to connect with you.
My research interests cover Multimodal Large Language Models.
News
- 2026.01: One paper on Browser Agent was accepted by TMLR!
- 2025.11: Two papers on Multi-View Clustering and Deepfake were accepted by AAAI 2026!
- 2025.07: One technical report on Kwai Keye-VL was released!
- 2025.05: One paper on DPO (Direct Preference Optimization) was accepted by ICML 2025!
- 2025.02: One paper on GUI Agent was accepted by CVPR 2025!
- 2024.06: One paper on Knowledge Editing Benchmark was accepted by NeurIPS 2024 Datasets and Benchmarks Track!
- 2023.08: One paper on Mobile Agent was accepted by MobiCom 2024 Summer Round!
Publications

PaperX: A Unified Framework for Multimodal Academic Presentation Generation with Scholar DAG
Project
Tao Yu, Minghui Zhang, Zhiqing Cui, Hao Wang, Zhongtian Luo, Shenghua Chai, Junhao Gong, Yuzhao Peng, Yuxuan Zhou, Yujia Yang, Zhenghao Zhang, Haopeng Jin, Xinming Wang, Yufei Xiong, Jiabing Yang, Jiahao Yuan, Hanqing Wang, Hongzhu Yi, Yi-Fan Zhang, Yan Huang, Liang Wang

ShotFinder: Imagination-Driven Open-Domain Video Shot Retrieval via Web Search
Project
Tao Yu, Haopeng Jin, Hao Wang, Shenghua Chai, Yujia Yang, Junhao Gong, Jiaming Guo, Minghui Zhang, Xinlong Chen, Zhenghao Zhang, Yuxuan Zhou, Yufei Xiong, Shanbin Zhang, Jiabing Yang, Hongzhu Yi, Xinming Wang, Cheng Zhong, Xiao Ma, Zhang Zhang, Yan Huang, Liang Wang

BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions
Project
Tao Yu, Zhengbo Zhang, Zhiheng Lyu, Junhao Gong, Hongzhu Yi, Xinming Wang, Yuxuan Zhou, Jiabing Yang, Ping Nie, Yan Huang, Wenhu Chen

Aligning Multimodal LLM with Human Preference: A Survey
Project
Tao Yu, Yi-Fan Zhang, Chaoyou Fu, Junkang Wu, Jinda Lu, Kun Wang, Xingyu Lu, Yunhang Shen, Guibin Zhang, Dingjie Song, Yibo Yan, Tianlong Xu, Qingsong Wen, Zhang Zhang, Yan Huang, Liang Wang, Tieniu Tan
Others
- AAAI 2026, Improving Deepfake Detection with Reinforcement Learning-Based Adaptive Data Augmentation
  Yuxuan Zhou, Tao Yu, Wen Huang, Yuheng Zhang, Tao Dai, Shu-Tao Xia
- AAAI 2026, Dynamic Deep Graph Learning for Incomplete Multi-View Clustering with Masked Graph Reconstruction Loss
  Zhenghao Zhang, Jun Xie, Xingchen Chen, Tao Yu, Hongzhu Yi, Kaixin Xu, Yuanxiang Wang, Tianyu Zong, Xinming Wang, Jiahuan Chen, Guoqing Chao, Feng Chen, Zhepeng Wang, Jungang Xu
- Technical Report, Kwai Keye-VL Technical Report
  Kwai Keye Team
- ICML 2025, MM-RLHF: The Next Step Forward in Multimodal LLM Alignment
  Yi-Fan Zhang, Tao Yu, Haochen Tian, Chaoyou Fu, Peiyan Li, Jianshu Zeng, Wulin Xie, Yang Shi, Huanyu Zhang, Junkang Wu, Xue Wang, Yibo Hu, Bin Wen, Fan Yang, Zhang Zhang, Tingting Gao, Di Zhang, Liang Wang, Rong Jin, Tieniu Tan
- CVPR 2025, GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration
  Yuchen Sun, Shanhui Zhao, Tao Yu, Hao Wen, Samith Va, Mengwei Xu, Yuanchun Li, Chongyang Zhang
- NeurIPS 2024, VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark
  Han Huang, Haitian Zhong, Tao Yu, Qiang Liu, Shu Wu, Liang Wang, Tieniu Tan
- MobiCom 2024, AutoDroid: LLM-powered Task Automation in Android
  Hao Wen, Yuanchun Li, Guohong Liu, Shanhui Zhao, Tao Yu, Toby Jia-Jun Li, Shiqi Jiang, Yunhao Liu, Yaqin Zhang, Yunxin Liu
Honors and Awards
- 2024.12, National Scholarship (0.4%)
- 2023.12, National Encouragement Scholarship (0.4%)
- 2022.12, National Scholarship (0.4%)
- 2022.06, First Grade Scholarship (0.4%)
Education
- 2025.09 - Current, Ph.D. Student in Pattern Recognition and Intelligent Systems (Institute of Automation, Chinese Academy of Sciences)
- 2021.09 - 2025.06, Bachelor in Computer Science and Technology (School of Computer Science and Technology, Harbin Institute of Technology), GPA: 93.79/100 (Ranking: 2/135)
Internships
- 2025.01 - 2025.07, Multimodal Understanding and Application Group, Kuaishou Technology, China.
- 2024.03 - 2024.06, New Laboratory of Pattern Recognition (NLPR), Institute of Automation, China.
- 2023.05 - 2024.07, Institute for AI Industry Research (AIR), Tsinghua University, China.