I am currently a Ph.D. candidate at HMI Lab, NERCV²T, School of Computer Science, Peking University, supervised by Prof. Shanghang Zhang. Before that, I received my Bachelor's degree in Artificial Intelligence (Turing Honor Degree) from PKU, where I also obtained a Bachelor's degree in Economics.
My research interests lie in multimodal large language models, including visual foundation models, vision-language models, unified multimodal models, complex visual reasoning, visual model efficiency, and visual continual learning. The overall goal of my research is to develop a large-scale, efficient visual perception system with human-like expression, adaptation, and generalization, capable of fundamental perception, cognitive reasoning, and autonomous creativity.
More specifically, my research interests include:
Ph.D. Candidate in Computer Application Technology
Sep. 2023 -- Jun. 2028 (expected)
Peking University, Beijing, China
Bachelor of Intelligence Science and Technology & Economics (Dual Degree)
Sep. 2019 -- Jun. 2023
Peking University, Beijing, China
Intern at AI Lab (Model Efficiency & Unified Models)
Mar. 2024 -- Jun. 2026
ByteDance, Beijing, China
Intern in AGI (Memory Mechanism for MLLM)
Jul. 2023 -- Sep. 2023
BAAI, Beijing, China
Intern in Computer Vision (Autonomous Driving)
Sep. 2022 -- Feb. 2023
OPPO, Beijing, China
Intern at GCV Lab (Multimodal Learning)
Oct. 2021 -- Feb. 2022
BIGAI, Beijing, China