
Qing Li 李庆
Email: dylan.liqing[at]gmail[dot]com
I am a research scientist and team lead at the Beijing Institute for General Artificial Intelligence (BIGAI), China. I received my Ph.D. in 2022 from the University of California, Los Angeles (UCLA), advised by Professor Song-Chun Zhu. During my Ph.D., I interned at Google Research, Microsoft Azure AI, and Amazon Alexa. Before UCLA, I received my Bachelor's degree in 2015 and my Master's degree in 2018 from the University of Science and Technology of China (USTC).
My long-term research goal is to develop a generalist agent that can perceive the 3D world, communicate with humans, and learn from feedback. To achieve this goal, I am currently interested in:
- Multimodal Understanding: Multimodal LLMs, 3D LLMs, Long-term Video Understanding
- Multimodal Agents: LLM Agents, Vision-Language-Action (VLA), Embodied Agents
Our team is actively recruiting full-time research scientists, engineers, and self-motivated interns. We are also seeking prospective Ph.D. students and long-term collaborators for TongProgram (通计划). Feel free to contact me if you are interested!
News
- 2025-06: Two papers were accepted by ICCV 2025! Check out these works: MTU3D (with perfect review scores) and Embodied VideoAgent.
- 2025-04: I was invited to serve as an Area Chair for NeurIPS 2025.
- 2025-03: Two papers were accepted by CVPR 2025!
- 2025-01: Two papers were accepted by ICLR 2025! Check out these works: Multimodal Knowledge Editing and Multimodal Agent Tuning (Spotlight).
- 2024-11: I was selected as a Top Reviewer of NeurIPS 2024.
Selected Publications
- Action Recognition by Learning Deep Multi-Granular Spatio-Temporal Video Representation. Best Paper Finalist. International Conference on Multimedia Retrieval, 2016.