
Qing Li 李庆
Email: dylan.liqing[at]gmail[dot]com
I am a research scientist and team lead at the Beijing Institute for General Artificial Intelligence (BIGAI), China. I received my Ph.D. in 2022 from the University of California, Los Angeles (UCLA), advised by Professor Song-Chun Zhu. During my Ph.D., I interned at Google Research, Microsoft Azure AI, and Amazon Alexa. Before UCLA, I obtained my Bachelor's degree in 2015 and my Master's degree in 2018 from the University of Science and Technology of China (USTC).
My long-term research goal is to develop a generalist agent that can perceive the real world, communicate with humans, and learn from feedback. To achieve this goal, I currently focus on:
- AGI Agents: LLM Agents, Vision-Language-Action (VLA), Embodied Agents
- Multimodal Understanding: Vision-Language Modeling (VLM), 3D Visual Grounding, Long-term Video Understanding
- Machine Learning: Neural-Symbolic Learning, Continual Learning, In-Context Learning
Our team is actively recruiting full-time research scientists, engineers, and self-motivated interns. We are also seeking prospective Ph.D. students and long-term collaborators for TongProgram (通计划). Feel free to contact me if you are interested!
News
| 2025-06 | Two papers accepted by ICCV 2025! Check out these works: MTU3D (which received perfect review scores) and Embodied VideoAgent. |
| --- | --- |
| 2025-03 | Two papers accepted by CVPR 2025! |
| 2025-01 | Two papers accepted by ICLR 2025! Check out these awesome works: Multimodal Knowledge Editing and Multimodal Agent Tuning (Spotlight). |
| 2024-11 | I was selected as a Top Reviewer for NeurIPS 2024. |
| 2024-08 | 🔥🔥🔥 Three papers accepted by NeurIPS 2024! Check out these awesome works: FIRE, a dataset for feedback refinement of large multimodal models; UltraEdit, a large-scale (~4M) high-quality dataset for instruction-based image editing; and OmniJARVIS, a novel Vision-Language-Action (VLA) model for instruction following in Minecraft. |
Selected Publications
- Action Recognition by Learning Deep Multi-Granular Spatio-Temporal Video Representation. Best Paper Finalist. International Conference on Multimedia Retrieval, 2016