Hi there! I’m Manling. I am an assistant professor of Computer Science at Northwestern University, affiliated with the Center for Robotics and Biosystems and with Cognitive Science. I direct the Machine Learning and Language (MLL) Lab. I am also an Amazon Scholar working on conversational agents. Prior to this, I was a postdoc at Stanford University (2024), working mainly with my advisor Jiajun Wu and with Fei-Fei Li. I obtained my PhD at UIUC (2023), advised by Heng Ji and working closely with Shih-Fu Chang, Kyunghyun Cho, and Jiawei Han. I am a recipient of MIT Tech Review 35 Under 35 in 2025, the ACL Inaugural Best Dissertation Award Honorable Mention in 2025, DARPA Riser in 2022, EECS Rising Star in 2022, the Microsoft Research PhD Fellowship in 2021, etc. Our work on multimodal reasoning was recognized with the ACL'24 Outstanding Paper Award, ACL'20 Best Demo Paper Award, NAACL'21 Best Demo Paper Award, etc. I have served on the organizing committees of ACL'25 (virtual chairs), NAACL'25 (publication chairs), EMNLP'24 (demo chairs), etc.

Towards Spatial Intelligence: Architecting the Reasoning Interface for Embodied Foundation Agents

I aim to evolve AI models from passive observers into active, embodied agents by bridging the critical gap of Spatial Intelligence. Today's foundation models, trained on static data, are rich in semantic priors yet lack grounded interaction with the dynamic and partially observable physical world. We work on Reasoning and Planning at the intersection of Language, Vision, and Robotics to equip AI with an internal physics engine.

Seeing the unseen: Spatial Intelligence, Multimodality (Language + Vision/Robotics).
→ Work on moving from semantic-centric priors to spatial-geometric reasoning.

- Spatial Reasoning: MindCube, AdaptVis, Vector Graphics Reasoning, LayoutVLM.
- World Modeling: VAGEN within RL, Self-Play Agent before RL, Embodied Agent Interface with BDDL transition models.
- Long-Horizon multimodal intelligence: T* for temporal search, HourVideo, LM4Video, VideoArgument, VideoEvent.
- Multimodal alignment: CLIP-Event, VisualDecomposition, Knowledge-Driven Vision-Language Pretraining, VisualKnowledge, MuMuQA, M2E2, Image2Code.

Exploring the world: World Modeling and Foundation Agent Training.
→ Work on self-evolving agent training, from exploitation to exploration via world modeling.

- Foundation Agent Training: LLM Agents (RAGEN, Self-Play Agent), VLM Agents (VAGEN with World Model RL).
- Embodied Agents: Embodied Reasoning Agent for RL training, Embodied Agent Interface for LLM agents, EmbodiedBench for VLM agents, ROSETTA with human feedback, IKEA Manual at Work for 4D grounded planning.
- Planning and Compositionality: Planning Logic, Procedural Planning, Agriculture Task Planning, LLM Schema, Graph Schema, Generative Graph Schema, PathLM, Decomposition to Latent Text Prompt.

Certifying the reasoning: Mechanistic Interpretability, Safety, and Alignment.
→ Work on turning black boxes into safe, transparent, and trustworthy agents.

- Open up models to understand inner workings: Why is Spatial Reasoning Hard for VLMs?, Reasoning/Perception Merging, Exploring Diffusion Transformer Designs via Grafting.
- Physics of LLMs/VLMs: Knowledge Overshadow, Ripple Effect.
- Control and intervene in foundation models: LM-Steer, Hallu-Control, Deep Concept Injection.
- NLP + Human/Social (theory of mind, debiasing, propaganda, knowledge graph, information extraction): SmartBook, Explainable Fact Checking, Timeline Summarization, Meeting Summarization, GAIA IE, Info Propagation Pattern, DebiasPrompt, Multilingual KG, Object Hallucination.
- AI for Science, claim feasibility verification, figure/chart editing, paper/review generation: COVID-19 Claim Radar, COVID Knowledge Graph.

Prospective students: I have several PhD positions for Fall 2026, as well as intern positions.

>> We have an amazing new faculty member at NU working on robotics and foundation models; please check out Ruohan! He also has several PhD positions for Fall 2026.
>> If you are interested in a PhD at Northwestern, please apply to NorthwesternCS. Due to the large volume of emails, I apologize that I will not be able to reply to individual messages. (Please note that a non-reply does NOT indicate a lack of interest; it most likely means the email was missed or I unfortunately did not have time to check such emails before applications.) Please choose me as a potential advisor in the application, and I will check every application carefully in late Dec and do interviews in Jan-Mar.
>> If you are interested in doing a research internship with our group, please feel free to talk to any of the PhD students about joining their projects; many of them are looking for collaborators. The best way is to submit this form and drop an email to limanling.ai@gmail.com. This mailbox is checked more frequently.
>> Prospective_Students_English
>> Prospective_Students_Chinese
>> Proud of what our students have achieved in the very first year!

Junior PhD/master/undergraduate students: I will dedicate 30 minutes each week to offer guidance/suggestions/mentorship, especially for students from underrepresented groups or anyone in need. If you would like to chat about life, career paths, graduate school applications, or research ideas related to AI/ML, feel free to fill out the form to schedule a meeting.

Invited Talks

[2025/10] Talk at Columbia University NLP Seminar on "From Large Language Models to Large Agent Models: Reasoning Interface with World Modeling".
[2025/10] Talk at University of Pennsylvania NLP Seminar on "From Large Language Models to Large Agent Models: Reasoning Interface with World Modeling".
[2025/10] Talk at Capital One on "Conversational Agent Training".
[2025/10] Talk at Amazon on "Conversational Agent Training".
[2025/10] Interview by MIT Tech Review on DeepSeek-OCR.
[2025/10] Invited by Dreamforce 2025 as a panelist on the AI Research Faculty Panel.
[2025/10] Tutorial at ICCV 2025 on Foundation Models Meet Embodied Agents.
[2025/10] Tutorial at ICCV 2025 on Safe Multi-Modal Learning.
[2025/10] Talk at ICCV 2025 Workshop on Multimodal Reasoning for Agentic Intelligence (MMRAgI).
[2025/10] Talk at ICCV 2025 Workshop on Multimodal Spatial Intelligence (MUSI).
[2025/10] Talk at ICCV 2025 Workshop on Memory and Vision (MemVis).
[2025/10] Talk at ICCV 2025 Workshop on Long Multi-Scene Video Foundations (LongVid-Foundations) on "Re-thinking Temporal Search for Long-Form Video Understanding".
[2025/10] Talk at ICCV 2025 Workshop on Structural Priors for Vision (SP4V) for the Best Paper Award on Spatial Mental Modeling from Limited Views.
[2025/09] Talk at University of Edinburgh CHAI Seminar Series on "RAGEN: Training Agents by Reinforcing Reasoning".
[2025/09] Talk at Stanford University on "Theory of Space: How LLMs develop spatial internal beliefs".
[2025/09] Talk at Agentic AI Frontier Seminar on "RAGEN: Training Agents by Reinforcing Reasoning".
[2025/08] Talk at Agent AI Summit at UC Berkeley on "RAGEN: Training Agents by Reinforcing Reasoning".
[2025/08] Talk at Stanford University on "LLM Alignment with Control Barrier Functions".
[2025/07] Talk at Cross Future AI Summit on "See, Think, Act: Agent Training By Reinforcement Reasoning".
[2025/07] Talk at CVPR 2025 Workshop on Visual Concepts on "Why is Spatial Concept Learning Hard?".
[2025/07] Talk at Apple Workshop on Reasoning and Planning on "Training Agents with World Model Reasoning".
[2025/07] Talk at ACC 2025 Workshop on LLMs in Control Design and Decision Making on "LLMs for Embodied Decision Making".
[2025/06] Talk at Midwest Machine Learning Symposium 2025 on "RAGEN: Training Agents by Reinforcing Reasoning".
[2025/05] Talk at Google DeepMind on "RAGEN: Training Agents by Reinforcing Reasoning".
[2025/05] Talk at NAACL 2025 Workshop on Knowledge-Augmentation for Language Models and NLP Methods on "RAGEN: Training Agents by Reinforcing Reasoning".
[2025/05] Tutorial at NAACL 2025 on Foundation Models Meet Embodied Agents.
[2025/04] Talk at UIUC NLP Seminar on "RAGEN: Training Agents by Reinforcing Reasoning".
[2025/04] Guest Lecture at University of Michigan EECS 692 Advanced Artificial Intelligence on "Reasoning and Planning with Physical World".
[2025/02] Talk at AAAI 2025 New Faculty Highlights on "Agent Training Under a MDP Formulation".
[2025/02] Talk at AAAI 2025 Workshop on LM4Plan on "Agent Training Under a MDP Formulation".
[2025/02] Talk at AAAI 2025 Bridge on Foundation Models and Planning on "Agent Training Under a MDP Formulation".
[2025/02] Tutorial at AAAI 2025 on Foundation Models Meet Embodied Agents [Website/Slides].
[2025/02] Tutorial at AAAI 2025 on Lifecycle of Knowledge in LLMs: Memorization, Editing, and Beyond [Website/Slides].
[2024/12] Talk at SFU @ NeurIPS 2024 on "Embodied Agent Interface: LLMs and VLMs for Embodied Reasoning and Planning".
[2024/11] Talk at EMNLP 2024 Birds of a Feather on "LLMs for Embodied Agents".
[2024/11] Talk at EMNLP 2024 CustomNLP4U Workshop on "Customizing Large Language Models to Embodied Agents".
[2024/10] Talk at Adobe Research on "Chart Reasoning Agents".
[2024/09] Keynote at Amazon-Illinois Center on AI for Interactive Conversational Experiences Fall Research Symposium 2024 on "From Large Language Models to Large Agent Models".
[2024/09] Talk at 2024 Allerton Conference on Communication, Control, and Computing on "Reasoning and Planning with Physical World Knowledge".
[2024/08] Talk at TTIC Multimodal AI Workshop 2024 on "Embodied Agent Interface: LLMs for Embodied Decision Making".
[2024/08] Talk at Summer Institute in Computational Social Science 2024 on "Multimodal Knowledge for Social Good".
[2024/08] Tutorial at IJCAI 2024 on Beyond Human Creativity: A Tutorial on Advancements in AI Generated Content (AIGC) [Website/Slides].
[2024/07] Talk at SpLU-RoboNLP 2024 Workshop on "Reasoning, Planning and Compositionality in Multimodality".
[2024/07] Talk at Adobe Research on "Visually Descriptive Language Modeling for Document Intelligence".
[2024/06] Talk at Apple NLU Workshop 2024 on "From Large Language Models to Large Agent Models".
[2024/06] Talk at UIUC NLP Seminar on "From Words to Worlds: A Close Look to Diffusion Models (through an NLP Lens)".
[2024/05] Talk at Midwest Machine Learning Symposium 2024 on "The Missing Knowledge in LLMs to Interact with the Physical World".
[2024/05] Tutorial at NAACL 2025 on Foundation Models Meet Embodied Agents [Website/Slides].
[2023/12] Keynote at NeurIPS 2023 Workshop on New Frontiers in Graph Learning on "Beyond the Beaten Path: Exploring the Role of Graphs in Multimodal Foundation Models".
[2023/11] Tutorial at ICAIF 2023 on Large Language Models for NLP in Finance [Slides].
[2023/10] Talk at Stanford Vision and Learning Seminar on "LLMs for robotics: Modeling the Knowledge of the Physical World".
[2023/10] Talk at Adobe Research on "Knowledge Foundation Models".
[2023/06] Talk at Stanford CogAI on "Modeling the Semantics of the Physical World".
[2023/06] Tutorial at CVPR 2023 on Knowledge-Driven Vision-Language Encoding [Website/Slides/Videos] [Reading List].
[2023/04] Talk "Towards Factuality in Information Access: Multimodal Knowledge Acquisition and Reasoning" at UCLA ECE.
[2023/03] Talk "Towards Factuality in Information Access: Multimodal Knowledge Acquisition and Reasoning" at University of Virginia CS; MBZUAI; Washington University in St. Louis CS; University of Toronto CS+ECE; UC Davis CS.
[2023/02] Talk "Towards Factuality in Information Access: Multimodal Knowledge Acquisition and Reasoning" at Carnegie Mellon University LTI; Northwestern University CS; Northeastern University ECE; Purdue University CS; Rice University CS; Virginia Tech CS; Max Planck Institute; UC San Diego ECE.
[2023/02] Tutorial at AAAI 2023 on Knowledge-Driven Vision-Language Pretraining [Website/Slides/Videos] [Reading List].
[2022/10] Talk "From Entity-Centric to Event-Centric Multimodal Event Knowledge Acquisition" at EECS Rising Stars, University of Texas at Austin.
[2022/10] Talk "Towards Accurate Intelligent Analysis: Event-Centric Multimedia Knowledge Extraction" at DARPA Forward (Invite-Only).
[2022/10] Talk "Event-Centric Multimedia Data Understanding" at Ohio State University; Singapore Management University; George Mason University; North Carolina State University.
[2022/09] Talk "Multimedia Event Extraction: From Object-Centric to Event-Centric" at Virginia Tech.
[2022/08] Talk "Event Knowledge Graph Construction" at LOGS Graph Reasoning Seminar.
[2022/07] Tutorial at NAACL 2022 on New Frontiers of Information Extraction [Website/Slides/Videos] [PDF].
[2022/06] Talk "Event Graph Structures in Vision-Language Understanding" at DataFun.
[2022/04] Talk "Connecting Vision and Text using Event Structures" at NewsBreak.
[2022/02] Talk "Memories as Repositories of Events: Structural Event Knowledge Acquisition" at University of Notre Dame.
[2021/12] Talk "Comprehensive Event Understanding in Multimedia Data" at USC ISI.
[2021/11] Talk "Structural Event Knowledge Acquisition from Multimedia Data" at UIUC NLP Seminar.
[2021/11] Talk "Event Extraction and Reasoning in Multimedia News Data" at Microsoft Research.
[2021/08] Talk "Improving Visual Event and Argument Role Understanding with Contrastive Image-Language Pretraining" at Microsoft Research.
[2021/08] Tutorial at ACL 2021 on Event-centric Natural Language Understanding [Website/Slides/Videos] [PDF].
[2021/02] Tutorial at AAAI 2021 on Event-centric Natural Language Processing [Website/Slides/Videos] [PDF].
[2020/10] Talk "Fine-Grained Knowledge Extraction System from Multimedia Data" at ai.science.
[2020/05] Talk "Event Understanding and Narration for Multimedia Data" at Intel MDI Research Lab.

We have a community effort on Foundation Models meet Embodied Agents

Tutorials: Covering LLMs, VLMs, and VLAs, we give a comprehensive overview of the research based on the Markov Decision Process (MDP) framework.
Challenges: Co-hosted with the BEHAVIOR Challenge at NeurIPS 2025.
Workshops: We gather people from different communities, including NLP, Vision, and Robotics, to discuss the future of foundation models and embodied agents.

Publications

Academic Service

Organizing Committee
- ACL 2025 (Virtual Infrastructure Chairs)
- NAACL 2025 (Publication Chairs) summary
- EMNLP 2024 (Demonstration Chairs) summary
- Foundation Models Meet Embodied Agents @ CVPR 2025 workshop
- Towards Knowledgeable Foundation Models @ ACL 2025 workshop
- Towards Knowledgeable Foundation Models @ AAAI 2025 workshop
- Towards Knowledgeable Foundation Models @ ACL 2024 workshop
- Knowledge Discovery from Unstructured Data in Financial Services (KDF) @ SIGIR 2023 Workshop
Senior Area Chair: EMNLP (from 2025)
Area Chair: ACL (from 2023), EMNLP (from 2023), NAACL (from 2024), COLM (from 2025)
Program Committee: ARR (ACL Rolling Review, from 2021), ACL (from 2021), EMNLP (from 2021), NAACL-HLT (from 2021), AAAI (from 2021), WWW (from 2021), AKBC (2021), EACL (from 2021), KDD DI Workshop (2021), NLPCC (2021), COLING (from 2020), AACL (from 2020), CCL (from 2020)
Journal Reviewer: TACL, TIST, TKDE
Community Services
- ACL Student Research Workshop (SRW) Mentor, ACL, 2024
- ACM Mentor, ACM Mentorship Program at UIUC, 2022
- CS Ambassador, UIUC CS Visit Day for Prospective Graduate Students, 2022
- Advising Assistant, UIUC PhD Orientation Seminar, 2021
- Graduate Student Representative, UIUC CS Visit Day, 2020

Awards

Selected Awards

ACL Inaugural Best Dissertation Award Honorable Mention
MIT Tech Review 35 Under 35 Global List
Best Poster Award at Midwest Machine Learning Symposium 2025
Best Paper Award at RSS 2025 workshop on Continual Robot Learning from Humans
Best Paper Award at ICCV 2025 workshop on Structural Priors for Vision
Outstanding Paper Award at ACL 2024
Best Paper Award at SoCal NLP 2024
Best Demo Paper Award at NAACL 2021
Best Demo Paper Award at ACL 2020
AAAI 2025 New Faculty Highlights
Microsoft Research Postdoc Fellowship 2023
EECS Rising Star 2022
DARPA Riser 2022
Microsoft Research PhD Fellowship 2021
Mavis Future Faculty Fellow
C.L. and Jane Liu Award
National Scholarship

Academic and Scientific Competitions

Ranked 1st in NIST TAC Streaming Multimedia Knowledge Base Population (SM-KBP) 2020
Ranked 1st in NIST TAC Streaming Multimedia Knowledge Base Population (SM-KBP) 2019, with more than 10% absolute gains compared to the second-ranked team

Teaching

Instructor:

COMP_SCI 396 Reasoning and Planning in the Foundation Model Era
NU, Winter 2025
COMP_SCI 496 Agentic AI
NU, Spring 2025

Guest Lecturer:

Navigating Up and Down in the Job Searching
COMP_SCI 496 Academic Job Search
NU, Fall 2023
Event-Centric Multimedia Encoding
CS 6604 Advanced Topics in Natural Language Processing
Virginia Tech, Fall 2022
Knowledge-Driven Vision-Language Pretraining
CS 546 Advanced Topics in Natural Language Processing
UIUC, Fall 2022
Recent Advances in Multimedia Encoding
CS 546 Advanced Topics in Natural Language Processing
UIUC, Fall 2022
Timeline Summarization: Introducing Temporal Dimensions into Summarization
CS 598 Knowledge Driven Natural Language Generation
UIUC, Spring 2022
Multimedia Encoding via Vision-Language Pretraining
CS 546 Advanced Topics in Natural Language Processing
UIUC, Fall 2021

Students

It is a great pleasure to work with such talented young people. I am grateful for the trust that they have placed in me.
>> Proud of what our students have achieved in their very first year!

PhD Students

Visiting PhD Students

Shiqi Chen (2024-)
Lei (Max) Zhang (2025-)

Join Us!

MLL Lab is 1 year old now! We officially started in Oct 2024. We are growing and looking for more talented students and postdocs to join us!

Why Northwestern?

Northwestern University - A Prestigious Top 10 Institution

US News National University Rankings:

- 2026: #7 🏆
- 2025: #6 ⭐
- 2024: #9
- 2023: #10
- 2022: #9
- 2021: #9

Good Research

Just in 2025:

We have 7 faculty members named Sloan Research Fellows in 2025! This ties us with MIT as the university with the most faculty in the 2025 cohort.

Joel Mokyr won the Nobel Prize in economics in 2025! Another winner, Peter Howitt, received his PhD from Northwestern.

Fun Facts

Mascot: Cat 🐱

School Color: Purple 💜

Campus Card: Wildcard 🎫

Money on it: Cat Cash 😸

Backyard: Two private beaches for students and faculty 🏖️

Lake-view: Floor-to-ceiling windows staring straight at Lake Michigan 🌊

Right downstairs: 6 yoga studios, 4 Pilates studios, Whole Foods, Trader Joe's, Paris Baguette, etc. 🧘‍♀️

Wildcard Perks: Student discount pass to 300+ restaurants/stores in Evanston/Chicago (yes, really)

The personal touch: The only place that mailed me a customized gift after my faculty interview: a mug with my name in 20+ fonts, all in purple 💜

Manling Li