Publications


2025 Publications

This list may not be up to date. Please see Google Scholar for my latest publications.



RAGEN (RL-Agent): Training Agents by Reinforcing Reasoning [Website][PDF][Code][Experimental Logs][tl;dr]
Zihan Wang*, Kangrui Wang*, Qineng Wang*, Pingyue Zhang*, Linjie Li*, Zhengyuan Yang, Kefan Yu, Minh Nhat Nguyen, Monica Lam, Yiping Lu, Kyunghyun Cho, Jiajun Wu, Li Fei-Fei, Lijuan Wang, Yejin Choi, Manling Li
Best Poster Award at MMLS 2025 (Midwest Machine Learning Symposium)
2.3k+ GitHub Stars, Featured by MIT Tech Review, Lambda Partner Spotlight, VentureBeat, Medium, AI News, MarkTechPost, Business Leaders Review, etc.

VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents [PDF][Blog][Code][tl;dr]
Kangrui Wang*, Pingyue Zhang*, Zihan Wang*, Yaning Gao*, Linjie Li*, Qineng Wang, Chi Wan, Hanyang Chen, Yiping Lu, Zhengyuan Yang, Lijuan Wang, Ranjay Krishna, Jiajun Wu, Li Fei-Fei, Yejin Choi, Manling Li
NeurIPS 2025
Featured by MIT Tech Review

Exploring Diffusion Transformer Designs via Grafting [Website][PDF][Blog][Code][tl;dr]
Keshigeyan Chandrasegaran*, Michael Poli*, Daniel Y. Fu, Dongjun Kim, Lea M. Hadzic, Manling Li, Agrim Gupta, Stefano Massaroli, Azalia Mirhoseini, Juan Carlos Niebles, Stefano Ermon, Li Fei-Fei
NeurIPS 2025 (Oral, Top 0.36%)

Spatial Mental Modeling from Limited Views [Website][PDF][Data][Code][tl;dr]
Qineng Wang*, Baiqiao Yin*, Pingyue Zhang, Jianshu Zhang, Kangrui Wang, Zihan Wang, Jieyu Zhang, Keshigeyan Chandrasegaran, Han Liu, Ranjay Krishna, Saining Xie, Jiajun Wu+, Li Fei-Fei+, Manling Li+
The Best of ICCV 2025, featured by Voxel51
Best Paper Award at ICCV 2025 Workshop on Structural Priors for Vision

ROSETTA: Constructing Code-Based Reward from Unconstrained Language Preference [Website][PDF][Data][Code][tl;dr]
Sanjana Srivastava*, Kangrui Wang*, Yung-Chieh Chan*, Tianyuan Dai, Manling Li, Ruohan Zhang, Mengdi Xu, Jiajun Wu, Li Fei-Fei
Best Paper Award at RSS 2025 Workshop on Continual Robot Learning from Humans

EmbodiedBench: Comprehensive Benchmarking Multi-modal Large Language Models for Vision-Driven Embodied Agents [Website][PDF][Code]
Rui Yang, Hanyang Chen, Junyu Zhang, Mark Zhao, Cheng Qian, Kangrui Wang, Qineng Wang, Teja Venkat Koripella, Marziyeh Movahedi, Manling Li, Heng Ji, Huan Zhang, Tong Zhang
ICML 2025 (Oral, Top 1%)

Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas [PDF][Code][Data]
Shiqi Chen, Tongyao Zhu, Ruochen Zhou, Jinghan Zhang, Siyang Gao, Juan Carlos Niebles, Mor Geva, Junxian He, Jiajun Wu, Manling Li
ICML 2025

Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging [Website][PDF][Code][Data]
Shiqi Chen, Jinghan Zhang, Tongyao Zhu, Wei Liu, Siyang Gao, Miao Xiong, Manling Li, Junxian He
ICML 2025

SyncMind: Measuring Agent Out-of-Sync Recovery in Collaborative Software Engineering [Website][PDF][Code][Data]
Xuehang Guo, Xingyao Wang, Yangyi Chen, Sha Li, Chi Han, Manling Li, Heng Ji
ICML 2025

ENACT: Evaluating Embodied Cognition with World Modeling of Egocentric Interaction [Website][PDF][Code][Data][tl;dr]
Qineng Wang*, Wenlong Huang*, Yu Zhou, Hang Yin, Tianwei Bao, Jianwen Lyu, Weiyu Liu, Ruohan Zhang, Jiajun Wu, Li Fei-Fei, Manling Li
arXiv

Internalizing World Models via Self-Play Finetuning for Agentic RL [PDF][Code][Data]
Shiqi Chen, Tongyao Zhu, Zian Wang, Jinghan Zhang, Kangrui Wang, Siyang Gao, Teng Xiao, Yee Whye Teh, Junxian He, Manling Li
arXiv

Theory of Space: Actively Constructing Spatial Beliefs in Foundation Models [PDF][Data]
Pingyue Zhang*, Zihan Huang*, Yue Wang*, Jieyu Zhang*, Letian Xue, Zihan Wang, Qineng Wang, Keshigeyan Chandrasegaran, Ruohan Zhang, Yejin Choi, Ranjay Krishna, Jiajun Wu, Li Fei-Fei, Manling Li
arXiv

Unary Feedback as Observation: A Simple “Try Again” Can Elicit Multi-Turn LLM Reasoning [PDF][Code][tl;dr]
Licheng Liu, Zihan Wang, Linjie Li, Chenwei Xu, Yiping Lu, Han Liu, Avirup Sil, Manling Li
arXiv

ODE-Steer: Activation Steering for LLM Alignment via a Unified ODE-Based Framework [Website][PDF][Code][tl;dr]
Hongjue Zhao, Haosen Sun, Jiangtao Kong, Xiaochang Li, Qineng Wang, Liwei Jiang, Qi Zhu, Tarek F. Abdelzaher, Yejin Choi, Manling Li+, Huajie Shao+
arXiv

ERA: Embodied Reasoning Agents via Reinforcement Learning [Website][PDF][Code][Data]
Hanyang Chen, Mark Zhao, Rui Yang, Qinwei Ma, Ke Yang, Jiarui Yao, Kangrui Wang, Hao Bai, Zhenhailong Wang, Rui Pan, Mengchao Zhang, Jose Barreiros, Aykut Onol, ChengXiang Zhai, Heng Ji, Manling Li, Huan Zhang, Tong Zhang
arXiv

SENTINEL: A Multi-Level Formal Framework for Safety Evaluation of LLM-based Embodied Agents [PDF]
Simon Sinong Zhan, Yao Liu, Philip Wang, Zinan Wang, Qineng Wang, Zhian Ruan, Xiangyu Shi, Xinyu Cao, Frank Yang, Kangrui Wang, Huajie Shao, Manling Li, Qi Zhu
arXiv

T*: Re-thinking Temporal Search for Long-Form Video Understanding [Website][PDF][Data][Code]
Jinhui Ye*, Zihan Wang*, Haosen Sun, Keshigeyan Chandrasegaran, Zane Durante, Cristobal Eyzaguirre, Yonatan Bisk, Juan Carlos Niebles, Ehsan Adeli, Li Fei-Fei, Jiajun Wu, Manling Li
CVPR 2025, Oral at ICCV 2025 Workshop on Long Multi-Scene Video Foundations

LayoutVLM: Differentiable Optimization of 3D Layout via Vision-Language Models [Website][PDF][Code]
Fan-Yun Sun, Weiyu Liu, Siyi Gu, Dylan Lim, Goutam Bhat, Federico Tombari, Manling Li, Nick Haber, Jiajun Wu
CVPR 2025

Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models [PDF]
Zhenyu Pan, Haozheng Luo, Manling Li, Han Liu
ICLR 2025

Visually Descriptive Language Modeling for Vector Graphics Reasoning [PDF][Website][Code]
Zhenhailong Wang, Joy Hsu, Xingyao Wang, Kuan-Hao Huang, Manling Li, Jiajun Wu, Heng Ji
TMLR

The Law of Knowledge Overshadowing: Towards Understanding, Predicting and Preventing LLM Hallucination [PDF]
Yuji Zhang, Sha Li, Cheng Qian, Jiateng Liu, Pengfei Yu, Chi Han, Yi Fung, Kathleen McKeown, ChengXiang Zhai, Manling Li, Heng Ji
ACL 2025 Findings

ACLED-DS: A Large Multilingual Expert-Annotated Abstractive Event Dataset for the Real World [PDF]
Sina Semnani, Pingyue Zhang, Wanyue Zhai, Haozhuo Li, Ryan Beauchamp, Trey Billing, Katayoun Kishi, Manling Li, Monica Lam
ACL 2025 Findings

Chain-of-Experts: Unlocking the Communication Power of MoEs [PDF][Blog][Code][tl;dr]
Zihan Wang, Rui Pan, Jiarui Yao, Róbert Csordás, Linjie Li, Lu Yin, Jiajun Wu, Tong Zhang, Manling Li, Shiwei Liu

Foundation Models Meet Embodied Agents [Website/Slides/Videos]
Manling Li, Yunzhu Li, Jiayuan Mao, Wenlong Huang
AAAI 2025: Tutorial
NAACL 2025: Tutorial
ICCV 2025: Tutorial