CS 496 Agent AI

Spring 2025

Time: Monday 2:00pm-4:50pm, Apr 1-Jun 7, 2025
Location: Technological Institute L160; over Zoom for some external talks, project presentations, and discussions
Instructor: Prof. Manling Li (Email: manling.li@northwestern.edu)
TA: Jiahao Yu (Email: jiahao.yu@northwestern.edu)
Instructor and TA Office Hours: Instructor office hours are Monday 9:00am-10:00am (in person) and may move to Zoom depending on travel schedules. TA office hours are Monday and Wednesday over Zoom (please contact the TA at jiahao.yu@northwestern.edu for details).
Course Google Folder: announced on Canvas.
Assignment Submission: on Canvas: https://canvas.northwestern.edu/courses/230363

Course Summary: This comprehensive course explores two major categories of AI agents: web-based agents that interact with digital environments and embodied agents that operate in physical spaces. Students will learn to design and implement both types of agents, understanding their unique challenges and capabilities, while mastering the integration of LLMs with various interaction modalities.

Prerequisites:

Introduction to Machine Learning
Python Programming
Basic Robotics or Computer Vision
Linear Algebra
Probability and Statistics

Students who complete this course will be able to:

Design web agents that navigate digital environments
Design embodied agents for physical interaction
Create robust perception and action systems
Control decision

Course Syllabus:

Week	Topic	Details
Week 1	Introduction to Agent AI (based on MDP)	Definition and Overview of Agents; Markov Decision Process (MDP); Agent Formulation based on MDP; Role of Large Language Models (LLMs); Goal Interpretation; State Estimation; MDP Policy; Reward Modeling; World Modeling
Week 2	LLM Agent	Agent Architectures; Self-supervised Finetuning; Reinforcement learning in agent control; Inference Time Scaling for agents; LLMs in Agent Learning; Agent - Memory; Agent - Tools; Agent - Planning
Week 3	Reasoning and Planning in Agent Models	Introduction to Reinforcement Learning (Version 0, 1, 2, 3, 4); PPO; DPO; GRPO
Week 4	Benchmarking and Evaluation	Evaluation Metrics: Task performance, generalization, and safety. Ethical considerations in agent evaluation. Benchmark Datasets and Tasks: Web agent benchmarks: WebArena, WebGPT, CrawlBot. Embodied benchmarks: BEHAVIOR, Habitat, ALFRED.
Week 5	Web Agents	Web-Based Environments for Agents: HTML, APIs, and web crawling. Examples of Web Agents. LLMs in Web Agents: Context understanding and dialogue generation; fine-tuning LLMs for domain-specific agents. Challenges: Multimodal web understanding.
Week 6	Embodied Agents	Foundations of Embodied Intelligence: What defines an embodied agent? Simulated Environments: OpenAI Gym, Habitat, BEHAVIOR, MuJoCo. Tasks for embodied agents (navigation, manipulation). Examples of Embodied Agents: Human-robot interaction; autonomous driving and drones. LLMs in Embodied Agents: Embedding reasoning and planning into embodied systems; hierarchical decision-making with LLMs. Challenges: Grounding language in physical environments.
Week 7	Embodied Agents Advanced Topics	Diffusion Models; Vision-Language-Action Models; Large World Models
Week 8	Multi-Agent Systems	Multi-agent collaboration and negotiation. Emergent behavior in multi-agent settings. Multi-Agent Planning: Task allocation and shared planning; auction-based and distributed algorithms. Examples: Swarm robotics, multi-user web agents.
Week 9	Ethics and Safety in Agent AI	Bias in agent decision-making. Social Norms in Agent AI. Trustworthiness: Ensuring transparency and accountability. Scaling agents for real-world applications.
Week 10	Final Project Presentations	Building a web agent with LLM integration. Creating embodied agents in simulated environments. Evaluating agent performance using real-world scenarios.

Grading:

Weekly Reading
- 20pts in total
- Submit a paragraph for one paper each week.
Mid-Term Exams
- 30pts in total
- Open-book exam with open-ended questions on three papers.
Term Project
- 50pts in total: 8pts project proposal (5pts report, 3pts lightning talk), 12pts mid-term project report (8pts report, 4pts milestone presentation), 30pts final project report (20pts report, 10pts presentation).
- The instructor will provide 10 topics for students to choose from. Students are expected to form their own teams, and each team should consist of 3-6 students. Everyone is encouraged to submit papers based on the term projects. By default, the project score is the same for all team members, but individual members may receive a higher or lower score than the team score based on individual performance, which is assessed in two ways: (1) contribution to the final deliverables (e.g., Git commits and the Final Project Report), and (2) the instructor's and TAs' assessment of the project presentations.

Manling Li

Assistant Professor

I study reasoning and planning in multimodal foundation models.