Hello, welcome


I’m a Ph.D. student in Computer Science at ETH Zurich, advised by Prof. Niao He. Before that, I was a Ph.D. student in Computer Science at the University of Illinois Urbana–Champaign (UIUC), advised by Prof. Nan Jiang. I completed my B.E. in Computer Science at Beihang University.

My research primarily focuses on reinforcement learning (RL) and, more broadly, sequential decision-making under uncertainty. I work on understanding the fundamental mathematical principles underlying these problems and leveraging theoretical insights to develop efficient and practical algorithms. I’m particularly interested in bridging the gap between theory and practice: designing algorithms that come with theoretical guarantees and demonstrate strong empirical performance.

My previous research spans a broad spectrum of topics, including:

  • Reinforcement Learning from Human Feedback (RLHF): Developing methods to align AI with human preferences.
  • Multi-Agent Reinforcement Learning (MARL): Understanding learning efficiency in multi-agent systems.
  • Offline Reinforcement Learning: Advancing learning algorithms in the offline setting.

Contacts: Google Scholar   |   LinkedIn   |   GitHub   |   jiawei.huang [at] inf [dot] ethz [dot] ch



Research Highlights


Reinforcement Learning from Human Feedback

Sample efficiency is crucial in online RLHF. While previous works focus on strategic exploration for sample-efficient learning, we study the benefits of transfer learning from imperfect reward models.
  1. Preprint
    Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective
    Jiawei Huang, Bingcong Li, Christoph Dann, and Niao He
    Preprint 2025

MARL and Game Theory

Learning equilibrium policies in large-population systems is challenging in general. Our ICML 2024 paper studies a class of large-population games called Mean-Field Games (MFGs). Thanks to their special symmetric structure, we show that learning in MFGs is actually not much harder than single-agent RL.
  1. ICML 2024
    Model-Based RL for Mean-Field Games is not Statistically Harder than Single-Agent RL
    Jiawei Huang, Niao He, and Andreas Krause
    International Conference on Machine Learning, 2024
Agents' learning dynamics do not always lead to desirable outcomes. Our ICLR 2025 paper studies the steering setup, where agents' learning dynamics can be influenced by external steering rewards (e.g., financial subsidies from a government). We explore how to design these rewards to efficiently guide agents toward desired policies.
  1. ICLR 2025
    Learning to Steer Markovian Agents under Model Uncertainty
    Jiawei Huang, Vinzenz Thoma, Zebang Shen, Heinrich H. Nax, and Niao He
    International Conference on Learning Representations, 2025

Others

Early in my Ph.D., I explored various topics in single-agent online/offline RL. Motivated by practical constraints on policy switching, our ICLR 2022 paper introduces the deployment-efficient setup and develops efficient algorithms that match our established lower bounds.
  1. ICLR 2022 (Spotlight)
    Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
    Jiawei Huang, Jinglin Chen, Li Zhao, Tao Qin, Nan Jiang, and Tie-Yan Liu
    International Conference on Learning Representations, 2022