A single unified policy performs standing, walking, and running with smooth transitions between them, using a gait-conditioned reward routing mechanism and a multi-phase curriculum. Deployed in the real world on a Unitree G1.
A unified recurrent policy with gait encoding and a structured gait/velocity curriculum. Gait-aware losses reduce interference across modes and stabilize multi-gait learning.
The gait mask routes gait-specific objectives (e.g., contact, push-off, stillness) to the active gait only, while regulation terms remain shared across gaits, mitigating reward interference among the run, walk, and stand modes.
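A minimal sketch of how such reward routing might look, assuming a one-hot gait mask and a plain sum of reward terms. The function names, term names, and numeric values are hypothetical illustrations and are not taken from the paper; the actual reward terms and weighting may differ.

```python
# Hypothetical sketch of gait-masked reward routing.
# All names and values here are illustrative assumptions, not the paper's.
GAITS = ("stand", "walk", "run")


def gait_mask(gait):
    """Return a one-hot mask over GAITS for the commanded gait."""
    if gait not in GAITS:
        raise ValueError(f"unknown gait: {gait!r}")
    return {g: 1.0 if g == gait else 0.0 for g in GAITS}


def routed_reward(gait, gait_terms, shared_terms):
    """Combine shared terms with the active gait's specific terms.

    gait_terms:   dict mapping gait name -> {term name: value}
    shared_terms: dict mapping term name -> value (applied to every gait)
    """
    mask = gait_mask(gait)
    # Gait-specific terms are multiplied by the mask, so inactive gaits
    # contribute nothing to the total.
    specific = sum(mask[g] * sum(gait_terms.get(g, {}).values()) for g in GAITS)
    # Shared regulation terms bypass the mask entirely.
    return sum(shared_terms.values()) + specific


# Assumed example values for one timestep.
gait_terms = {
    "stand": {"stillness": 0.5},
    "walk": {"contact": 0.25},
    "run": {"push_off": 0.75},
}
shared_terms = {"torque_penalty": -0.125, "action_smoothness": -0.125}

rewards = {g: routed_reward(g, gait_terms, shared_terms) for g in GAITS}
```

In this sketch, changing the commanded gait changes which gait-specific terms are counted, while the shared penalties apply identically in every mode, which is the separation the note above describes.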
@inproceedings{peng2025gaitconditioned,
title = {Gait-Conditioned Reinforcement Learning with Multi-Phase Curriculum for Humanoid Locomotion},
author = {Peng, Tianhu and Bao, Lingfan and Zhou, Chengxu},
booktitle = {IEEE-RAS International Conference on Humanoid Robots},
year = {2025},
}