Imitation-Relaxation Reinforcement Learning for Sparse Badminton Strikes via Dynamic Trajectory Generation

Yanyan Yuan1, Yucheng Tao1, Shaowen Cheng2, Yanhong Liang1, Yongbin Jin2, Hongtao Wang1
1Zhejiang University, 2ZJU-Hangzhou Global Scientific and Technological Innovation Center
Submitted to Frontiers in Neurorobotics
Human-Robot multi-round badminton striking
Generalization of the algorithm on UR5
Abstract
Robotic racket sports present exceptional benchmarks for evaluating the dynamic motion control capabilities of robots. Owing to the highly nonlinear dynamics of the shuttlecock, the stringent demands on the robot's dynamic response, and the convergence difficulties caused by sparse rewards in reinforcement learning, badminton strikes remain a formidable challenge for robotic systems. To address these issues, this work proposes DTG-IRRL, a novel learning framework for badminton strikes that integrates imitation-relaxation reinforcement learning with dynamic trajectory generation. The framework demonstrates significantly improved training efficiency and performance, achieving faster convergence and twice the landing accuracy. Analysis of the reward function on a hyperplane of the parameter space intuitively reveals the convergence difficulties arising from the inherently sparse rewards in racket sports and demonstrates the framework's effectiveness in mitigating both local and slow convergence. Implemented on hardware with zero-shot transfer, the framework achieves a 90% hitting rate and a 70% landing accuracy, enabling sustained human-robot rallies. Cross-platform validation on a UR5 robot demonstrates the framework's generalizability while highlighting the need for highly dynamic robotic arms in racket sports.
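To illustrate the imitation-relaxation idea referred to above, the sketch below shows one common way such a scheme can be realized: a sparse task reward is combined with an imitation term whose weight is gradually annealed ("relaxed") over training. This is only a minimal illustrative sketch, not the paper's actual reward design; the function names, the exponential schedule, and all constants are assumptions introduced here for clarity.

```python
import math

def imitation_relaxation_reward(task_reward: float,
                                imitation_reward: float,
                                step: int,
                                decay_steps: int = 1_000_000,
                                w0: float = 1.0) -> float:
    """Combine a sparse task reward with an imitation term whose weight
    is relaxed (annealed) as training progresses.

    NOTE: hypothetical sketch; the schedule and constants are assumptions,
    not the reward used in DTG-IRRL.
    """
    # Exponentially decaying imitation weight: strong guidance early on,
    # nearly pure task reward once the policy produces useful strikes.
    w = w0 * math.exp(-5.0 * step / decay_steps)
    return task_reward + w * imitation_reward

# Early in training the imitation term dominates the sparse hitting/landing
# reward; late in training it is almost fully relaxed.
print(imitation_relaxation_reward(task_reward=0.0, imitation_reward=0.8, step=0))
print(imitation_relaxation_reward(task_reward=1.0, imitation_reward=0.8, step=900_000))
```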
Method
Algorithm Framework