Imitation-Relaxation Reinforcement Learning for Sparse Badminton Strikes via Dynamic Trajectory Generation

Yanyan Yuan1, Yucheng Tao1, Shaowen Cheng2, Yanhong Liang1, Yongbin Jin2, Hongtao Wang1
1Zhejiang University, 2ZJU-Hangzhou Global Scientific and Technological Innovation Center
Submitted to Frontiers in Neurorobotics
Human-Robot multi-round badminton striking
Generalization of the algorithm on UR5
Abstract
Robotic racket sports present exceptional benchmarks for evaluating the dynamic motion control capabilities of robots. Owing to the highly nonlinear dynamics of the shuttlecock, the stringent demands on the robot's dynamic response, and the convergence difficulties caused by sparse rewards in reinforcement learning, badminton strikes remain a formidable challenge for robotic systems. To address these issues, this work proposes DTG-IRRL, a novel learning framework for badminton strikes that integrates imitation-relaxation reinforcement learning with dynamic trajectory generation. The framework demonstrates significantly improved training efficiency and performance, achieving faster convergence and twice the landing accuracy. Analysis of the reward function on a hyperplane of the parameter space intuitively reveals the convergence difficulties arising from the inherently sparse rewards in racket sports and demonstrates the framework's effectiveness in mitigating both local and slow convergence. Implemented on hardware with zero-shot transfer, the framework achieves a 90% hitting rate and a 70% landing accuracy, enabling sustained human-robot rallies. Cross-platform validation on a UR5 robot demonstrates the framework's generalizability while highlighting the need for highly dynamic robotic arms in racket sports.
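To illustrate the imitation-relaxation idea referred to above, the sketch below shows one common way such a scheme can be realized: a sparse task reward is combined with an imitation term whose weight is gradually annealed ("relaxed") over training. This is only a minimal illustrative sketch, not the paper's actual reward design; the function names, the exponential schedule, and all constants are assumptions introduced here for clarity.

```python
import math

def imitation_relaxation_reward(task_reward: float,
                                imitation_reward: float,
                                step: int,
                                decay_steps: int = 1_000_000,
                                w0: float = 1.0) -> float:
    """Combine a sparse task reward with an imitation term whose weight
    is relaxed (annealed) as training progresses.

    NOTE: hypothetical sketch; the schedule and constants are assumptions,
    not the reward used in DTG-IRRL.
    """
    # Exponentially decaying imitation weight: strong guidance early on,
    # nearly pure task reward once the policy produces useful strikes.
    w = w0 * math.exp(-5.0 * step / decay_steps)
    return task_reward + w * imitation_reward

# Early in training the imitation term dominates the sparse hitting/landing
# reward; late in training it is almost fully relaxed.
print(imitation_relaxation_reward(task_reward=0.0, imitation_reward=0.8, step=0))
print(imitation_relaxation_reward(task_reward=1.0, imitation_reward=0.8, step=900_000))
```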
Method
Algorithm Framework