[CS.AI] Elastic Queries Reinforcement Learning for VLA Models

Published at: 2026-06-16 22:00
Last updated: 2026-06-17 01:38
#Reinforcement Learning #Robot Manipulation #Vision-Language-Action

In robot manipulation, Vision-Language-Action (VLA) models serve as powerful action generators but typically operate under fixed inference and replanning schedules. This rigidity overlooks the varying difficulty of robot control: contact-rich or uncertain states may require more computation and fresher feedback, while simpler states can be managed with fewer inference steps and longer open-loop execution. To address this, we propose Elastic Queries Reinforcement Learning (EQRL), a framework that makes each VLA policy query elastic.

EQRL employs a lightweight latent-schedule adapter that jointly selects the latent input, denoising budget, and action chunk length without fine-tuning the underlying VLA model. To enable difficulty-aware scheduling, EQRL trains an ensemble of critics and derives a state-difficulty signal from the disagreement among its members. This signal directs computation towards difficult states, while a learned residual allows for task-driven corrections.
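To make the difficulty signal concrete, here is a minimal sketch of how ensemble disagreement could be turned into a schedule. It is an illustration under stated assumptions, not the paper's adapter: the function names, the thresholds `low` and `high`, and the step and chunk ranges are all hypothetical, and a hand-written linear rule stands in for the learned adapter and its learned residual. The latent input is left out entirely.

```python
from statistics import pstdev


def difficulty(q_values):
    """Disagreement among critics: population std of their value estimates."""
    return pstdev(q_values)


def choose_schedule(q_values, low=0.05, high=0.5,
                    steps_range=(2, 10), chunk_range=(4, 16)):
    """Map critic disagreement to (denoising_steps, chunk_length).

    All thresholds and ranges are hypothetical placeholders. High
    disagreement gives more denoising steps and a shorter chunk, so the
    policy is queried again sooner; low disagreement gives the opposite.
    """
    d = difficulty(q_values)
    # Normalise disagreement into [0, 1] between the two thresholds.
    t = min(1.0, max(0.0, (d - low) / (high - low)))
    min_steps, max_steps = steps_range
    min_chunk, max_chunk = chunk_range
    steps = round(min_steps + t * (max_steps - min_steps))
    chunk = round(max_chunk - t * (max_chunk - min_chunk))
    return steps, chunk


# Critics agree: cheap query, long open-loop chunk.
easy = choose_schedule([1.0, 1.0, 1.0])
# Critics disagree strongly: expensive query, short chunk.
hard = choose_schedule([0.0, 2.0])
```

In this sketch an easy state gets the smallest denoising budget and the longest chunk, and a hard state gets the reverse, which is the direction of allocation the summary describes.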

We formalize variable chunk execution as query-level macro-action RL with chunk-dependent discounting and an amortized number-of-function-evaluations (NFE) budget. Across simulations and real-robot manipulation, EQRL reduces amortized inference costs while maintaining or improving task success rates.
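A small sketch may help with the two quantities named here. It assumes the standard macro-action (semi-MDP) convention: rewards inside a chunk are discounted per step, and the discount applied to the next query is raised to the power of the chunk length that was just executed. It also assumes that the amortized NFE is the total number of denoising evaluations divided by the total number of executed environment steps. Both are readings of the summary, not definitions confirmed by the source, and `macro_return` is a hypothetical helper.

```python
def macro_return(queries, gamma=0.99):
    """Compute a query-level discounted return and the amortized NFE.

    queries: list of (rewards, nfe) pairs, one per policy query, where
    `rewards` holds the per-step rewards collected while the chunk ran
    and `nfe` is the number of denoising evaluations spent on that query.
    """
    ret = 0.0
    discount = 1.0
    total_nfe = 0
    total_steps = 0
    for rewards, nfe in queries:
        # Discounted reward accumulated inside this chunk.
        chunk_reward = sum(gamma ** i * r for i, r in enumerate(rewards))
        ret += discount * chunk_reward
        # Chunk-dependent discounting: advance by the chunk length.
        discount *= gamma ** len(rewards)
        total_nfe += nfe
        total_steps += len(rewards)
    if total_steps == 0:
        raise ValueError("at least one executed step is required")
    return ret, total_nfe / total_steps


# Two queries: a 2-step chunk costing 4 evaluations, then a 1-step chunk costing 2.
ret, amortized_nfe = macro_return([([1.0, 1.0], 4), ([1.0], 2)], gamma=0.5)
```

With this convention the query-level return equals the ordinary per-step discounted return of the same trajectory, so variable chunk lengths do not change what is being optimized. Longer chunks lower the amortized NFE because one query's cost is spread over more executed steps.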

Blogger's Review: EQRL targets the rigidity of fixed inference and replanning schedules in VLA models. By allocating computation dynamically according to state difficulty, it reports lower amortized inference cost while maintaining or improving task success in robot manipulation. The approach points to a practical route toward adaptive decision-making for robots in complex environments.

Original Source: https://arxiv.org/abs/2606.14375
