
Model-Based Sequence Reinforcement Learning for Model-Free Control

View a PDF of the paper titled Overcoming Slow Decision Frequencies in Continuous Control: Model-Based Sequence Reinforcement Learning for Model-Free Control, by Devdhar Patel and 1 other author


Abstract: Reinforcement learning (RL) is rapidly reaching and surpassing human-level control capabilities. However, state-of-the-art RL algorithms often require timesteps and reaction times significantly faster than human capabilities, which is impractical in real-world settings and typically requires specialized hardware. We introduce Sequence Reinforcement Learning (SRL), an RL algorithm designed to produce a sequence of actions for a given input state, enabling effective control at lower decision frequencies. SRL addresses the challenges of learning action sequences by employing both a model and an actor-critic architecture that operate at different temporal scales. We propose a "temporal recall" mechanism, in which the critic uses the model to estimate the intermediate states between primitive actions, providing a learning signal for each individual action within the sequence. Once training is complete, the actor can generate action sequences independently of the model, achieving model-free control at a slower frequency. We evaluate SRL on a suite of continuous control tasks, demonstrating that it achieves performance comparable to state-of-the-art algorithms while significantly reducing actor sample complexity. To better assess performance across varying decision frequencies, we introduce the Frequency-Averaged Score (FAS) metric. Our results show that SRL significantly outperforms traditional RL algorithms in terms of FAS, making it particularly suitable for applications that require variable decision frequencies. Furthermore, we compare SRL with model-based online planning, showing that SRL achieves comparable FAS while using the same model during training that online planners use for planning.
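The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear "networks", the dimensions, the sequence length, and every function name below are hypothetical stand-ins chosen only to show the data flow. During training, a learned model fills in the intermediate states so that a critic can score each primitive action in the sequence; at deployment, only the actor is called, once per decision.

```python
import numpy as np

# All sizes and weights are illustrative assumptions, not values from the paper.
SEQ_LEN = 4       # primitive actions emitted per decision
STATE_DIM = 3
ACTION_DIM = 2
rng = np.random.default_rng(0)

# Hypothetical stand-ins for trained networks: fixed random linear maps.
W_actor = rng.normal(size=(STATE_DIM, SEQ_LEN * ACTION_DIM)) * 0.1
W_model_s = np.eye(STATE_DIM) * 0.9
W_model_a = rng.normal(size=(ACTION_DIM, STATE_DIM)) * 0.1
w_critic = rng.normal(size=STATE_DIM + ACTION_DIM)


def actor(state):
    """Map one observed state to a sequence of SEQ_LEN primitive actions."""
    return np.tanh(state @ W_actor).reshape(SEQ_LEN, ACTION_DIM)


def model(state, action):
    """Predict the next state from a (state, action) pair."""
    return state @ W_model_s + action @ W_model_a


def critic(state, action):
    """Return a scalar value estimate for one primitive (state, action) pair."""
    return float(np.concatenate([state, action]) @ w_critic)


def score_sequence_with_model(state):
    """Training-time only: roll the model forward to estimate the
    intermediate states, and let the critic score every primitive action."""
    actions = actor(state)
    q_values = []
    s = state
    for a in actions:
        q_values.append(critic(s, a))  # per-action learning signal
        s = model(s, a)                # estimated intermediate state
    return actions, q_values


def act_at_deployment(state):
    """Deployment: one actor call per decision, with no model involved."""
    return actor(state)
```

In this sketch the environment would be observed once every SEQ_LEN primitive steps, which is what lowers the decision frequency; the per-action critic values would feed whatever actor-critic update is used during training.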

Submission history

From: Devdhar Patel
[v1] Fri, 11 Oct 2024 16:54:07 UTC (5,603 KB)
[v2] Fri, 18 Oct 2024 14:35:53 UTC (5,603 KB)
[v3] Tue, 4 Mar 2025 03:11:25 UTC (13,431 KB)
[v4] Tue, 15 Jul 2025 16:07:53 UTC (9,754 KB)
[v5] Sat, 26 Jul 2025 05:13:25 UTC (9,754 KB)


2025-07-29 04:00:00
