
Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing

View the PDF of the paper titled Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing, by Diji Yang and 3 other authors


Abstract: Retrieval-Augmented Generation (RAG) has shown a strong ability to enhance language models' knowledge and reduce AI generative hallucinations, driving its widespread adoption. However, complex tasks that require multi-round retrieval remain challenging, and early attempts tend to be overly optimistic without a good sense of self-skepticism. Current multi-round RAG systems may continue searching even when sufficient information has already been retrieved, or may give incorrect answers without having enough information or knowledge. Existing solutions either require large amounts of expensive human-annotated process-supervision data or lead to subpar performance. This paper aims to address these limitations by introducing a new framework, SIM-RAG, that explicitly enhances RAG systems' self-awareness and multi-round retrieval capability. To train SIM-RAG, we first let a RAG system self-practice multi-round retrieval, augmenting existing question-answer pairs with intermediate inner-monologue reasoning steps to generate synthetic training data. For each pair, the system may explore multiple retrieval paths, which are labeled as successful if they reach the correct answer and unsuccessful otherwise. Using this data, we train a lightweight information-sufficiency Critic. At inference time, the Critic evaluates whether the RAG system has retrieved sufficient information at each round, guiding retrieval decisions and improving system-level self-awareness through in-context reinforcement learning. Experiments across multiple prominent RAG benchmarks show that SIM-RAG is an effective multi-round RAG solution. Moreover, the framework is system-efficient, adding a lightweight component to RAG without requiring modifications to existing LLMs or search engines, and data-efficient, eliminating the need for human-annotated supervision data for the mid-step retrieval process.
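The inference loop described in the abstract — retrieve, ask a sufficiency Critic whether to stop, and only then answer — can be sketched as below. This is a minimal illustration, not the authors' implementation: `retrieve`, `critic`, and `generate` are hypothetical placeholders for a search engine, the trained sufficiency Critic, and an answer-generating LLM.

```python
def multi_round_rag(question, retrieve, generate, critic, max_rounds=3):
    """Iteratively retrieve evidence until the critic judges it sufficient,
    then generate the final answer. A SIM-RAG-style loop in miniature."""
    evidence = []
    for _ in range(max_rounds):
        evidence.extend(retrieve(question, evidence))
        if critic(question, evidence):   # sufficient information -> stop searching
            break
    return generate(question, evidence)


# Toy stand-ins so the loop is runnable end to end (all hypothetical):
docs = {"capital of France": "Paris is the capital of France."}

def toy_retrieve(question, evidence):
    return [docs.get(question, "")]

def toy_critic(question, evidence):
    # A real Critic would be a trained lightweight classifier over
    # (question, evidence); here we just check for a key fact.
    return any("Paris" in d for d in evidence)

def toy_generate(question, evidence):
    return "Paris" if any("Paris" in d for d in evidence) else "unknown"


print(multi_round_rag("capital of France", toy_retrieve, toy_generate, toy_critic))
```

The key design point the paper emphasizes is that the Critic is a separate lightweight component: neither `retrieve` nor `generate` needs to be fine-tuned, only the stop/continue decision is learned.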

Submission history

From: Diji Yang [view email]
[v1]

Mon, 5 May 2025 17:39:35 UTC (771 KB)
[v2]

Mon, 30 Jun 2025 17:46:40 UTC (763 KB)
