
A Coding Guide to Build Modular and Self-Correcting QA Systems with DSPy

In this tutorial, we explore how to build a smart, self-correcting question-answering system using the DSPy framework, paired with Google's Gemini 1.5 Flash model. We start by defining structured signatures that clearly specify input and output behavior, which DSPy uses as the foundation for building reliable pipelines. Through DSPy's declarative approach, we build composable modules, such as AdvancedQA and SimpleRAG, to answer questions using both context and retrieval-augmented generation. By combining DSPy modules with Gemini's strong reasoning, we assemble an AI system capable of providing accurate, step-by-step answers. As we progress, we also take advantage of DSPy's optimization tools, such as BootstrapFewShot, to automatically improve performance from training examples.

!pip install dspy-ai google-generativeai


import dspy
import google.generativeai as genai
import random
from typing import List, Optional


GOOGLE_API_KEY = "Use Your Own API Key"  
genai.configure(api_key=GOOGLE_API_KEY)


dspy.configure(lm=dspy.LM(model="gemini/gemini-1.5-flash", api_key=GOOGLE_API_KEY))

We begin by installing the required libraries: dspy-ai for declarative AI pipelines and google-generativeai for access to Google's Gemini models. After importing the necessary modules, we configure Gemini with our API key. Finally, we set up DSPy to use the Gemini 1.5 Flash model as its language-model backend.
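As a small hardening step (not part of the original snippet), the key can be read from an environment variable instead of being hardcoded; `GEMINI_API_KEY` is an assumed variable name here, so adjust it to match your environment:

```python
import os

# Assumed variable name; a placeholder fallback keeps the script importable
# even when the variable is not set.
GOOGLE_API_KEY = os.environ.get("GEMINI_API_KEY", "missing-key")
```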

class QuestionAnswering(dspy.Signature):
    """Answer questions based on given context with reasoning."""
    context: str = dspy.InputField(desc="Relevant context information")
    question: str = dspy.InputField(desc="Question to answer")
    reasoning: str = dspy.OutputField(desc="Step-by-step reasoning")
    answer: str = dspy.OutputField(desc="Final answer")


class FactualityCheck(dspy.Signature):
    """Verify if an answer is factually correct given context."""
    context: str = dspy.InputField()
    question: str = dspy.InputField()
    answer: str = dspy.InputField()
    is_correct: bool = dspy.OutputField(desc="True if answer is factually correct")

We define two DSPy signatures to structure the inputs and outputs of our system. First, QuestionAnswering expects a context and a question, and returns both reasoning and a final answer, allowing the model to explain its thought process. Then, FactualityCheck verifies the factual correctness of an answer with a simple boolean output, which helps us build a self-verifying QA system.

class AdvancedQA(dspy.Module):
    def __init__(self, max_retries: int = 2):
        super().__init__()
        self.max_retries = max_retries
        self.qa_predictor = dspy.ChainOfThought(QuestionAnswering)
        self.fact_checker = dspy.Predict(FactualityCheck)
       
    def forward(self, context: str, question: str) -> dspy.Prediction:
        prediction = self.qa_predictor(context=context, question=question)
       
        for attempt in range(self.max_retries):
            fact_check = self.fact_checker(
                context=context,
                question=question,
                answer=prediction.answer
            )
           
            if fact_check.is_correct:
                break
               
            refined_context = f"{context}\n\nPrevious incorrect answer: {prediction.answer}\nPlease provide a more accurate answer."
            prediction = self.qa_predictor(context=refined_context, question=question)
       
        return prediction

We create an AdvancedQA module to add self-correction capability to our QA system. It first uses a ChainOfThought predictor to generate an answer with reasoning. Next, it verifies factual accuracy using the fact-checking predictor. If the answer is judged incorrect, we augment the context with the failed attempt and retry, up to a fixed number of times, to ensure more reliable outputs.
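The retry pattern above can be sketched without DSPy or an API key; `stub_predict` and `stub_check` below are hypothetical stand-ins for the `qa_predictor` and `fact_checker` components:

```python
def self_correcting_answer(context, question, predict, check, max_retries=2):
    """Generate an answer, then retry with an augmented context until the
    checker accepts it or the retry budget is exhausted."""
    answer = predict(context, question)
    for _ in range(max_retries):
        if check(context, question, answer):
            break
        # Feed the failed attempt back into the context, as AdvancedQA does.
        context = f"{context}\n\nPrevious incorrect answer: {answer}\nPlease provide a more accurate answer."
        answer = predict(context, question)
    return answer

# Stub predictor: answers wrongly until it sees a flagged previous attempt.
def stub_predict(context, question):
    return "330 meters" if "Previous incorrect answer" in context else "300 meters"

# Stub checker: accepts only answers literally supported by the context.
def stub_check(context, question, answer):
    return answer in context

ctx = "The Eiffel Tower stands 330 meters tall including antennas."
print(self_correcting_answer(ctx, "How tall is the Eiffel Tower?", stub_predict, stub_check))
```

The first attempt ("300 meters") fails the check, so the loop appends it to the context and retries, and the second attempt passes.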

class SimpleRAG(dspy.Module):
    def __init__(self, knowledge_base: List[str]):
        super().__init__()
        self.knowledge_base = knowledge_base
        self.qa_system = AdvancedQA()
       
    def retrieve(self, question: str, top_k: int = 2) -> str:
        # Simple keyword-based retrieval (in practice, use vector embeddings)
        scored_docs = []
        question_words = set(question.lower().split())
       
        for doc in self.knowledge_base:
            doc_words = set(doc.lower().split())
            score = len(question_words.intersection(doc_words))
            scored_docs.append((score, doc))
       
        # Return the top-k most relevant documents
        scored_docs.sort(key=lambda pair: pair[0], reverse=True)
        return "\n\n".join(doc for _, doc in scored_docs[:top_k])
   
    def forward(self, question: str) -> dspy.Prediction:
        context = self.retrieve(question)
        return self.qa_system(context=context, question=question)

We build a SimpleRAG module to simulate retrieval-augmented generation with DSPy. We supply a knowledge base and implement a basic keyword-overlap retriever to fetch the documents most relevant to a given question. These documents serve as context for the AdvancedQA module, which then performs reasoning and self-correction to produce an accurate answer.
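The keyword-overlap scoring can be exercised on its own; this standalone sketch mirrors the `retrieve` method above, using a made-up two-document corpus:

```python
def retrieve(knowledge_base, question, top_k=2):
    """Rank documents by the number of words they share with the question."""
    question_words = set(question.lower().split())
    scored = [
        (len(question_words & set(doc.lower().split())), doc)
        for doc in knowledge_base
    ]
    # Sort by overlap score only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

docs = [
    "The Eiffel Tower is located in Paris and stands 330 meters tall.",
    "Python is a programming language created by Guido van Rossum.",
]
print(retrieve(docs, "How tall is the Eiffel Tower?", top_k=1))
```

Like the original, this naive tokenizer keeps punctuation attached to words, which is one reason real systems use vector embeddings instead.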

knowledge_base = [
    "Use Your Context and Knowledge Base Here"
]


training_examples = [
    dspy.Example(
        question="What is the height of the Eiffel Tower?",
        context="The Eiffel Tower is located in Paris, France. It was constructed from 1887 to 1889 and stands 330 meters tall including antennas.",
        answer="330 meters"
    ).with_inputs("question", "context"),
   
    dspy.Example(
        question="Who created Python programming language?",
        context="Python is a high-level programming language created by Guido van Rossum. It was first released in 1991 and emphasizes code readability.",
        answer="Guido van Rossum"
    ).with_inputs("question", "context"),
   
    dspy.Example(
        question="What is machine learning?",
        context="ML focuses on algorithms that can learn from data without being explicitly programmed.",
        answer="Machine learning focuses on algorithms that learn from data without explicit programming."
    ).with_inputs("question", "context")
]

We define a small knowledge base containing facts across several topics, including history, programming, and science. This serves as the source of context for retrieval. Alongside it, we prepare a set of training examples to guide DSPy's optimization process. Each example includes a question, the relevant context, and the correct answer, which helps our system learn how to respond more accurately.
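Conceptually, `with_inputs` marks which fields of an example are model inputs, leaving the rest (here, `answer`) as labels for the optimizer. A plain-Python approximation of that split (not DSPy's actual implementation):

```python
def split_example(example, input_keys):
    """Partition an example's fields into model inputs and supervision labels."""
    inputs = {k: v for k, v in example.items() if k in input_keys}
    labels = {k: v for k, v in example.items() if k not in input_keys}
    return inputs, labels

example = {
    "question": "Who created Python?",
    "context": "Python was created by Guido van Rossum.",
    "answer": "Guido van Rossum",
}
inputs, labels = split_example(example, {"question", "context"})
print(sorted(inputs), sorted(labels))
```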

def accuracy_metric(example, prediction, trace=None):
    """Simple accuracy metric for evaluation"""
    return example.answer.lower() in prediction.answer.lower()


print("🚀 Initializing DSPy QA System with Gemini...")
print("📝 Note: Using Google's Gemini 1.5 Flash (free tier)")
rag_system = SimpleRAG(knowledge_base)


basic_qa = dspy.ChainOfThought(QuestionAnswering)


print("\n📊 Before Optimization:")
test_question = "What is the height of the Eiffel Tower?"
test_context = knowledge_base[0]
initial_prediction = basic_qa(context=test_context, question=test_question)
print(f"Q: {test_question}")
print(f"A: {initial_prediction.answer}")
print(f"Reasoning: {initial_prediction.reasoning}")


print("\n🔧 Optimizing with BootstrapFewShot...")
optimizer = dspy.BootstrapFewShot(metric=accuracy_metric, max_bootstrapped_demos=2)
optimized_qa = optimizer.compile(basic_qa, trainset=training_examples)


print("\n📈 After Optimization:")
optimized_prediction = optimized_qa(context=test_context, question=test_question)
print(f"Q: {test_question}")
print(f"A: {optimized_prediction.answer}")
print(f"Reasoning: {optimized_prediction.reasoning}")

We start by defining a lightweight accuracy metric that checks whether the expected answer is contained in the predicted response. After initializing the SimpleRAG system and a baseline ChainOfThought QA module, we test the baseline on a sample question before any optimization. Next, using DSPy's BootstrapFewShot optimizer, we compile the QA system with our training examples. This automatically bootstraps more effective few-shot prompts, improving accuracy, which we verify by comparing responses before and after optimization.
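The substring-containment metric can be checked in isolation; `SimpleNamespace` stands in here for the DSPy example and prediction objects:

```python
from types import SimpleNamespace

def accuracy_metric(example, prediction, trace=None):
    """Case-insensitive check that the gold answer appears in the prediction."""
    return example.answer.lower() in prediction.answer.lower()

gold = SimpleNamespace(answer="330 meters")
pred = SimpleNamespace(answer="The tower is 330 Meters tall.")
print(accuracy_metric(gold, pred))  # True
```

Note the trade-off: any prediction that merely contains the gold string scores as correct, so very short gold answers can match spuriously.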

def evaluate_system(qa_module, test_cases):
    """Evaluate QA system performance"""
    correct = 0
    total = len(test_cases)
   
    for example in test_cases:
        prediction = qa_module(context=example.context, question=example.question)
        if accuracy_metric(example, prediction):
            correct += 1
   
    return correct / total


print("\n📊 Evaluation Results:")
print(f"Basic QA Accuracy: {evaluate_system(basic_qa, training_examples):.2%}")
print(f"Optimized QA Accuracy: {evaluate_system(optimized_qa, training_examples):.2%}")


print("\n✅ Tutorial Complete! Key DSPy Concepts Demonstrated:")
print("1. 🔤 Signatures - Defined input/output schemas")
print("2. 🏗️  Modules - Built composable QA systems")
print("3. 🔄 Self-correction - Implemented iterative improvement")
print("4. 🔍 RAG - Created retrieval-augmented generation")
print("5. ⚡ Optimization - Used BootstrapFewShot to improve prompts")
print("6. 📊 Evaluation - Measured system performance")
print("7. 🆓 Free API - Powered by Google Gemini 1.5 Flash")

We then evaluate both systems with a simple helper that runs each example through the QA module and scores it with our accuracy metric. Printing the scores for the basic and optimized modules shows the effect of BootstrapFewShot, and we close with a summary of the key DSPy concepts demonstrated: signatures, modules, self-correction, RAG, optimization, and evaluation.
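The evaluation loop generalizes to any QA callable; this self-contained sketch uses a trivial stub module and the same substring metric (both hypothetical stand-ins, not the DSPy objects used above):

```python
def evaluate(qa_fn, test_cases):
    """Fraction of test cases whose gold answer appears in the prediction."""
    correct = sum(
        1 for ex in test_cases
        if ex["answer"].lower() in qa_fn(ex["context"], ex["question"]).lower()
    )
    return correct / len(test_cases)

# Trivial stub QA module: returns the context verbatim, so it "answers"
# correctly whenever the gold answer is a substring of the context.
def stub_qa(context, question):
    return context

cases = [
    {"context": "The Eiffel Tower stands 330 meters tall.",
     "question": "How tall is the Eiffel Tower?", "answer": "330 meters"},
    {"context": "Python was created by Guido van Rossum.",
     "question": "Who created Python?", "answer": "Guido van Rossum"},
]
print(evaluate(stub_qa, cases))  # 1.0
```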

In conclusion, we have demonstrated the full potential of DSPy for building advanced QA pipelines. We saw how DSPy simplifies the design of intelligent modules with clear interfaces, supports self-correction loops, integrates basic retrieval, and enables few-shot optimization with minimal code. With only a few lines, we created and evaluated our models on real-world examples and measured performance gains. This hands-on experience shows how DSPy, combined with Google's Gemini API, enables rapid development of advanced, testable language applications without boilerplate or complex logic.




Sana Hassan, a consulting intern at Marktechpost and a dual-degree student at IIT Madras, is passionate about applying technology and AI to address real-world challenges. With a keen interest in solving practical problems, he brings a fresh perspective to the intersection of AI and real-life solutions.


2025-07-06 06:59:00
