Tracing OpenAI Agent Responses using MLflow

MLflow is an open-source platform for managing and tracking machine learning experiments. When used with the OpenAI Agents SDK, MLflow automatically:
- Logs all agent interactions and API calls
- Captures tool usage, input/output messages, and intermediate decisions
- Tracks runs for debugging, performance analysis, and reproducibility
This is especially useful when you build multi-agent systems in which different agents collaborate or call functions dynamically.
In this tutorial, we will walk through two key examples: a simple handoff between agents, and the use of agent guardrails, all while tracing their behavior using MLflow.
Setting up the dependencies
Installing the libraries
pip install openai-agents mlflow pydantic python-dotenv
OpenAI API key
To get your OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and generate a new key. If you are a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.
Once the key is generated, create a .env file and enter the following:
OPENAI_API_KEY=<YOUR_API_KEY>
Replace <YOUR_API_KEY> with the key you generated.
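Before running the examples, it can help to fail fast when the key is missing. The sketch below is a minimal check that uses only the standard library. It assumes the key is exposed as the OPENAI_API_KEY environment variable (the name the OpenAI client reads by default); the helper name require_env is hypothetical and not part of the Agents SDK or MLflow.

```python
import os


def require_env(name: str, environ=None) -> str:
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper for this tutorial, not part of
    the OpenAI Agents SDK or MLflow.
    """
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it "
            "in the shell before running the script."
        )
    return value


if __name__ == "__main__":
    # Assumes load_dotenv() (or the shell) has already populated the environment.
    try:
        require_env("OPENAI_API_KEY")
        print("OPENAI_API_KEY found")
    except RuntimeError as err:
        print(err)
```

In the scripts that follow, a call like this would go right after load_dotenv(), so a missing key surfaces before any agent run starts.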
Multi-Agent System (multi_agent_demo.py)
In this script (multi_agent_demo.py), we build a simple multi-agent assistant using the OpenAI Agents SDK, designed to route user queries to either a coding expert or a cooking expert. We enable mlflow.openai.autolog(), which automatically traces and logs all agent interactions with the OpenAI API, including inputs, outputs, and agent handoffs, making it easy to monitor and debug the system. MLflow is configured to use a local file-based tracking URI (./mlruns) and logs all activity under the experiment name "Agent-Coding-Cooking".
import mlflow, asyncio
from agents import Agent, Runner
import os
from dotenv import load_dotenv

load_dotenv()  # Load OPENAI_API_KEY from the .env file

mlflow.openai.autolog()  # Auto-trace every OpenAI call
mlflow.set_tracking_uri("./mlruns")
mlflow.set_experiment("Agent-Coding-Cooking")

# Two specialist agents, each restricted to one topic
coding_agent = Agent(name="Coding agent",
                     instructions="You only answer coding questions.")

cooking_agent = Agent(name="Cooking agent",
                      instructions="You only answer cooking questions.")

# The triage agent decides which specialist receives the request
triage_agent = Agent(
    name="Triage agent",
    instructions="If the request is about code, handoff to coding_agent; "
                 "if about cooking, handoff to cooking_agent.",
    handoffs=[coding_agent, cooking_agent],
)

async def main():
    res = await Runner.run(triage_agent,
                           input="How do I boil pasta al dente?")
    print(res.final_output)

if __name__ == "__main__":
    asyncio.run(main())
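In the script above, the triage agent's model decides which specialist receives the request. The plain-Python sketch below mimics only that control flow, with a keyword rule standing in for the model's decision. It does not use the Agents SDK; the names route and SPECIALISTS and the keyword lists are hypothetical and exist only for illustration.

```python
from typing import Dict, Tuple

# Hypothetical keyword lists standing in for the triage agent's LLM judgment.
SPECIALISTS: Dict[str, Tuple[str, ...]] = {
    "Coding agent": ("code", "python", "bug", "function"),
    "Cooking agent": ("cook", "boil", "pasta", "recipe"),
}


def route(user_input: str) -> str:
    """Return the name of the specialist the request would be handed to."""
    text = user_input.lower()
    for agent_name, keywords in SPECIALISTS.items():
        if any(word in text for word in keywords):
            return agent_name
    # No specialist matched, so the triage agent keeps the request.
    return "Triage agent"


if __name__ == "__main__":
    print(route("How do I boil pasta al dente?"))  # Cooking agent
```

In the real system this decision is not rule-based, which is why a trace of each run is useful: it records which handoff the model actually chose.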
MLflow UI
To open the MLflow UI and view all the logged agent interactions, run the following command in a new terminal:
mlflow ui
This starts the MLflow tracking server, which by default serves the UI at http://localhost:5000.
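If the UI is launched from a different directory than the scripts, or port 5000 is already taken, it needs to be pointed at the same store the scripts wrote to. The sketch below builds such a command, assuming the mlflow CLI's --backend-store-uri and --port options; mlflow_ui_command is a hypothetical helper, and the defaults simply mirror the values used in this tutorial.

```python
import shlex
from typing import List


def mlflow_ui_command(store_uri: str = "./mlruns", port: int = 5000) -> List[str]:
    """Build the argument list for launching the MLflow UI.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return ["mlflow", "ui", "--backend-store-uri", store_uri, "--port", str(port)]


if __name__ == "__main__":
    cmd = mlflow_ui_command()
    # Print the command; pass `cmd` to subprocess.run(cmd) to actually launch it.
    print(shlex.join(cmd))
```

Keeping the store URI explicit avoids the common situation where the UI opens but shows no experiments because it is reading a different mlruns directory.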


We can view the entire interaction flow in the Tracing section, from the user's initial input to how the assistant routed the request to the appropriate agent, and finally, the response generated by that agent. This end-to-end trace gives insight into decisions, handoffs, and outputs, which helps you debug and optimize your agent workflows.
Tracing Guardrails (guardrails.py)
In this example, we implement a guardrail-protected customer support agent using the OpenAI Agents SDK with MLflow tracing. The agent is designed to help users with general queries but is restricted from answering medical-related questions. A dedicated guardrail agent checks for such inputs and, if one is detected, blocks the request. MLflow captures the entire flow, including guardrail activation, reasoning, and the agent's response, providing full traceability and insight into the safety mechanisms.
import mlflow, asyncio
from pydantic import BaseModel
from agents import (
    Agent, Runner,
    GuardrailFunctionOutput, InputGuardrailTripwireTriggered,
    input_guardrail, RunContextWrapper)
from dotenv import load_dotenv

load_dotenv()  # Load OPENAI_API_KEY from the .env file

mlflow.openai.autolog()  # Auto-trace every OpenAI call
mlflow.set_tracking_uri("./mlruns")
mlflow.set_experiment("Agent-Guardrails")

# Structured output the guardrail agent must return
class MedicalSymptoms(BaseModel):
    medical_symptoms: bool
    reasoning: str

guardrail_agent = Agent(
    name="Guardrail check",
    instructions="Check if the user is asking you for medical symptoms.",
    output_type=MedicalSymptoms,
)

@input_guardrail
async def medical_guardrail(
    ctx: RunContextWrapper[None], agent: Agent, input
) -> GuardrailFunctionOutput:
    # Run the guardrail agent on the same input the main agent receives
    result = await Runner.run(guardrail_agent, input, context=ctx.context)
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.medical_symptoms,
    )

agent = Agent(
    name="Customer support agent",
    instructions="You are a customer support agent. You help customers with their questions.",
    input_guardrails=[medical_guardrail],
)

async def main():
    try:
        await Runner.run(agent, "Should I take aspirin if I'm having a headache?")
        print("Guardrail didn't trip - this is unexpected")
    except InputGuardrailTripwireTriggered:
        print("Medical guardrail tripped")

if __name__ == "__main__":
    asyncio.run(main())
This script defines a customer support agent with an input guardrail that detects medical-related questions. It uses a separate guardrail_agent to evaluate whether the user's input contains a request for medical advice. If such input is detected, the guardrail triggers and prevents the main agent from responding. The entire process, including the guardrail checks and their outcomes, is automatically logged and traced using MLflow.
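The tripwire control flow described here can be illustrated without the SDK. The sketch below is a plain-Python analogy: a keyword rule stands in for the guardrail agent's model call, and the checks run sequentially before the main agent, which is a simplification (the SDK's own scheduling may differ). The names GuardrailResult, TripwireTriggered, medical_check, and run_with_guardrails are hypothetical stand-ins, not SDK classes.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GuardrailResult:
    info: str
    tripwire_triggered: bool


class TripwireTriggered(Exception):
    """Raised when a guardrail blocks the input (stand-in for the SDK exception)."""

    def __init__(self, result: GuardrailResult):
        super().__init__(result.info)
        self.result = result


def medical_check(user_input: str) -> GuardrailResult:
    # Hypothetical keyword rule standing in for the guardrail agent's LLM call.
    flagged = any(w in user_input.lower() for w in ("aspirin", "headache", "symptom"))
    return GuardrailResult(
        info="medical question" if flagged else "ok",
        tripwire_triggered=flagged,
    )


def run_with_guardrails(
    user_input: str,
    guardrails: List[Callable[[str], GuardrailResult]],
    main_agent: Callable[[str], str],
) -> str:
    # Every guardrail is checked first; the main agent runs only if none trips.
    for check in guardrails:
        result = check(user_input)
        if result.tripwire_triggered:
            raise TripwireTriggered(result)
    return main_agent(user_input)


if __name__ == "__main__":
    try:
        run_with_guardrails(
            "Should I take aspirin if I'm having a headache?",
            [medical_check],
            lambda text: f"Support reply to: {text}",
        )
    except TripwireTriggered as err:
        print("Medical guardrail tripped:", err.result.info)
```

Carrying the guardrail's result on the exception mirrors why the trace is informative: the reason for the block travels with the block itself.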
MLflow UI
To open the MLflow UI and view all the logged agent interactions, run the following command in a new terminal:
mlflow ui


In this example, we asked the agent, "Should I take aspirin if I'm having a headache?", which triggered the guardrail. In the MLflow UI, we can clearly see that the input was flagged, along with the reasoning the guardrail agent gave for blocking the request.

I am a civil engineering graduate (2022) from Jamia Millia Islamia, New Delhi, and I have a strong interest in data science, especially neural networks and their application in various fields.
2025-07-14 17:50:00