Creating a Knowledge Graph Using an LLM

In this tutorial, we show how to create a knowledge graph from an unstructured document using an LLM. Traditional NLP methods have long been used to extract entities and relationships, but large language models (LLMs) such as GPT-4o-mini make this process more accurate and context-aware. LLMs are especially useful when working with messy, unstructured data. Using Python, Mirascope, and OpenAI's GPT-4o-mini, we will build a simple knowledge graph from a sample medical log.

Installing the Dependencies

!pip install "mirascope[openai]" matplotlib networkx

OpenAI API Key

To get an OpenAI API key, visit https://platform.openai.com/settings/organization/api-keys and create a new key. If you are a new user, you may need to add billing details and make a minimum payment of $5 to activate API access.

import os
from getpass import getpass
os.environ["OPENAI_API_KEY"] = getpass('Enter OpenAI API Key: ')
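If the notebook is re-run, or the key is already exported in the shell, prompting every time is unnecessary. The following is a minimal sketch of a helper that asks for the key only when it is missing. The name ensure_api_key and its parameters are hypothetical and not part of the original tutorial.

```python
import os
from getpass import getpass

def ensure_api_key(env=os.environ, prompt=getpass, name="OPENAI_API_KEY"):
    """Prompt for the key only when it is not already set (hypothetical helper)."""
    if not env.get(name):
        # getpass hides the typed value so the key does not appear in the notebook output.
        env[name] = prompt(f"Enter {name}: ")
    return env[name]

# In the notebook, call it once before any OpenAI request:
# ensure_api_key()
```

The env and prompt parameters exist only so the helper can be exercised without a real terminal or a real key.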

Defining the Graph Schema

Before we extract information, we need a structure to represent it. In this step, we define a simple schema for our knowledge graph using Pydantic. The schema includes:

Node: Represents an entity with an ID, a type (such as “Doctor” or “Medication”), and optional properties.

Edge: Represents a relationship between two nodes.

KnowledgeGraph: A container for all nodes and edges.

from pydantic import BaseModel, Field

class Edge(BaseModel):
    source: str
    target: str
    relationship: str

class Node(BaseModel):
    id: str
    type: str
    properties: dict | None = None

class KnowledgeGraph(BaseModel):
    nodes: list[Node]
    edges: list[Edge]
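To make the schema concrete, here is a small hand-written example of data in the shape these models describe. The values are illustrative, taken from the example used later in the prompt, and are not model output.

```python
import json

# Hand-written data in the shape the schema above describes (illustrative values).
graph_data = {
    "nodes": [
        {"id": "Mary", "type": "Patient", "properties": None},
        {"id": "Fall Incident 1", "type": "Event", "properties": {"time": "3:45 AM"}},
    ],
    "edges": [
        {"source": "Mary", "target": "Fall Incident 1", "relationship": "reported"},
    ],
}

# Assuming Pydantic v2, KnowledgeGraph.model_validate(graph_data) would parse
# this dict into the Node, Edge, and KnowledgeGraph models defined above.
print(json.dumps(graph_data, indent=2))
```

Each edge refers to nodes by their id, so node ids are effectively the keys of the graph.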

Defining the Patient Log

Now that we have a schema, let’s define the unstructured data we will use to generate our knowledge graph. Below is a sample patient log, written in natural language. It contains key events, symptoms, and observations related to a patient named Mary.

patient_log = """
Mary called for help at 3:45 AM, reporting that she had fallen while going to the bathroom. This marks the second fall incident within a week. She complained of dizziness before the fall.

Earlier in the day, Mary was observed wandering the hallway and appeared confused when asked basic questions. She was unable to recall the names of her medications and asked the same question multiple times.

Mary skipped both lunch and dinner, stating she didn't feel hungry. When the nurse checked her room in the evening, Mary was lying in bed with mild bruising on her left arm and complained of hip pain.

Vital signs taken at 9:00 PM showed slightly elevated blood pressure and a low-grade fever (99.8°F). Nurse also noted increased forgetfulness and possible signs of dehydration.

This behavior is similar to previous episodes reported last month.
"""

Generating the Knowledge Graph

To transform the unstructured patient log into structured insights, we use an LLM-powered function that extracts a knowledge graph. Each log entry is analyzed to identify entities (such as people, symptoms, and events) and their relationships (such as “reported” or “has symptom”).

The generate_kg function is decorated with @openai.call, which uses the GPT-4o-mini model and the previously defined KnowledgeGraph schema as the response model. The prompt explicitly instructs the model on how to map the log into nodes and edges.

from mirascope.core import openai, prompt_template

@openai.call(model="gpt-4o-mini", response_model=KnowledgeGraph)
@prompt_template(
    """
    SYSTEM:
    Extract a knowledge graph from this patient log.
    Use Nodes to represent people, symptoms, events, and observations.
    Use Edges to represent relationships like "has symptom", "reported", "noted", etc.

    The log:
    {log_text}

    Example:
    Mary said help, I've fallen.
    Node(id="Mary", type="Patient", properties={{}})
    Node(id="Fall Incident 1", type="Event", properties={{"time": "3:45 AM"}})
    Edge(source="Mary", target="Fall Incident 1", relationship="reported")
    """
)
def generate_kg(log_text: str) -> openai.OpenAIDynamicConfig:
    return {"log_text": log_text}

kg = generate_kg(patient_log)
print(kg)
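The response model checks that the output has the right fields, but it does not check that every edge points at a node that actually exists. A minimal sketch of such a check follows. The helper dangling_edges is hypothetical, and the demo graph is a hand-built stand-in (not model output) so the snippet runs without an API call. The same function should work on the kg object above, since it only reads the nodes and edges attributes.

```python
from types import SimpleNamespace

def dangling_edges(kg):
    """Return edges whose source or target is not the id of any node."""
    node_ids = {node.id for node in kg.nodes}
    return [e for e in kg.edges if e.source not in node_ids or e.target not in node_ids]

# Stand-in graph with one deliberately broken edge ("Hip Pain" is not a node).
demo = SimpleNamespace(
    nodes=[SimpleNamespace(id="Mary"), SimpleNamespace(id="Dizziness")],
    edges=[
        SimpleNamespace(source="Mary", target="Dizziness", relationship="has symptom"),
        SimpleNamespace(source="Mary", target="Hip Pain", relationship="has symptom"),
    ],
)

for edge in dangling_edges(demo):
    print(f"Edge points at a missing node: {edge.source} -> {edge.target}")
```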

Querying the Graph

Once the knowledge graph has been generated from the unstructured patient log, we can use it to answer medical or behavioral questions. We define a function run() that takes a natural-language question and the structured graph and passes them into an LLM prompt, which interprets the graph and responds.

@openai.call(model="gpt-4o-mini")
@prompt_template(
    """
    SYSTEM:
    Use the knowledge graph to answer the user's question.

    Graph:
    {knowledge_graph}

    USER:
    {question}
    """
)
def run(question: str, knowledge_graph: KnowledgeGraph): ...

question = "What health risks or concerns does Mary exhibit based on her recent behavior and vitals?"
print(run(question, kg))
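Simple lookups do not need an LLM call at all, because the graph is already structured. As a rough sketch, the hypothetical helper below lists everything directly connected to a given node. The demo graph is a hand-built stand-in, not model output, and the function reads only the edges attribute, so it should also accept the kg object.

```python
from types import SimpleNamespace

def relations_of(kg, node_id):
    """List (relationship, target) pairs for edges that start at node_id."""
    return [(e.relationship, e.target) for e in kg.edges if e.source == node_id]

# Stand-in graph used only for illustration.
demo = SimpleNamespace(
    edges=[
        SimpleNamespace(source="Mary", target="Dizziness", relationship="has symptom"),
        SimpleNamespace(source="Mary", target="Fall Incident 1", relationship="reported"),
        SimpleNamespace(source="Nurse", target="Dehydration", relationship="noted"),
    ],
)

for relationship, target in relations_of(demo, "Mary"):
    print(f"Mary --{relationship}--> {target}")
```

Direct lookups like this are deterministic and free, while the LLM-based run() is better suited to open-ended questions that require interpretation.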

Visualizing the Graph

Finally, we use render_graph(kg) to generate a clear visual representation of the knowledge graph, which helps us better understand the patient’s condition and the connections between observed symptoms, behaviors, and medical concerns.

import matplotlib.pyplot as plt
import networkx as nx

def render_graph(kg: KnowledgeGraph):
    G = nx.DiGraph()

    for node in kg.nodes:
        G.add_node(node.id, label=node.type, **(node.properties or {}))

    for edge in kg.edges:
        G.add_edge(edge.source, edge.target, label=edge.relationship)

    plt.figure(figsize=(15, 10))
    pos = nx.spring_layout(G)
    nx.draw_networkx_nodes(G, pos, node_size=2000, node_color="lightgreen")
    nx.draw_networkx_edges(G, pos, arrowstyle="->", arrowsize=20)
    nx.draw_networkx_labels(G, pos, font_size=12, font_weight="bold")
    edge_labels = nx.get_edge_attributes(G, "label")
    nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels, font_color="blue")
    plt.title("Healthcare Knowledge Graph", fontsize=15)
    plt.show()

render_graph(kg)
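The function above draws every node in the same color. One possible refinement, sketched here, is to color nodes by type so that patients, events, and symptoms are easier to tell apart. The helper colors_for_types and the palette are hypothetical additions, not part of the original tutorial.

```python
PALETTE = ["lightgreen", "lightblue", "salmon", "khaki", "plum"]

def colors_for_types(types, palette=PALETTE):
    """Return one color per entry, reusing the same color for the same type name."""
    type_to_color = {}
    colors = []
    for type_name in types:
        if type_name not in type_to_color:
            # Assign the next palette color, wrapping around if types outnumber colors.
            type_to_color[type_name] = palette[len(type_to_color) % len(palette)]
        colors.append(type_to_color[type_name])
    return colors

print(colors_for_types(["Patient", "Event", "Patient"]))
```

Inside render_graph, the type of each node is stored in its "label" attribute, so the list of types could be built as [G.nodes[n].get("label", "Unknown") for n in G.nodes] and the result passed as node_color to nx.draw_networkx_nodes in place of the fixed "lightgreen".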

I am a civil engineering graduate (2022) from Jamia Millia Islamia, New Delhi, with a strong interest in data science, especially neural networks and their applications in various fields.