Getting Started with LLM Tracing
A 5-minute quickstart to set up LLM tracing on Confident AI.
Installation
Install DeepEval and set up your tracing environment:
Python
pip install deepeval
Don’t forget to login using your API key on Confident AI in the CLI:
deepeval login --confident-api-key YOUR_API_KEY
Or in code:
import deepeval
deepeval.login_with_confident_api_key("YOUR_API_KEY")
If you don’t have an API key, first create an account.
Setup tracing
Python
The @observe decorator is the primary way to instrument your LLM application for tracing.
from openai import OpenAI
from deepeval.tracing import observe

client = OpenAI()

@observe()
def llm_app(query: str) -> str:
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": query}
        ]
    ).choices[0].message.content
# Call app to send trace to Confident AI
llm_app("Write me a poem.")
If your llm_app has more than one function, simply decorate those functions with @observe too.
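For example, a RAG-style app might decorate its retriever and generator separately so each one becomes its own span. This is a minimal sketch: the retriever and generator below return canned strings instead of calling real services, and the try/except falls back to a no-op decorator so the structure can be tried even without DeepEval installed.

```python
# Sketch: decorating multiple functions so each becomes its own span.
# The canned retriever/generator bodies are placeholders for illustration.
try:
    from deepeval.tracing import observe
except ImportError:
    # Fallback no-op decorator so this sketch runs without deepeval
    def observe(*args, **kwargs):
        def decorator(fn):
            return fn
        return decorator

@observe()
def retrieve(query: str) -> list[str]:
    # Placeholder retriever: a real app would query a vector store here
    return [f"Document about: {query}"]

@observe()
def generate(query: str, docs: list[str]) -> str:
    # Placeholder generator: a real app would call an LLM here
    return f"Answer to '{query}' using {len(docs)} document(s)"

@observe()
def llm_app(query: str) -> str:
    docs = retrieve(query)        # child span
    return generate(query, docs)  # child span

print(llm_app("What is tracing?"))
```

Calling llm_app produces one trace containing three spans: the top-level llm_app span with retrieve and generate nested inside it.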
✅ You just created a trace with a span inside it. Go to the Observatory to see your traces there.
What is a Trace and Span?
- Trace: The overall process of tracking and visualizing the execution flow of your LLM application
- Span: Individual units of work within your application (e.g., LLM calls, tool executions, retrievals)
Each observed function creates a span, and many spans make up a trace. Once tracing is set up, you can run evaluations at both the trace and span level.
In a later section, you'll learn how to create LLM-specific spans, which let you automatically log details such as token cost and model name.
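As a preview, an LLM-specific span might look like the following. This is a hedged sketch: passing type="llm" and a model name to @observe is an assumption here, and the exact parameter names may differ in your DeepEval version, so check the span-type docs before relying on it. The try/except falls back to a no-op decorator so the shape can be tried without DeepEval installed.

```python
# Hedged sketch of an LLM-specific span; the type="llm" and model
# arguments to @observe are assumptions, not confirmed API.
try:
    from deepeval.tracing import observe
except ImportError:
    # Fallback no-op decorator so this sketch runs without deepeval
    def observe(*args, **kwargs):
        def decorator(fn):
            return fn
        return decorator

@observe(type="llm", model="gpt-4o")
def call_llm(prompt: str) -> str:
    # Placeholder response: a real app would call its LLM client here
    return f"LLM response to: {prompt}"

print(call_llm("Hello"))
```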
Run an Online Evaluation
On Confident AI, you can run online evaluations both end-to-end (metrics on the trace) and at the component level (metrics on the span).
First, create a metric collection and add at least one referenceless metric to it. Then add these lines to your code to automatically run online evals in production:
Python
from openai import OpenAI
from deepeval.tracing import observe, update_current_span
from deepeval.test_case import LLMTestCase

client = OpenAI()

@observe(metric_collection=["My Metrics"])
def llm_app(query: str) -> str:
    res = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": query}
        ]
    ).choices[0].message.content
    update_current_span(test_case=LLMTestCase(input=query, actual_output=res))
    return res
# Call app to send trace to Confident AI
llm_app("Write me a poem.")
Congratulations 🎉! Now whenever you run your LLM app, all traces will be logged and evaluated on Confident AI. Go to the Observatory to check it out.
In the next section, we will dive deep into online and offline evaluations for tracing.