Getting Started with LLM Tracing
A 5-minute quickstart to set up LLM tracing on Confident AI.
Installation
Install DeepEval and set up your tracing environment:
Python
pip install deepeval
Don’t forget to login using your API key on Confident AI in the CLI:
deepeval login --confident-api-key YOUR_API_KEY
Or in code:
import deepeval
deepeval.login_with_confident_api_key("YOUR_API_KEY")
If you don’t have an API key, first create an account.
Setup tracing
Python
The @observe decorator is the primary way to instrument your LLM application for tracing.
from openai import OpenAI
from deepeval.tracing import observe

client = OpenAI()

@observe()
def llm_app(query: str) -> str:
    return client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": query}
        ]
    ).choices[0].message.content
# Call app to send trace to Confident AI
llm_app("Write me a poem.")
If your llm_app has more than one function, simply decorate those functions with @observe too.
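For example, a RAG-style app might decorate its retriever and generator separately so each one becomes its own span. This is a minimal sketch: the retriever and generator below return canned strings instead of calling real services, and the try/except falls back to a no-op decorator so the structure can be tried even without DeepEval installed.

```python
# Sketch: decorating multiple functions so each becomes its own span.
# The canned retriever/generator bodies are placeholders for illustration.
try:
    from deepeval.tracing import observe
except ImportError:
    # Fallback no-op decorator so this sketch runs without deepeval
    def observe(*args, **kwargs):
        def decorator(fn):
            return fn
        return decorator

@observe()
def retrieve(query: str) -> list[str]:
    # Placeholder retriever: a real app would query a vector store here
    return [f"Document about: {query}"]

@observe()
def generate(query: str, docs: list[str]) -> str:
    # Placeholder generator: a real app would call an LLM here
    return f"Answer to '{query}' using {len(docs)} document(s)"

@observe()
def llm_app(query: str) -> str:
    docs = retrieve(query)        # child span
    return generate(query, docs)  # child span

print(llm_app("What is tracing?"))
```

Calling llm_app produces one trace containing three spans: the top-level llm_app span with retrieve and generate nested inside it.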
✅ You just created a trace with a span inside it. Go to the Observatory to see your traces there.
What is a Trace and Span?
- Trace: The overall process of tracking and visualizing the execution flow of your LLM application
- Span: Individual units of work within your application (e.g., LLM calls, tool executions, retrievals)
Each observed function creates a span, and many spans make up a trace. Once tracing is set up, you can run evaluations at both the trace and span level.
In a later section, you'll learn how to create LLM-specific spans, which let you automatically log details such as token cost and model name.
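As a preview, an LLM-specific span might look like the following. This is a hedged sketch: passing type="llm" and a model name to @observe is an assumption here, and the exact parameter names may differ in your DeepEval version, so check the span-type docs before relying on it. The try/except falls back to a no-op decorator so the shape can be tried without DeepEval installed.

```python
# Hedged sketch of an LLM-specific span; the type="llm" and model
# arguments to @observe are assumptions, not confirmed API.
try:
    from deepeval.tracing import observe
except ImportError:
    # Fallback no-op decorator so this sketch runs without deepeval
    def observe(*args, **kwargs):
        def decorator(fn):
            return fn
        return decorator

@observe(type="llm", model="gpt-4o")
def call_llm(prompt: str) -> str:
    # Placeholder response: a real app would call its LLM client here
    return f"LLM response to: {prompt}"

print(call_llm("Hello"))
```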
Run an Online Evaluation
On Confident AI, you can run online evaluations both end-to-end (metrics on the trace) and at the component level (metrics on the span).
First, create a metric collection and add at least one referenceless metric to it. Then add these lines to your code to automatically run online evals in production:
Python
from openai import OpenAI
from deepeval.tracing import observe, update_current_span
from deepeval.test_case import LLMTestCase

client = OpenAI()

@observe(metric_collection=["My Metrics"])
def llm_app(query: str) -> str:
    res = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": query}
        ]
    ).choices[0].message.content
    update_current_span(test_case=LLMTestCase(input=query, actual_output=res))
    return res
# Call app to send trace to Confident AI
llm_app("Write me a poem.")
Congratulations 🎉! Now whenever you run your LLM app, all traces will be logged and evaluated on Confident AI. Go to the Observatory to check it out.
In the next section, we will dive deep into online and offline evaluations for tracing.