Introduction to LLM Tracing on Confident AI
Confident AI offers LLM tracing and observability for teams to monitor LLM applications. Think Datadog for LLM apps, but with an additional suite of 30+ evaluation metrics to continuously track performance over time.
Why Use LLM Tracing & Observability on Confident AI?
- Native to DeepEval, the most widely used LLM evaluation framework in the world.
- Tracing is evals-first: you can trace and evaluate literally any component (retrievers, LLMs, tools, agents).
- Only platform where you can:
  - Leverage DeepEval's 50+ metrics.
  - Run evaluations on:
    - Individual spans (component-level)
    - Traces (end-to-end)
    - Threads (conversation evals)
- Access unlimited evaluation use cases for chatbots, text-to-SQL, RAG pipelines, agentic workflows, document Q&A, summarization, code generation, translation, content moderation, and more.
Traces
A trace is a single execution of your LLM app, and logging one captures a single-turn LLM interaction. Running evals on traces is akin to end-to-end LLM evaluation in development.
LLM Tracing: Traces with Evals
Get Started
Get LLM tracing for your LLM app with best-in-class evals.
Advanced Features
You can configure tracing on Confident AI in virtually any way you wish.
FAQs
What evals are offered by Confident AI LLM tracing?
You can run evaluations using metrics for RAG, agents, chatbots, on:
- Traces (end-to-end)
- Spans (individual components)
- Threads (multi-turn conversations)
These can be run either online (evals run as traces are ingested into the platform) or offline (evals run retrospectively).
How will tracing affect my app?
Confident AI tracing is designed to be completely non-intrusive to your application. It:
- Can be disabled/enabled anytime through the `CONFIDENT_TRACING_ENABLED="YES"/"NO"` environment variable.
- Requires no rewrite of your existing code - just add the `@observe` decorator.
- Runs asynchronously in the background with zero impact on latency.
- Fails silently if there are any issues, ensuring your app keeps running.
- Works with any function signature - you can set input/output at runtime.
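To illustrate the on/off switch, here is a plain-Python sketch of the same check. Only the environment variable name comes from the docs above; the `tracing_enabled` helper is hypothetical, since Confident AI reads this variable itself - in practice you just set it in your environment.

```python
import os

# Hypothetical helper mirroring the CONFIDENT_TRACING_ENABLED
# toggle described above ("YES" enables tracing, "NO" disables it).
def tracing_enabled() -> bool:
    return os.getenv("CONFIDENT_TRACING_ENABLED", "YES").upper() == "YES"

os.environ["CONFIDENT_TRACING_ENABLED"] = "NO"
print(tracing_enabled())  # False
```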