Introduction to LLM Evaluation on Confident AI

Confident AI’s evaluation features are second to none and 100% integrated with DeepEval. All the features you’ve seen up to this point in the documentation lead up to the LLM evaluation suite.

What does Confident AI’s LLM Evaluation offer?

We offer evals that can evaluate your LLM app at the:

  • End-to-end level
  • Component-level

Including support for:

  • Single-turn
  • Multi-turn, and
  • Multimodal (text and images)

We also offer 50+ metrics that are:

  • Default, battle-tested, open-source, and plug-and-play
  • Custom, research-backed, and easy to create in natural language
  • For any use case, LLM system architecture, or framework
  • Powered by DeepEval

You can also run evaluations in CI/CD environments, as a standalone Python script, on traces in production, or through APIs.
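In a CI/CD pipeline, an evaluation run usually acts as a gate: the job fails when too few test cases pass. The sketch below illustrates that idea in plain Python without calling DeepEval. The `ci_gate` function, the example scores, and both thresholds are hypothetical names and values chosen for illustration, not part of any Confident AI or DeepEval API.

```python
from typing import Sequence


def ci_gate(scores: Sequence[float], threshold: float = 0.7,
            min_pass_rate: float = 1.0) -> int:
    """Return a process exit code: 0 if enough scores meet the threshold, else 1."""
    if not scores:
        # Treat an empty run as a failure so a broken pipeline is not silently green.
        return 1
    passed = sum(score >= threshold for score in scores)
    pass_rate = passed / len(scores)
    return 0 if pass_rate >= min_pass_rate else 1


# Made-up metric scores for three test cases (assumption, not real results).
example_scores = [0.91, 0.78, 0.64]

strict_exit_code = ci_gate(example_scores)                      # every case must pass
relaxed_exit_code = ci_gate(example_scores, min_pass_rate=0.6)  # 60% must pass
print(strict_exit_code, relaxed_exit_code)
```

In a real CI job, the returned value would typically be passed to `sys.exit` so that a nonzero code fails the build.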

You can run evals at the individual component level, or end-to-end if you prefer to treat your LLM app as a black box.

Testing reports are automatically generated each time you run an evaluation using DeepEval or on the UI.

Single-turn evals test single, atomic LLM interactions, either end-to-end or at the component level via LLM tracing.
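To make the end-to-end versus component-level distinction concrete, here is a minimal sketch in plain Python. It does not use DeepEval or Confident AI APIs: the `SingleTurnCase` class, the `retrieve` and `generate` functions, the `keyword_overlap` toy metric, and the sample data are all hypothetical stand-ins chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class SingleTurnCase:
    """One atomic interaction: what went in, what came out, what was expected."""
    input: str
    actual_output: str
    expected_output: str


def keyword_overlap(case: SingleTurnCase) -> float:
    """Toy metric: share of expected words that also appear in the actual output."""
    expected = set(re.findall(r"\w+", case.expected_output.lower()))
    actual = set(re.findall(r"\w+", case.actual_output.lower()))
    return len(expected & actual) / len(expected) if expected else 0.0


# Hypothetical two-component app: a retriever followed by a generator.
DOCS = {"refund": "refunds are issued within 30 days"}


def retrieve(query: str) -> str:
    return next((doc for key, doc in DOCS.items() if key in query.lower()), "")


def generate(context: str) -> str:
    # Stand-in for an LLM call.
    return f"Per our policy, {context}." if context else "I do not know."


def evaluate_query(query: str, expected_context: str, expected_answer: str) -> dict:
    context = retrieve(query)
    answer = generate(context)
    return {
        # Component-level: score the retriever's output on its own.
        "retriever": keyword_overlap(SingleTurnCase(query, context, expected_context)),
        # End-to-end: treat the app as a black box and score only the final answer.
        "end_to_end": keyword_overlap(SingleTurnCase(query, answer, expected_answer)),
    }


good = evaluate_query("How do I get a refund?",
                      "refunds are issued within 30 days",
                      "refunds are issued within 30 days")
bad = evaluate_query("How long does shipping take?",
                     "orders ship within 2 days",
                     "orders ship within 2 days")
print(good, bad)
```

In the second query, both scores drop to zero, but only the component-level score shows that the retriever returned nothing. An end-to-end score alone would flag the failure without locating it.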

Video: Single-Turn Testing Reports

Get Started

Get LLM evals for your LLM app, powered by DeepEval.

Advanced Features

Confident AI LLM evals go beyond single-turn end-to-end evaluations.

FAQs

I already use DeepEval, how long will setup take?

If you’re already using DeepEval, setting up Confident AI will take less than one minute. All you have to do is create an account and log in via the CLI using the API key available to you on the platform.

deepeval login --confident-api-key YOUR_API_KEY

What about online evals in production?

You can run online evals to get performance-over-time graphs by setting up LLM tracing and enabling evaluations in code.
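Conceptually, a performance-over-time graph is a set of metric scores from production traces aggregated by time bucket. As a rough sketch in plain Python (not Confident AI's implementation), the hypothetical `daily_average` helper below averages made-up scores per calendar day. The timestamps and scores are illustrative values, not real results.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_average(scored_traces):
    """Group (timestamp, score) pairs by calendar day and average each group."""
    buckets = defaultdict(list)
    for timestamp, score in scored_traces:
        buckets[timestamp.date().isoformat()].append(score)
    # Sort by day so the result reads as a time series.
    return {day: mean(scores) for day, scores in sorted(buckets.items())}


# Made-up (timestamp, metric score) pairs standing in for evaluated traces.
traces = [
    (datetime(2025, 1, 2, 11, 15), 0.25),
    (datetime(2025, 1, 1, 9, 0), 0.5),
    (datetime(2025, 1, 1, 17, 30), 1.0),
]

series = daily_average(traces)
print(series)
```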

Can I run evals on the cloud instead of locally?

Certainly. Check out this section to learn how to start creating metrics on the cloud without needing to code.
