Skip to Content
Confident AI is free to try . No credit card required.
Evals API
Datasets
Data Models

Data Models for Datasets

A dataset is a collection of goldens, which at evaluation time will be used for creating test cases that are ready for evaluation.

Dataset: Collection of goldens, can be multi-turn or single-turn.

Golden: Similar to test cases, represents interactions with your LLM app. However, a golden does not contain the outcome/output of a particular interaction, there is not ready for evaluation. Goldens can either be single or multi-turn.

Similar to test runs, dataset can either be single or multi-turn. This means you cannot add a Golden to a single-turn dataset, and vice versa.

Dataset

dataset.d.ts
type Dataset = { alias: string; goldens?: Golden[]; conversationalGoldens?: []; multiTurn?: boolean; // defaults to false }

The alias of your dataset must be unique for a particular project.

Goldens

Single-Turn

A single-turn golden is represented simply by a Golden:

golden.d.ts
type Golden = { input: string; expectedOutput?: string; context?: string[]; expectedTools?: ToolCall[]; actualOutput?: string; retrievalContext?: string[]; toolsCalled?: ToolCall[]; customColumnKeyValues?: Record<string, string>; }

Multi-Turn

A single-turn golden is represented simply by a ConversationalGolden:

conversational-golden.d.ts
type ConversationalGolden = { scenario: string; expectedOutcome?: string; userDescription?: string; customColumnKeyValues?: Record<string, string>; }
Last updated on