Prompt Studio Overview
Confident AI’s Prompt Studio allows you to create and version your prompts. This allows you to:
- Collaborate on and centralize where prompts are stored and edited, even for non-technical team members
- Use it within your LLM application
- Pinpoint which version, or even combination of your prompt versions, performs best
There are a million places you can keep your prompts - on GitHub, in CSV files, in memory in code, Google Sheets, Notion, or even written in a diary hidden in your table drawer. But only by keeping your prompts on Confident AI can you fully leverage Confident AI’s evaluation features.
Prompts are a type of hyperparameter on Confident AI. Other hyperparameters include things like models, embedders, top-K, and max tokens. By running evaluations using the same prompts that are kept on Confident AI, we can tell you which version performs best, and later automatically optimize it for you.
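To see why treating prompts as hyperparameters is useful, here is a minimal, self-contained sketch of the comparison Confident AI performs for you. The combinations and scores below are entirely made up for illustration; on the platform, scores come from real evaluation runs where you log hyperparameters via `evaluate()`:

```python
# Hypothetical average metric scores per hyperparameter combination.
# On Confident AI these would come from evaluation runs, not hardcoded values.
results = {
    ("prompt-v1", "gpt-4o"): 0.72,
    ("prompt-v2", "gpt-4o"): 0.81,
    ("prompt-v1", "gpt-4o-mini"): 0.64,
    ("prompt-v2", "gpt-4o-mini"): 0.69,
}

# Pick the (prompt version, model) pair with the highest score.
best_combo, best_score = max(results.items(), key=lambda item: item[1])
print(best_combo, best_score)  # -> ('prompt-v2', 'gpt-4o') 0.81
```

The same idea extends to any hyperparameter you log (top-K, max tokens, embedders): each unique combination becomes a point you can compare across evaluation runs.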
Features Summary
Simple Walkthrough
This walkthrough will show how to use the Prompt Studio on Confident AI for prompt management and testing. Not all features will be covered, but if you follow all the steps you’ll have the prompt features set up.
Create Prompt
Go to Prompt Studio in your project space and create a new prompt. Provide an alias, which is a name unique to your prompt. You must not use the same alias twice in a project space.
Create a Prompt Version
Once you’ve created your first prompt, click on Create new version. The Prompt Editor will pop up; write or paste in your prompt text, along with any {{ variables }} you wish, and click Save.
You can also create a list of prompt messages by clicking on the Prompt Messages tab.
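To make the {{ variables }} syntax concrete, here is an illustrative sketch of how double-curly-brace templating works. This is not deepeval's implementation - when you pull a prompt version, `Prompt.interpolate()` handles substitution for you - it only shows the idea:

```python
import re

def interpolate(template: str, **variables) -> str:
    """Replace {{ variable }} placeholders with supplied keyword values.

    Illustrative only: deepeval's Prompt.interpolate() does this for
    pulled prompt versions; this sketch just demonstrates the templating idea.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"Missing value for variable: {name}")
        return str(variables[name])

    # Match {{ name }} with optional whitespace inside the braces
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)

template = "Answer the question: {{ input }}"
print(interpolate(template, input="What is DNA?"))
# -> Answer the question: What is DNA?
```

Any variable you declare in the Prompt Editor becomes a keyword argument you supply at interpolation time, as shown in the next section.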
Pull and Evaluate a Prompt Version
Once you have your prompt version, pull it to use it in your LLM app:
```python
from deepeval.dataset import EvaluationDataset
from deepeval.prompt import Prompt
from deepeval.test_case import LLMTestCase
from deepeval import evaluate

dataset = EvaluationDataset()
dataset.pull(alias="your-dataset-alias")

# Pull prompt version
prompt = Prompt(alias="your-prompt-alias")
prompt.pull()

for golden in dataset.goldens:
    # Replace with your own variables
    prompt_to_llm = prompt.interpolate(input=golden.input)
    test_case = LLMTestCase(
        input=golden.input,
        actual_output=your_llm_app(prompt_to_llm)  # Replace your_llm_app()
    )
    dataset.test_cases.append(test_case)

# Run an evaluation
evaluate(
    test_cases=dataset.test_cases,
    metrics=[...],
    hyperparameters={"My Prompt": prompt}
)
```
Future Roadmap
- Prompt messages
- Prompt roles
- Create new versions more easily from previous versions
- Prompt suggestions