Creating Custom Metrics via Evals API

All metrics you create and update are by definition custom metrics. Default metrics, which are all DeepEval metrics, are available to be added to any metric collection anytime, and doesn’t require to be created.

Create Metric

Similar to metric collections, your metric name and multiTurn combination MUST BE UNIQUE within a project.

Single-turn

💡

A single-turn custom metric on Confident AI uses DeepEval’s “G-Eval” metric under-the-hood,

The request body should be strictly the an object of "metrics" to a list of Metric data models, but additionally requiring either the criteria and/or evaluationSteps, and evaluationParams:

POST - /v1/metrics


curl -X POST "https://api.confident-ai.com/v1/metrics" \
     -H "Content-Type: application/json" \
     -H "CONFIDENT_API_KEY: <PROJECT-API-KEY>" \
     -d '{          
            "name": "Correctness",
            "criteria": "Determine if the 'actual output' is correct based on the 'expected output'.",
            "evaluationParams": ["actualOutput", "expectedOutput"],
            "multiTurn": false
        }'

response


{
  "success": true,
  "data": {"id": "METRIC-ID"}
}

💡

All custom metrics uses DeepEval’s G-Eval implementation. You should learn more about it to learn what a good criteria, evaluationSteps, and evaluationParams looks like.

Multi-turn

Also known as a “Conversational G-Eval” metric (a multi-turn custom metric offered by DeepEval), the request body should be strictly the an object of "metrics" to a list of Metric data models, but additionally requiring either the criteria and/or evaluationSteps:

POST - /v1/metrics


curl -X POST "https://api.confident-ai.com/v1/metrics" \
     -H "Content-Type: application/json" \
     -H "CONFIDENT_API_KEY: <PROJECT-API-KEY>" \
     -d '{
            "metrics": [
                {            
                    "name": "Relevancy",
                    "criteria": "Determine if the assistant answers are relevant to what the user is asking.",
                    "multiTurn": true
                }
            ]
         }'

response


{
  "success": true,
  "data": {"ids": ["METRIC-ID"]}
}

You don’t have to provide evaluationParams when creating a multi-turn, conversational metric.

Update Metric

The request body should be strictly the Metric data model, but additionally it is required that after updating either the criteria and/or evaluationSteps, and evaluationParams still has to be populated:

PUT - /v1/metrics/:id


curl -X PUT "https://api.confident-ai.com/v1/metrics/:id" \
     -H "Content-Type: application/json" \
     -H "CONFIDENT_API_KEY: <PROJECT-API-KEY>" \
     -d '{
            "criteria": "Determine if the 'actual output' is correct based on the 'input'.",
            "evaluationParams": ["input", "actualOutput"]
         }'

response


{"success": true}

You CANNOT change the multiTurn value of a metric once it has been created.

Get Metric(s)

Finally, you can list all metrics in your product as follows (including default ones):

GET - /v1/metrics


curl -X GET "https://api.confident-ai.com/v1/metrics" \
     -H "CONFIDENT_API_KEY: <PROJECT-API-KEY>"

response


{
    "success": true,
    "data": {
        "metrics": [{
            "id": "METRIC-ID-1",
            "name": "Correctness",
            "criteria": "Determine if the 'actual output' is correct based on the 'input'.",
            "evaluationParams": ["input", "actualOutput"],
            "multiTurn": false,
        },
        {
            "id": "METRIC-ID-2",
            "name": "Relevancy",
            "criteria": "Determine if the assistant answers are relevant to what the user is asking.",
            "multiTurn": true
        }]
    }
}

Batch Create Metrics

You can also batch create metrics by providing the Metric data model through the "metrics" key:

POST - /v1/batch-metrics


curl -X POST "https://api.confident-ai.com/v1/batch-metrics" \
     -H "Content-Type: application/json" \
     -H "CONFIDENT_API_KEY: <PROJECT-API-KEY>" \
     -d '{
            "metrics": [
                {            
                    "name": "Correctness",
                    "criteria": "Determine if the 'actual output' is correct based on the 'expected output'.",
                    "evaluationParams": ["actualOutput", "expectedOutput"],
                    "multiTurn": false
                }
            ]
         }'

response


{
  "success": true,
  "data": {"ids": ["METRIC-ID"]}
}