
LlamaIndex

LlamaIndex is a framework that makes it easy to build production agents that can find information, synthesize insights, generate reports, and take actions over the most complex enterprise data.

Quickstart

Confident AI provides a LlamaIndexSpanHandler that can be used to trace LlamaIndex spans.

Install the following packages:

pip install -U deepeval llama-index

Login with your API key and configure DeepEval as a span handler for LlamaIndex:

main.py
import llama_index.core.instrumentation as instrument

from deepeval.integrations.llama_index import instrument_llama_index
import deepeval

# Login
deepeval.login("<your-confident-api-key>")

# Let DeepEval collect traces
instrument_llama_index(instrument.get_dispatcher())

Now whenever you use LlamaIndex, DeepEval will collect LlamaIndex spans as traces and publish them to Confident AI.

A span in LlamaIndex represents the execution flow of a particular part of the application's code.

Example Agent

The example agent will be able to perform calculations using the multiply tool.

main.py
import os

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI
import llama_index.core.instrumentation as instrument

from deepeval.integrations.llama_index import instrument_llama_index
import deepeval

# Don't forget to set up tracing
deepeval.login("<your-confident-api-key>")
instrument_llama_index(instrument.get_dispatcher())

os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"

def multiply(a: float, b: float) -> float:
    """Useful for multiplying two numbers."""
    return a * b

agent = FunctionAgent(
    tools=[multiply],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You are a helpful assistant that can perform calculations.",
)

Finally, try calling your agent to see your traces on Confident AI:

main.py
import asyncio

...

async def main():
    response = await agent.run("What's 7 * 8?")
    print(response)

if __name__ == "__main__":
    asyncio.run(main())
python main.py

You can directly view the traces in the Observatory by clicking on the link in the output printed in the console.

💡 As the agent runs, it exports spans to the dispatcher. The dispatcher forwards them to the LlamaIndexSpanHandler, which collects them into a trace and sends it to the Observatory.
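The dispatcher-to-handler flow described above can be sketched with a toy dispatcher and span handler in plain Python. This is an illustration of the pattern only, not the actual LlamaIndex instrumentation API; all class and span names here are made up:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Span:
    """A unit of execution; child spans point at their parent."""
    id: str
    parent_id: Optional[str] = None

class SpanHandler:
    """Collects exported spans into a single trace."""
    def __init__(self):
        self.trace = []

    def handle(self, span):
        self.trace.append(span)

class Dispatcher:
    """Forwards every exported span to all registered handlers."""
    def __init__(self):
        self.handlers = []

    def add_span_handler(self, handler):
        self.handlers.append(handler)

    def export(self, span):
        for handler in self.handlers:
            handler.handle(span)

dispatcher = Dispatcher()
handler = SpanHandler()
dispatcher.add_span_handler(handler)

# As the "agent" runs, spans flow through the dispatcher to the handler,
# which accumulates them into one trace.
dispatcher.export(Span("agent.run"))
dispatcher.export(Span("tool.multiply", parent_id="agent.run"))
print([s.id for s in handler.trace])  # → ['agent.run', 'tool.multiply']
```

The real integration follows the same shape: `instrument_llama_index(instrument.get_dispatcher())` registers DeepEval's handler on LlamaIndex's global dispatcher, so every span your application emits ends up in a trace on Confident AI.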

Advanced Usage

Online evals

Confident AI supports online evaluation on your LlamaIndex FunctionAgent, ReActAgent and CodeActAgent.

  1. Replace your Workflow agent with DeepEval’s Agent wrapper
  2. Provide metric_collection as an argument
  3. Run your agent

The Workflow agent wrapper auto-populates the input, actual_output, and tools_called fields of LLMTestCase for each Agent. Therefore, you can’t include metrics that require other parameters (e.g. retrieval_context) when creating your metric collection.
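Conceptually, for each run the wrapper builds a test case shaped like the following. It is shown here as a plain dict for illustration; the real object is deepeval's LLMTestCase, and the field values are hypothetical:

```python
# Hypothetical shape of the auto-populated test case for one agent run.
# Only these three fields are filled in by the Agent wrapper.
auto_populated = {
    "input": "What's 7 * 8?",         # the query passed to agent.run
    "actual_output": "7 * 8 is 56.",  # the agent's final response (example value)
    "tools_called": [                 # tools the agent invoked along the way
        {"name": "multiply", "input_parameters": {"a": 7, "b": 8}},
    ],
}

# Parameters such as retrieval_context are never populated, which is why
# metrics that require them can't be part of the metric collection.
print("retrieval_context" in auto_populated)  # → False
```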

main.py
from deepeval.integrations.llama_index import FunctionAgent

...

agent = FunctionAgent(
    tools=[multiply],
    llm=OpenAI(model="gpt-4o-mini"),
    system_prompt="You are a helpful assistant that can perform calculations.",
    metric_collection="test_collection_1",
)

async def main():
    response = await agent.run("What's 7 * 8?")
    print(response)

if __name__ == "__main__":
    asyncio.run(main())