Trajectory Evaluators

Trajectory Evaluators in LangChain provide a more holistic approach to evaluating an agent. These evaluators assess the full sequence of actions taken by an agent and their corresponding responses, which we refer to as the "trajectory". This allows you to better measure an agent's effectiveness and capabilities.

A Trajectory Evaluator implements the AgentTrajectoryEvaluator interface, which requires one method:

  • evaluateAgentTrajectory: This method evaluates an agent's trajectory.

This method accepts three main parameters:

  • input: The initial input given to the agent.
  • prediction: The final predicted response from the agent.
  • agentTrajectory: The intermediate steps taken by the agent, given as a list of tuples.

The method returns a dictionary. It is recommended that custom implementations include a score (a float indicating the effectiveness of the agent) and reasoning (a string explaining why that score was given).
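To make the contract above concrete, here is a minimal sketch of a custom evaluator. It does not import anything from LangChain: the types ToolAction, TrajectoryStep, TrajectoryEvalArgs, and TrajectoryEvalResult, the StepBudgetEvaluator class, and the step-budget heuristic are all hypothetical stand-ins chosen for illustration. Steps are modeled as tuples to match the description above; the actual step type in your LangChain version may differ.

```typescript
// Local stand-in types for illustration only; NOT imported from LangChain.
interface ToolAction {
  tool: string;      // name of the tool the agent chose
  toolInput: string; // input passed to that tool
}

// One intermediate step, modeled as an (action, observation) tuple (assumption).
type TrajectoryStep = [ToolAction, string];

interface TrajectoryEvalArgs {
  input: string;                     // initial input given to the agent
  prediction: string;                // final response from the agent
  agentTrajectory: TrajectoryStep[]; // intermediate steps taken
}

interface TrajectoryEvalResult {
  score: number;     // effectiveness of the agent, here between 0 and 1
  reasoning: string; // explanation of the score
}

// Hypothetical heuristic: full marks while the agent stays within a step
// budget, proportionally lower scores beyond it, zero for an empty answer.
function scoreStepBudget(
  args: TrajectoryEvalArgs,
  maxSteps: number
): TrajectoryEvalResult {
  const steps = args.agentTrajectory.length;
  if (args.prediction.trim() === "") {
    return { score: 0, reasoning: "The agent produced no final answer." };
  }
  if (steps <= maxSteps) {
    return {
      score: 1,
      reasoning: `Answered in ${steps} step(s), within the budget of ${maxSteps}.`,
    };
  }
  return {
    score: maxSteps / steps,
    reasoning: `Used ${steps} step(s), exceeding the budget of ${maxSteps}.`,
  };
}

// Evaluator exposing the single required method described above.
class StepBudgetEvaluator {
  private readonly maxSteps: number;

  constructor(maxSteps: number = 3) {
    this.maxSteps = maxSteps;
  }

  async evaluateAgentTrajectory(
    args: TrajectoryEvalArgs
  ): Promise<TrajectoryEvalResult> {
    return scoreStepBudget(args, this.maxSteps);
  }
}
```

With a budget of 2, a trajectory of four steps that ends in a non-empty answer scores 0.5 under this heuristic. A real evaluator would more likely ask a language model to grade the trajectory, but the inputs and the recommended score and reasoning fields stay the same.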

You can capture an agent's trajectory by initializing the agent with the returnIntermediateSteps: true option. The intermediate steps are then returned alongside the final output, so you can collect them without relying on special callbacks.
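As a rough sketch of how a captured run might be handed to an evaluator, the helper below maps an agent result onto the three parameters listed earlier. It assumes that the result exposes the final answer as output and the collected steps as intermediateSteps; the AgentRunResult and TrajectoryArgs types and the toTrajectoryArgs function are hypothetical names, not part of LangChain, so check the result shape of your own agent before relying on it.

```typescript
// Assumed shape of an agent run result when intermediate steps are returned.
// The step type is left generic because it depends on the agent in use.
interface AgentRunResult<Step> {
  output: string;             // final response from the agent (assumed field name)
  intermediateSteps?: Step[]; // present only when step collection is enabled
}

// The three parameters a trajectory evaluator expects.
interface TrajectoryArgs<Step> {
  input: string;
  prediction: string;
  agentTrajectory: Step[];
}

// Hypothetical helper: turn a recorded run into evaluator arguments.
function toTrajectoryArgs<Step>(
  input: string,
  result: AgentRunResult<Step>
): TrajectoryArgs<Step> {
  if (result.intermediateSteps === undefined) {
    // Without recorded steps there is no trajectory to evaluate.
    throw new Error(
      "No intermediate steps found; was the agent created with returnIntermediateSteps enabled?"
    );
  }
  return {
    input,
    prediction: result.output,
    agentTrajectory: result.intermediateSteps,
  };
}
```

Failing loudly when the steps are missing is a deliberate choice here: silently evaluating an empty trajectory would produce a score that says nothing about how the agent actually behaved.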

For a deeper dive into the implementation and use of Trajectory Evaluators, refer to the sections below.