@trace decorator to capture every operation your agent performs. Once captured, traces become an evaluation parameter called traces — the full execution path from user input to system output.
Galtea provides built-in metrics like Tool Correctness, but agentic systems are often highly specific. We recommend creating custom judge metrics that check the exact behaviors you care about. Below are practical examples to get you started.
Example 1: Was Tool X called before Tool Y
Example of Judge Prompt:Example 2: Were all attributes collected before calling Tool X?
Example of Judge Prompt:Example 3: Do attributes come from the correct sources?
Example of Judge Prompt:Example 4: Are tools used under required conditions?
Example of Judge Prompt:Example 5: Are the tool outputs actually used?
Ensure tool outputs are actually used. Example of Judge Prompt:Next Steps
Tracing Agent Operations
Capture traces from your agent — required for these metrics to work.
Create Your Own Judge Prompt
Write and refine custom judge prompts for any evaluation criteria.