Evaluation Parameters
To compute the user_satisfaction metric, the following parameters are required:
- input: The user messages sent to the chatbot.
- actual_output: The chatbot’s corresponding responses.
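As a rough illustration of how the two parameters pair up, the sketch below stores a short conversation as one record per turn and checks that both fields are present. The list-of-dicts layout, the sample messages, and the `missing_fields` helper are assumptions made for this example; they are not part of Galtea's API.

```python
# Hypothetical conversation record: one dict per turn, keyed by the two
# parameter names. The list-of-dicts layout is an assumption for illustration.
conversation = [
    {
        "input": "Where is my order?",
        "actual_output": "Could you share your order number so I can check?",
    },
    {
        "input": "It's 12345.",
        "actual_output": "Thanks. Order 12345 shipped yesterday.",
    },
]

REQUIRED_FIELDS = ("input", "actual_output")


def missing_fields(turns):
    """Return (turn_index, field) pairs for any absent or empty required field."""
    return [
        (i, field)
        for i, turn in enumerate(turns)
        for field in REQUIRED_FIELDS
        if not turn.get(field)
    ]


print(missing_fields(conversation))
```

A check like this can catch a turn with no recorded response before the conversation is sent for evaluation.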
How Is It Calculated?
The user_satisfaction score is derived using an LLM-as-a-judge approach with explicit pass criteria:
- Efficiency Check: Was the interaction smooth and direct, or did the user have to rephrase, repeat, or correct the chatbot?
- Sentiment Analysis: Did the user display positive/neutral sentiment or negative sentiment?
Based on these two checks, the judge assigns a binary score:
- 1 (Satisfied): The interaction was efficient and the user’s sentiment was neutral-to-positive.
- 0 (Not Satisfied): The interaction was inefficient, the user expressed frustration, or both.
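The pass criteria above amount to a logical AND of the two checks. The sketch below shows only that final combination step, assuming the LLM judge has already produced an efficiency flag and a sentiment label. The `JudgeVerdict` class and `user_satisfaction_score` function are hypothetical names for illustration, not Galtea's implementation.

```python
from dataclasses import dataclass


@dataclass
class JudgeVerdict:
    """Hypothetical container for the two checks an LLM judge would report."""
    efficient: bool  # False if the user had to rephrase, repeat, or correct
    sentiment: str   # "positive", "neutral", or "negative"


def user_satisfaction_score(verdict: JudgeVerdict) -> int:
    """Return 1 only when both pass criteria hold, otherwise 0."""
    sentiment_ok = verdict.sentiment in ("positive", "neutral")
    return 1 if verdict.efficient and sentiment_ok else 0


# Efficient and neutral: satisfied.
print(user_satisfaction_score(JudgeVerdict(efficient=True, sentiment="neutral")))
# Efficient but frustrated: not satisfied.
print(user_satisfaction_score(JudgeVerdict(efficient=True, sentiment="negative")))
```

Treating any unrecognized sentiment label as a failure is a conservative choice made for this sketch; a real judge prompt would constrain the labels it can return.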
Suggested Test Case Types
The User Satisfaction metric is effective for evaluating Behavior test cases in Galtea, particularly:
- Customer-facing conversations where user experience quality is a priority.
- Support interactions where efficiency and empathy both matter.
- Comparative evaluations to measure experience improvements across product versions.