Get human annotation count and optimization eligibility
Returns the optimization-eligibility snapshot for a metric.
count is the number of evaluations that have both an AI score and a
human-provided score (annotation). For FULL_PROMPT/PARTIAL_PROMPT metrics
this means the AI judged the evaluation and a human later reviewed it.
For HUMAN_EVALUATION metrics both fields hold the same human-provided
value (for analytics compatibility).
disagreementCount is the subset of count where the human verdict flips
the AI's verdict. Both scores are binarized at 0.5 (any score >= 0.5 is class 1,
anything below is class 0), and a row is counted only when the two classes
differ. These are the rows the optimizer learns from.
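The binarization rule above can be sketched in a few lines of Python. This is an illustration only, not the service's implementation: the function names and the sample score pairs are made up, and only the 0.5 cut-off and the "classes differ" rule come from the description.

```python
from typing import Iterable, Tuple

THRESHOLD = 0.5  # binarization cut-off stated in the description


def to_class(score: float) -> int:
    """Map a score to class 1 if it is >= 0.5, otherwise class 0."""
    return 1 if score >= THRESHOLD else 0


def count_disagreements(pairs: Iterable[Tuple[float, float]]) -> int:
    """Count (ai_score, human_score) pairs whose binarized classes differ."""
    return sum(1 for ai, human in pairs if to_class(ai) != to_class(human))


# Hypothetical sample rows: (AI score, human score)
rows = [
    (0.9, 0.2),   # AI class 1, human class 0: disagreement
    (0.5, 0.4),   # AI class 1 (0.5 itself is class 1), human class 0: disagreement
    (0.7, 0.6),   # both class 1: agreement
    (0.1, 0.49),  # both class 0: agreement
]
print(count_disagreements(rows))  # 2
```

Note that a large numeric gap such as 0.1 vs 0.49 is not a disagreement, because both scores fall on the same side of 0.5.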
minAnnotations and minDisagreements are the API-enforced thresholds
(mirrored from the metrics-generator hard thresholds). An optimize request is rejected
early unless count >= minAnnotations AND disagreementCount >= minDisagreements.
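A client can apply the same condition before submitting an optimize request. The sketch below is a minimal local check under that assumption; the function name is hypothetical and the threshold values in the usage lines are invented for illustration, since the real values are returned by this endpoint.

```python
def can_optimize(
    count: int,
    disagreement_count: int,
    min_annotations: int,
    min_disagreements: int,
) -> bool:
    """Return True only when both thresholds are met (the AND condition above)."""
    return count >= min_annotations and disagreement_count >= min_disagreements


# Invented numbers: enough annotations, but one disagreement short.
print(can_optimize(30, 4, min_annotations=20, min_disagreements=5))  # False
print(can_optimize(30, 5, min_annotations=20, min_disagreements=5))  # True
```

Both conditions must hold, so a metric with many annotations but few disagreements is still not eligible.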
Authorizations
API key authorization. Pass your API key in the Authorization header as a Bearer token. Both new (gsk_*) and legacy (gsk-*) API keys are accepted, for example Authorization: Bearer gsk_... or Authorization: Bearer gsk-....
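As a rough sketch of the header format, the snippet below builds (but does not send) a GET request with Python's standard library. The URL is a placeholder, because this page does not state the host or path, and the key value is a dummy; only the Authorization: Bearer scheme is taken from the text above.

```python
import urllib.request

# Placeholder only: the real host and path are not given on this page.
EXAMPLE_URL = "https://api.example.com/placeholder-path"


def build_request(url: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the API key as a Bearer token."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


req = build_request(EXAMPLE_URL, "gsk_example")  # dummy key, not a real credential
print(req.get_header("Authorization"))  # Bearer gsk_example
```

Passing the request to urllib.request.urlopen would send it; that step is left out here because the endpoint URL is an assumption.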
Path Parameters
Metric ID
Response
Annotation/disagreement counts and thresholds retrieved successfully
count: Evaluations with both an AI score and a human score.
disagreementCount: Subset of count where the human verdict flips the AI's (AI and human on opposite sides of 0.5).
minAnnotations: Minimum annotations required to optimize this metric.
minDisagreements: Minimum disagreements required to optimize this metric.
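One way to consume the response is to parse it into a small typed object. The sketch below assumes the four fields arrive as integers in a flat top-level JSON object, which this page does not confirm; the class name, the shortfall helper, and the sample payload are all hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EligibilitySnapshot:
    count: int
    disagreement_count: int
    min_annotations: int
    min_disagreements: int

    def shortfall(self) -> Tuple[int, int]:
        """Return (annotations still needed, disagreements still needed)."""
        return (
            max(0, self.min_annotations - self.count),
            max(0, self.min_disagreements - self.disagreement_count),
        )


def parse_snapshot(body: str) -> EligibilitySnapshot:
    """Parse a JSON body, assuming the four fields sit at the top level."""
    data = json.loads(body)
    return EligibilitySnapshot(
        count=int(data["count"]),
        disagreement_count=int(data["disagreementCount"]),
        min_annotations=int(data["minAnnotations"]),
        min_disagreements=int(data["minDisagreements"]),
    )


# Invented payload for illustration only.
sample = '{"count": 30, "disagreementCount": 4, "minAnnotations": 20, "minDisagreements": 5}'
snapshot = parse_snapshot(sample)
print(snapshot.shortfall())  # (0, 1)
```

A shortfall of (0, 0) corresponds to the eligibility condition being met; any positive entry says how many more rows of that kind are needed.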