This guide covers the platform-based workflow. If you prefer to generate inferences programmatically (e.g., in a CI/CD pipeline or custom script), see the SDK tutorials instead.
Prerequisites
Before you begin, make sure you have the following set up in the Galtea Dashboard:

- A Product representing your AI system
- A Test with at least one Test Case to run against your endpoint
Workflow Overview
Create an Endpoint Connection
Define how Galtea should call your AI endpoint — URL, authentication, request format, and response extraction.
Create a Version with the Endpoint Connection
Create a new version of your product and attach the endpoint connection to it.
Run a Test from the Dashboard
Select a test and run it against the version. Galtea calls your endpoint for each test case and records the inference results.
Step 1: Create an Endpoint Connection
Navigate to your product in the Dashboard and go to the Endpoint Connections section. Click New Endpoint Connection and configure the following:

- Name — A descriptive name (e.g., “Production Chat API”).
- Type — Select `CONVERSATION` for the primary request/response endpoint.
- URL — The full URL of your AI endpoint (e.g., `https://api.company.com/v1/chat`).
- HTTP Method — Typically `POST`.
- Authentication — Choose the auth type (`Bearer`, `API_KEY`, `Basic`, or `None`) and provide the token.
- Input Template — A Jinja2 template that defines the request body Galtea will send.
- Output Mapping — JSONPath expressions that tell Galtea how to extract values from the response.
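Taken together, a connection for a typical chat endpoint might use values like these. This JSON is purely illustrative — you enter these as individual form fields in the Dashboard, and the key names here are not an import format Galtea consumes:

```json
{
  "name": "Production Chat API",
  "type": "CONVERSATION",
  "url": "https://api.company.com/v1/chat",
  "method": "POST",
  "authentication": { "type": "Bearer", "token": "<YOUR_API_TOKEN>" }
}
```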
Input Template
The input template uses Jinja2 syntax with placeholders that Galtea fills automatically. At minimum, use `{{ input }}` to inject the test case input:
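For example, a minimal template for a chat-style API might look like this (the `message` field name is illustrative — use whatever request body shape your endpoint expects):

```json
{
  "message": "{{ input }}"
}
```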
You can also use `past_turns` to include conversation history:
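A sketch of a template that replays history, assuming each entry in `past_turns` exposes role- and content-style fields (check the Endpoint Connection reference for the exact shape):

```json
{
  "messages": [
    {% for turn in past_turns %}
    { "role": "{{ turn.role }}", "content": "{{ turn.content }}" },
    {% endfor %}
    { "role": "user", "content": "{{ input }}" }
  ]
}
```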
Output Mapping
The output mapping tells Galtea how to extract values from the API response using JSONPath expressions. The `output` key is required:
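For example, for an OpenAI-style chat response the mapping might look like this — the JSONPath expressions depend entirely on your API's response shape, so treat these as placeholders:

```json
{
  "output": "$.choices[0].message.content",
  "retrieval_context": "$.context.documents"
}
```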
Keys other than `output` and `retrieval_context` are saved to the session metadata and become available as `{{ key }}` placeholders in subsequent turns.
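For instance, a custom key captured on the first turn can be echoed back on later turns. Here `conversation_id` is an illustrative custom key, not a special one, and the JSONPaths depend on your API's response:

```json
{
  "output": "$.answer",
  "conversation_id": "$.conversation_id"
}
```

A subsequent turn's input template could then reference `{{ conversation_id }}` in its request body to keep the conversation attached to the same server-side session.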
See Version — Special keys in Output Mapping for a complete reference of how extracted values are stored and reused.
Step 2: Create a Version with the Endpoint Connection
Navigate to your product and create a new Version. When configuring the version:

- Fill in the version name, model, and any other relevant properties.
- In the Conversation Endpoint Connection field, select the endpoint connection you created in Step 1.
If your AI system requires separate endpoints for session initialization or cleanup, you can optionally configure Initialization and Finalization endpoint connections. See Version — Multi-Step Session Lifecycle for details.
Step 3: Run a Test
Once your version is set up with an endpoint connection, you can run tests directly from the Dashboard:

- Navigate to your product’s Tests section.
- Select the test you want to run.
- Choose the version with the configured endpoint connection.
- Start the test run.
Step 4: Evaluate the Results
After the inferences have been generated, you can trigger evaluations:

- Navigate to the session results in the Dashboard.
- Select the Metrics you want to use for the evaluation.
- Run the evaluation.
Collecting Traces During Direct Inference
There are two ways to collect traces during Direct Inference:

- Output Mapping (no code) — Extract traces from the API response using a `traces` key in your output mapping.
- SDK `set_context` (in your handler) — Pass `{{ inference_result_id }}` to your endpoint and use the SDK to create traces from within the handler.
Option 1: Extract Traces via Output Mapping
If your endpoint returns trace data in its response, you can extract it using the `traces` key in the output mapping. Galtea will store each trace object linked to the inference result automatically.
Example API response:
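A response shaped like the following would let Galtea pick up traces. The top-level field names besides `traces` are illustrative, and the corresponding output mapping would include an entry such as `"traces": "$.traces"`:

```json
{
  "answer": "The refund window is 30 days.",
  "traces": [
    {
      "name": "retrieve_documents",
      "type": "RETRIEVER",
      "inputData": { "query": "refund policy" },
      "outputData": { "documents": 3 },
      "latencyMs": 120
    },
    {
      "name": "generate_answer",
      "type": "GENERATION",
      "latencyMs": 850
    }
  ]
}
```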
Galtea reads the `traces` array and creates Trace entities linked to the inference result. Each object in the array must contain at least a `name` field and can include any Trace properties:
| Property | Type | Required | Description |
|---|---|---|---|
| `name` | string | Yes | Name of the traced operation |
| `type` | string | No | One of: `SPAN`, `GENERATION`, `EVENT`, `AGENT`, `TOOL`, `CHAIN`, `RETRIEVER`, `EVALUATOR`, `EMBEDDING`, `GUARDRAIL` |
| `description` | string | No | Human-readable description of the operation |
| `inputData` | object | No | Input parameters passed to the operation |
| `outputData` | object | No | Result returned by the operation |
| `error` | string | No | Error message if the operation failed |
| `latencyMs` | number | No | Execution time in milliseconds |
| `metadata` | object | No | Additional custom metadata |
| `startTime` | string | No | ISO 8601 timestamp when the operation started |
| `endTime` | string | No | ISO 8601 timestamp when the operation completed |
| `parentTraceId` | string | No | ID of the parent trace for hierarchical relationships |
Option 2: Use set_context in Your Endpoint Handler
When running evaluations via Direct Inference, you can collect traces from your endpoint handler by passing the `{{ inference_result_id }}` placeholder in your input template. This lets your endpoint know which inference result the call belongs to, so it can link traces back to Galtea.
1. Add `{{ inference_result_id }}` to your Input Template
Include the placeholder in your endpoint connection’s input template so your handler receives the ID:
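For example (the `message` key and overall body shape are whatever your API expects; only the placeholder name is fixed by Galtea):

```json
{
  "message": "{{ input }}",
  "inference_result_id": "{{ inference_result_id }}"
}
```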
2. Use set_context in Your Endpoint Handler
In your API endpoint, extract the `inference_result_id` from the request and use the SDK’s `set_context` / `clear_context` to associate traces with it:
Any `@trace`-decorated functions called while the context is active will be automatically linked to the inference result in Galtea.
For a complete guide on tracing setup, decorators, and context managers, see the Tracing Agent Operations tutorial.
Learn More
Endpoint Connection
Full reference for configuring endpoint connections
Version
Learn about versions and how endpoint connections integrate with them
Evaluations
Understand how evaluations work
Metrics
Browse available metrics for evaluating your AI
Tracing Agent Operations
Capture and analyze your agent’s internal operations