GET /v1/models
Returns every model the engine knows about, with pricing and per-model routing stats.Response
| Field | Type | Meaning |
|---|---|---|
models[].model_id | string | Canonical ID. Pair with a provider prefix for completions. |
models[].cost_per_1k_tokens | float | USD per 1,000 tokens (blended input+output; see pricing tables). |
models[].num_clusters | int | Clusters this model has an error profile for. |
models[].overall_accuracy | float | Average accuracy across all profiled clusters. |
default_model | string | Model used when "model" is "auto" and the router is unset. |
Curl
GET /health
Used by load balancers and healthchecks.Response (:8080 — gateway)
Response (:8000 — management API)
status is one of healthy, degraded (router loaded but embedder
down, for example), or unhealthy. Treat anything other than healthy
as “don’t route new traffic”.
