Every request your app already makes becomes an OpenTracy trace.
The engine can fan out to 13 providers — keep calling model="gpt-4o",
or switch to model="anthropic/claude-sonnet-4-6" without touching
your auth code.
Routing aliases come for free: later, point model="smart" at a
distilled student model without the app ever knowing.
That’s it. Your app makes the same API calls, gets the same response shape,
but every call is traced in ClickHouse and the engine handles routing /
fallback / retry / cost tracking.
An alias is a logical name you define once in the engine, then call by name:
```python
# In the engine config, alias "smart" → gpt-4o with claude-sonnet-4-6 fallback
# Your app:
resp = client.chat.completions.create(
    model="smart",  # alias, resolved by the engine
    messages=[{"role": "user", "content": "..."}],
)
```
Later, when you’ve distilled a student, point "smart" at the student.
The app code doesn’t change — the model upgrade is a config change
on the engine side.
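What that engine-side change might look like — this YAML is a hypothetical sketch of an alias entry, not the engine's documented config format:

```yaml
aliases:
  smart:
    model: gpt-4o                        # today: the teacher
    fallback: anthropic/claude-sonnet-4-6
    # after distillation, swap only this line:
    # model: my-org/smart-student-v1     # hypothetical student model id
```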
```python
resp.choices[0].message.content  # the answer
resp.usage.prompt_tokens, resp.usage.completion_tokens

# Extras: not in upstream OpenAI responses
resp._cost        # USD for this call
resp._latency_ms  # total latency including provider
resp._routing     # {"alias": "smart", "selected_model": "gpt-4o", ...}
```
The extras are under single-underscore names so they don’t collide with
any future OpenAI SDK field.
Streaming works unmodified. The engine translates upstream streaming
formats (Anthropic SSE, Bedrock event-stream, etc.) into OpenAI’s SSE
shape, so your client code doesn’t need per-provider logic:
```python
stream = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "count to 5"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
```
Tool / function calls translate across providers. You pass OpenAI-shaped
tools and tool_choice, and the engine adapts them to Anthropic’s
tools, Gemini’s function declarations, etc.:
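A sketch of the app side — the `get_weather` tool here is a hypothetical example, but the `tools` / `tool_choice` shapes are the standard OpenAI ones (the call itself is shown commented, since it needs a running engine):

```python
# Standard OpenAI tool schema; the engine adapts it per provider.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# The same call shape, whatever provider the model name resolves to:
# resp = client.chat.completions.create(
#     model="anthropic/claude-sonnet-4-6",
#     messages=[{"role": "user", "content": "Weather in Oslo?"}],
#     tools=tools,
#     tool_choice="auto",
# )
```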
The engine has to be reachable from your app. For production:
run the engine in the same VPC / network as your app, or expose it
on a trusted internal hostname. Don’t put the engine on the public
internet without auth.
By default every request is traced with full prompt and response text.
If you handle PII, set OPENTRACY_TRACE_REDACT=true or
OPENTRACY_TRACE_CONTENT=false on the engine — see
Traces → Privacy.
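On the engine host, that is plain environment configuration. The comments below paraphrase the behavior described above; Traces → Privacy has the authoritative semantics:

```shell
# On the engine host, before starting the engine:
export OPENTRACY_TRACE_REDACT=true     # keep traces, redact prompt/response text
# or, to drop message content from traces entirely:
export OPENTRACY_TRACE_CONTENT=false
```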