Using opentracy directly — the Python-first path for new apps
The Python SDK (opentracy) is the native entry point. Use it if you’re
starting a new project or if you want features (auto-routing, distillation,
trace ingestion) that aren’t part of the OpenAI API shape.
One install pulls a platform-specific wheel with the Go engine binary, the
ONNX embedder, and pre-trained routing weights bundled in. No extras
needed for the core path.
Load the pre-trained router once; it picks the right model per prompt:
auto = ot.load_router(cost_weight=0.5)decision = auto.route("Write a haiku about autumn")print(decision.selected_model) # e.g. "ministral-3b-latest"print(decision.cluster_id) # e.g. 87print(decision.expected_error) # e.g. 0.212print(decision.all_scores) # full score dict
Combined with ot.completion this becomes a cost-optimizing client:
The one-call path — ot.distill() runs the full 4-phase pipeline
in-process and returns a callable Student. Needs opentracy[distill]
and a CUDA GPU.
import opentracy as otstudent = ot.distill( dataset="tickets.jsonl", # path, list[dict], or a callable teacher="openai/gpt-4o", student="llama-3.2-1b", steps=100, quantize="q4_k_m", # or None to skip GGUF export)print(student("Classify: refund please")) # local inference, $0# Ship it behind a logical name — app code never changesstudent.deploy("ticket-classifier")resp = ot.completion(model="ticket-classifier", messages=[...])
Full API: ot.distill reference.For the long-running, queued REST flow against a self-hosted engine
(ClickHouse-backed jobs, UI observability), use
Distiller instead — same engine,
different deployment shape.
acompletion shares its request-preparation path with the sync version,
so force_engine, force_direct, fallbacks, and engine-prefix handling
all behave identically.
If you have existing logs from another LLM provider and want to use them
for dataset building or distillation in OpenTracy, you can import them
directly:
from opentracy import add_trace, add_traces, import_traces# Single traceadd_trace({ "prompt": "Classify: ...", "response": "billing", "model": "openai/gpt-4o", "total_cost_usd": 0.00025, "latency_ms": 340, "metadata": {"source": "legacy-log-export"},})# Batchadd_traces([{...}, {...}, {...}])# From a JSONL fileimport_traces("path/to/exported-traces.jsonl")
From that point on, ot.completion(...) routes through the engine.
Per-call overrides:
# Always engine (even if OPENTRACY_ENGINE_URL is unset):ot.completion(..., force_engine=True)# Always direct (even if OPENTRACY_ENGINE_URL is set):ot.completion(..., force_direct=True)
Why isn’t this automatic? Because silently routing through whatever happens
to be listening on localhost:8080 is a footgun. Opt-in is explicit.
Five providers have dedicated classes (OpenAI, Anthropic, Google, Groq,
Mistral); the remaining seven (DeepSeek, Perplexity, Cerebras, Sambanova,
Together, Fireworks, Cohere) route through a UnifiedClient that speaks
the OpenAI-chat protocol. Bedrock is registered but raises a clear error
on construction — AWS SigV4 is not handled by UnifiedClient yet; use
ot.completion(force_engine=True) instead.
Everything import opentracy as ot exposes publicly:
# Coreot.completion, ot.acompletion, ot.Router, ot.ModelResponse, ot.StreamChunk, ot.parse_model# Multi-providerot.create_client, ot.LLMResponse# Pricingot.model_cost, ot.get_model_info, ot.supported_models# Trace ingestionot.add_trace, ot.add_traces, ot.import_traces# Distillation — one-call + REST clientot.distill, ot.DistillError, ot.Student, ot.StudentErrorot.Distiller, ot.TrainingClient, ot.DistillerError# Local alias registry (distilled students map to logical model names)ot.set_alias, ot.unset_alias, ot.list_aliases, ot.get_alias# Versionot.__version__
Lazy research APIs (load_router, UniRouteRouter, RouterEvaluator,
LLMJudge, …) resolve via __getattr__ — they import the first
time you touch them, so they don’t slow down the initial import opentracy.
Legacy code using import lunar_router as lr keeps working via a
backwards-compat shim that redirects to opentracy and emits a
DeprecationWarning. New code should use import opentracy as ot.