Introducing OpenTracy: Automated Distillation for Production LLMs

OpenTracy Team
announcement · product

Today we're launching OpenTracy, a platform that automatically creates Small Language Models from your production traces, cutting inference costs by 57% on average.


The Problem

Running LLMs in production is expensive. Most teams start with GPT-4 or Claude for quality, then struggle to optimize costs as they scale. The options are limited:

  • Prompt engineering: Limited gains, lots of trial and error
  • Caching: Only helps with exact matches
  • Cheaper models: Quality drops significantly

Our Solution

OpenTracy takes a different approach. We analyze your production traces (the actual inputs and outputs from your LLM calls) and use them to train a smaller, specialized model that handles your specific use case.
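In concrete terms, this means treating logged prompt–response pairs from the large model as supervised training data for a smaller one. A minimal sketch of that trace-to-training-data step, with hypothetical field names rather than OpenTracy's actual schema, might look like:

```python
# Hedged sketch: turn logged LLM traces into supervised fine-tuning pairs.
# The field names ("prompt", "completion", "rating") are illustrative,
# not OpenTracy's real trace format.

def traces_to_training_pairs(traces, min_rating=4):
    """Keep well-rated, complete traces and shape them for fine-tuning."""
    pairs = []
    for trace in traces:
        # Drop incomplete or low-quality traces before training.
        if not trace.get("completion") or trace.get("rating", 0) < min_rating:
            continue
        pairs.append({
            "input": trace["prompt"],
            "target": trace["completion"],
        })
    return pairs

traces = [
    {"prompt": "Summarize: ...", "completion": "A short summary.", "rating": 5},
    {"prompt": "Classify: ...", "completion": "", "rating": 5},       # incomplete
    {"prompt": "Translate: ...", "completion": "Bonjour.", "rating": 2},  # low quality
]
print(traces_to_training_pairs(traces))
# → [{'input': 'Summarize: ...', 'target': 'A short summary.'}]
```

The key idea is that the large model has already done the hard work in production; the small model only has to imitate it on your distribution of inputs.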

How It Works

  • Connect your traces: Point OpenTracy at your production logs
  • Automated curation: We filter and prepare high-quality training data
  • Distillation: Train a small model on your specific domain
  • Evaluation: Comprehensive testing against your success criteria
  • Deployment: One-click deploy to your infrastructure
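As a rough illustration, the five steps above compose into a linear pipeline. This is a hedged sketch with stub functions standing in for each stage; none of these names are OpenTracy's actual API:

```python
# Illustrative pipeline skeleton for the five steps above.
# Every function here is a stand-in, not a real OpenTracy interface.

def connect_traces(source):
    """Step 1: pull raw traces from production logs (stubbed data)."""
    return [{"prompt": "p1", "completion": "c1"},
            {"prompt": "p2", "completion": ""}]

def curate(traces):
    """Step 2: filter out incomplete traces."""
    return [t for t in traces if t["completion"]]

def distill(dataset):
    """Step 3: train a small model on the curated data (stubbed)."""
    return {"model": "small-model", "examples": len(dataset)}

def evaluate(model):
    """Step 4: test the model against success criteria (stubbed)."""
    return {"passed": True}

def deploy(model):
    """Step 5: ship the model only if evaluation passed."""
    return f"deployed:{model['model']}"

dataset = curate(connect_traces("production-logs"))
model = distill(dataset)
report = evaluate(model)
status = deploy(model) if report["passed"] else "rejected"
print(status)  # → deployed:small-model
```

The important property is the gate between steps 4 and 5: a model that fails evaluation never reaches deployment.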

Results

In our beta, customers saw:

  • 57% average cost reduction
  • Sub-100ms latency (down from 2-3 seconds)
  • 95%+ quality retention on domain-specific tasks

Get Started

OpenTracy is available today. Sign up for free at opentracy.dev and start cutting your LLM costs.