Auto-Instrumentation
Automatically trace LLM calls without modifying your application code.
How It Works
When you call tracium.trace(), Tracium automatically:
- Detects which LLM libraries are installed
- Patches their API methods to intercept calls
- Creates traces and spans automatically
- Captures inputs, outputs, and token usage
Important: Auto-instrumentation patches libraries at import time. Call tracium.trace() before importing and using LLM clients for best results.
Enabling Auto-Instrumentation
```python
import tracium

# Option 1: One-line setup (enables all auto-instrumentation)
tracium.trace()

# Now import and use your LLM clients normally
from openai import OpenAI

openai_client = OpenAI()
# All calls are automatically traced!
```
Supported Libraries
LLM Providers
- OpenAI (Chat, Completions)
- Anthropic (Claude)
- Google Generative AI (Gemini)
Frameworks
- LangChain
- LangGraph
Web Frameworks
- FastAPI / Starlette
- Flask
- Django
- Celery
WSGI
- Generic WSGI middleware
- Any WSGI-compatible server
What Gets Captured
For each LLM call, auto-instrumentation captures:
| Data | Description |
|---|---|
| Input Messages | The prompt/messages sent to the LLM |
| Output | The response from the LLM |
| Model ID | Which model was used (e.g., gpt-4, claude-3) |
| Token Usage | Input tokens, output tokens, cached tokens |
| Latency | Time taken for the API call |
| Errors | Any errors or exceptions that occurred |
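The table above can be pictured as a single span record. The sketch below is illustrative only; the field names are assumptions, not Tracium's actual schema:

```python
# Illustrative shape of an auto-captured LLM span (hypothetical field names;
# Tracium's real schema may differ)
span_record = {
    "input_messages": [{"role": "user", "content": "Hello!"}],
    "output": "Hi! How can I help?",
    "model_id": "gpt-4",
    "token_usage": {"input": 12, "output": 9, "cached": 0},
    "latency_ms": 842,
    "error": None,
}

# Total tokens can be derived from the usage breakdown
total_tokens = sum(span_record["token_usage"].values())
print(total_tokens)  # 21
```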
Media capture (audio/images)
By default, Tracium does not capture raw media payloads. To capture audio/image data, set capture_media=True during SDK initialization. This is opt-in for privacy reasons, since media content can include sensitive user data.
See the SDK reference for capture_media details and supported providers.
```python
import tracium

# Enable capturing audio/image data (base64) in spans
tracium.trace(
    capture_media=True,
)
```
Streaming Support
Auto-instrumentation fully supports streaming responses:
```python
import tracium
from openai import OpenAI

tracium.trace()
client = OpenAI()

# Streaming is automatically traced
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

# Each chunk is captured, and the full response is recorded
# when the stream completes
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Disabling Auto-Instrumentation
You can selectively disable auto-instrumentation for specific libraries:
```python
import tracium

# Disable specific integrations
client = tracium.init(
    api_key="sk_live_...",
    auto_instrument_llm_clients=True,   # OpenAI, Anthropic, Google
    auto_instrument_langchain=False,    # Disable LangChain
    auto_instrument_langgraph=False,    # Disable LangGraph
)
```
Combining with Manual Tracing
Auto-instrumentation works seamlessly with manual traces. When you create a manual trace, auto-instrumented calls become spans within that trace:
```python
import tracium
from openai import OpenAI

client = tracium.init()
openai = OpenAI()

with client.agent_trace(agent_name="my-agent") as trace:
    # Manual span for your logic
    with trace.span(span_type="plan", name="analyze") as span:
        span.record_input({"query": user_query})
        plan = create_plan(user_query)
        span.record_output({"plan": plan})

    # This OpenAI call is auto-traced as a span within "my-agent"
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": plan}],
    )
```
Context Propagation
Tracium automatically propagates context across:
- Threads - ThreadPoolExecutor, Thread
- Async - asyncio tasks and coroutines
- Web requests - FastAPI, Flask, Django request handlers
This means spans created in child threads or async tasks are automatically linked to their parent trace:
```python
import asyncio
import tracium
from openai import AsyncOpenAI

tracium.trace()
client = AsyncOpenAI()

async def process_messages(messages: list[str]):
    # All these parallel calls are traced within the same context
    tasks = [
        client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": msg}],
        )
        for msg in messages
    ]
    return await asyncio.gather(*tasks)
```
Best Practices
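The linking described above can be pictured with Python's standard contextvars module. This is a simplified sketch of the general mechanism tracing SDKs use, not Tracium's actual implementation:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical "current trace" slot, similar in spirit to what a tracing
# SDK keeps internally
current_trace = contextvars.ContextVar("current_trace", default=None)

def worker():
    # Reads whatever trace was active in the context this runs under
    return current_trace.get()

current_trace.set("trace-123")

# Plain thread pools do NOT copy context automatically; an SDK bridges
# this by running tasks inside a copy of the caller's context
ctx = contextvars.copy_context()
with ThreadPoolExecutor() as pool:
    linked = pool.submit(ctx.run, worker).result()

print(linked)  # prints trace-123: the child thread sees the parent's trace
```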
Initialize Early
Call tracium.trace() at the start of your application, before importing LLM clients, to ensure all calls are captured.
Use Default Settings
Set default_agent_name and default_tags during initialization to automatically apply them to all traces.
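A minimal sketch, assuming default_agent_name and default_tags are accepted as keyword arguments to tracium.trace() (check the SDK reference for the exact signature):

```python
import tracium

# Applied automatically to every trace created afterwards
tracium.trace(
    default_agent_name="support-bot",
    default_tags=["production", "v2"],
)
```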
Combine with Manual Spans
Use manual spans to add context around auto-traced LLM calls, such as retrieval steps or business logic.