Auto-Instrumentation

Automatically trace LLM calls without modifying your application code.

How It Works

When you call tracium.trace() or tracium.init(), Tracium automatically:

  1. Detects which LLM libraries are installed
  2. Patches their API methods to intercept calls
  3. Creates traces and spans automatically
  4. Captures inputs, outputs, and token usage

Important: Auto-instrumentation patches libraries at import time. Call tracium.trace() before importing and using LLM clients for best results.

Enabling Auto-Instrumentation

import tracium

# Option 1: One-line setup (enables all auto-instrumentation)
tracium.trace()

# Option 2: With explicit configuration
client = tracium.init(
    api_key="sk_live_...",
    auto_instrument_llm_clients=True,  # OpenAI, Anthropic, Google
    auto_instrument_langchain=True,
    auto_instrument_langgraph=True,
)

# Now import and use your LLM clients normally
from openai import OpenAI

openai_client = OpenAI()

# All calls are automatically traced!

Supported Libraries

LLM Providers

  • OpenAI (Chat, Completions)
  • Anthropic (Claude)
  • Google Generative AI (Gemini)

Frameworks

  • LangChain
  • LangGraph

Web Frameworks

  • FastAPI / Starlette
  • Flask
  • Django
  • Celery

WSGI

  • Generic WSGI middleware
  • Any WSGI-compatible server
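
With any of the supported web frameworks, the same one-line setup is enough: LLM calls made inside request handlers are recorded as part of that request's trace. Below is a minimal Flask sketch; the route, payload shape, and model choice are illustrative assumptions, not part of the Tracium API.

import tracium
tracium.trace()  # enable auto-instrumentation before importing LLM clients

from flask import Flask, request
from openai import OpenAI

app = Flask(__name__)
openai_client = OpenAI()

@app.route("/ask", methods=["POST"])
def ask():
    # The handler runs inside an instrumented request, so this LLM call
    # is auto-traced within the request's context.
    question = request.get_json()["question"]
    response = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": question}],
    )
    return {"answer": response.choices[0].message.content}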

What Gets Captured

For each LLM call, auto-instrumentation captures:

  • Input Messages: The prompt/messages sent to the LLM
  • Output: The response from the LLM
  • Model ID: Which model was used (e.g., gpt-4, claude-3)
  • Token Usage: Input tokens, output tokens, cached tokens
  • Latency: Time taken for the API call
  • Errors: Any errors or exceptions that occurred
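
These values correspond to fields on the provider's own response objects. As a rough illustration for an OpenAI chat completion (the exact mapping inside Tracium is an internal detail and not shown here):

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)

# The kinds of data auto-instrumentation captures, as exposed by OpenAI:
print(response.model)                       # model ID, e.g. "gpt-4-0613"
print(response.usage.prompt_tokens)         # input tokens
print(response.usage.completion_tokens)     # output tokens
print(response.choices[0].message.content)  # output text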

Streaming Support

Auto-instrumentation fully supports streaming responses:

import tracium
from openai import OpenAI

tracium.trace()

client = OpenAI()

# Streaming is automatically traced
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

# Each chunk is captured, and the full response is recorded
# when the stream completes
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Disabling Auto-Instrumentation

You can selectively disable auto-instrumentation for specific libraries:

import tracium

# Disable specific integrations
client = tracium.init(
    api_key="sk_live_...",
    auto_instrument_llm_clients=True,  # OpenAI, Anthropic, Google
    auto_instrument_langchain=False,   # Disable LangChain
    auto_instrument_langgraph=False,   # Disable LangGraph
)

Combining with Manual Tracing

Auto-instrumentation works seamlessly with manual traces. When you create a manual trace, auto-instrumented calls become spans within that trace:

import tracium
from openai import OpenAI

client = tracium.init()
openai = OpenAI()

user_query = "Summarize last week's support tickets"  # example input

with client.agent_trace(agent_name="my-agent") as trace:
    # Manual span for your logic
    with trace.span(span_type="plan", name="analyze") as span:
        span.record_input({"query": user_query})
        plan = create_plan(user_query)  # your own planning logic
        span.record_output({"plan": plan})

    # This OpenAI call is auto-traced as a span within "my-agent"
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": plan}],
    )

Context Propagation

Tracium automatically propagates context across:

  • Threads - ThreadPoolExecutor, Thread
  • Async - asyncio tasks and coroutines
  • Web requests - FastAPI, Flask, Django request handlers

This means spans created in child threads or async tasks are automatically linked to their parent trace:

import asyncio

import tracium
from openai import AsyncOpenAI

tracium.trace()

client = AsyncOpenAI()

async def process_messages(messages: list[str]):
    # All these parallel calls are traced within the same context
    tasks = [
        client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": msg}],
        )
        for msg in messages
    ]
    return await asyncio.gather(*tasks)
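
The same applies to worker threads. A minimal sketch, assuming the synchronous OpenAI client and a manual parent trace as shown earlier (the agent name and prompts are illustrative):

from concurrent.futures import ThreadPoolExecutor

import tracium
from openai import OpenAI

client = tracium.init()
openai_client = OpenAI()

def summarize(text: str) -> str:
    # Runs in a worker thread; per the context propagation above, this
    # auto-traced call is linked to the parent trace that was active
    # when the task was submitted.
    response = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content

with client.agent_trace(agent_name="batch-summarizer"):
    with ThreadPoolExecutor(max_workers=4) as pool:
        summaries = list(pool.map(summarize, ["doc one", "doc two", "doc three"]))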

Best Practices

Initialize Early

Call tracium.trace() at the start of your application, before importing LLM clients, to ensure all calls are captured.
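
A minimal sketch of the recommended ordering:

import tracium
tracium.trace()  # initialize first, so patching happens before client imports

from openai import OpenAI  # imported after tracing is enabled

client = OpenAI()  # calls from this client are captured automatically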

Use Default Settings

Set default_agent_name and default_tags during initialization to automatically apply them to all traces.
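
For example (a sketch using the documented parameter names; the value types shown for default_tags are an assumption):

import tracium

client = tracium.init(
    api_key="sk_live_...",
    default_agent_name="support-bot",       # applied to traces by default
    default_tags=["production", "v2"],      # assumed to be a list of strings
)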

Combine with Manual Spans

Use manual spans to add context around auto-traced LLM calls, such as retrieval steps or business logic.