Auto-Instrumentation

Automatically trace LLM calls without modifying your application code.

How It Works

When you call tracium.trace() or tracium.init(), Tracium automatically:

  1. Detects which LLM libraries are installed
  2. Patches their API methods to intercept calls
  3. Creates traces and spans automatically
  4. Captures inputs, outputs, and token usage

Important: Auto-instrumentation patches libraries at import time. Call tracium.trace() before importing and using LLM clients for best results.

Enabling Auto-Instrumentation

import tracium

# Option 1: One-line setup (enables all auto-instrumentation)
tracium.trace()

# Option 2: With explicit configuration
client = tracium.init(
    api_key="sk_live_...",
    auto_instrument_llm_clients=True,  # OpenAI, Anthropic, Google
    auto_instrument_langchain=True,
    auto_instrument_langgraph=True,
)

# Now import and use your LLM clients normally
from openai import OpenAI

openai_client = OpenAI()

# All calls are automatically traced!

Supported Libraries

LLM Providers

  • OpenAI (Chat, Completions)
  • Anthropic (Claude)
  • Google Generative AI (Gemini)

Frameworks

  • LangChain
  • LangGraph

Web Frameworks

  • FastAPI / Starlette
  • Flask
  • Django
  • Celery

WSGI

  • Generic WSGI middleware
  • Any WSGI-compatible server
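
With any of the supported web frameworks, the same one-line setup is enough: LLM calls made inside request handlers are recorded as part of that request's trace. Below is a minimal Flask sketch; the route, payload shape, and model choice are illustrative assumptions, not part of the Tracium API.

import tracium
tracium.trace()  # enable auto-instrumentation before importing LLM clients

from flask import Flask, request
from openai import OpenAI

app = Flask(__name__)
openai_client = OpenAI()

@app.route("/ask", methods=["POST"])
def ask():
    # The handler runs inside an instrumented request, so this LLM call
    # is auto-traced within the request's context.
    question = request.get_json()["question"]
    response = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": question}],
    )
    return {"answer": response.choices[0].message.content}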

What Gets Captured

For each LLM call, auto-instrumentation captures:

  • Input Messages: The prompt/messages sent to the LLM
  • Output: The response from the LLM
  • Model ID: Which model was used (e.g., gpt-4, claude-3)
  • Token Usage: Input tokens, output tokens, cached tokens
  • Latency: Time taken for the API call
  • Errors: Any errors or exceptions that occurred
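
These values correspond to fields on the provider's own response objects. As a rough illustration for an OpenAI chat completion (the exact mapping inside Tracium is an internal detail and not shown here):

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)

# The kinds of data auto-instrumentation captures, as exposed by OpenAI:
print(response.model)                       # model ID, e.g. "gpt-4-0613"
print(response.usage.prompt_tokens)         # input tokens
print(response.usage.completion_tokens)     # output tokens
print(response.choices[0].message.content)  # output text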

Streaming Support

Auto-instrumentation fully supports streaming responses:

import tracium
from openai import OpenAI

tracium.trace()

client = OpenAI()

# Streaming is automatically traced
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)

# Each chunk is captured, and the full response is recorded
# when the stream completes
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Disabling Auto-Instrumentation

You can selectively disable auto-instrumentation for specific libraries:

import tracium

# Disable specific integrations
client = tracium.init(
    api_key="sk_live_...",
    auto_instrument_llm_clients=True,  # OpenAI, Anthropic, Google
    auto_instrument_langchain=False,   # Disable LangChain
    auto_instrument_langgraph=False,   # Disable LangGraph
)

Combining with Manual Tracing

Auto-instrumentation works seamlessly with manual traces. When you create a manual trace, auto-instrumented calls become spans within that trace:

import tracium
from openai import OpenAI

client = tracium.init()
openai = OpenAI()

user_query = "Summarize last week's support tickets"  # example input

with client.agent_trace(agent_name="my-agent") as trace:
    # Manual span for your logic
    with trace.span(span_type="plan", name="analyze") as span:
        span.record_input({"query": user_query})
        plan = create_plan(user_query)  # your own planning logic
        span.record_output({"plan": plan})

    # This OpenAI call is auto-traced as a span within "my-agent"
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": plan}],
    )

Context Propagation

Tracium automatically propagates context across:

  • Threads - ThreadPoolExecutor, Thread
  • Async - asyncio tasks and coroutines
  • Web requests - FastAPI, Flask, Django request handlers

This means spans created in child threads or async tasks are automatically linked to their parent trace:

import asyncio

import tracium
from openai import AsyncOpenAI

tracium.trace()

client = AsyncOpenAI()

async def process_messages(messages: list[str]):
    # All these parallel calls are traced within the same context
    tasks = [
        client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "user", "content": msg}],
        )
        for msg in messages
    ]
    return await asyncio.gather(*tasks)
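
The same applies to worker threads. A minimal sketch, assuming the synchronous OpenAI client and a manual parent trace as shown earlier (the agent name and prompts are illustrative):

from concurrent.futures import ThreadPoolExecutor

import tracium
from openai import OpenAI

client = tracium.init()
openai_client = OpenAI()

def summarize(text: str) -> str:
    # Runs in a worker thread; per the context propagation above, this
    # auto-traced call is linked to the parent trace that was active
    # when the task was submitted.
    response = openai_client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.choices[0].message.content

with client.agent_trace(agent_name="batch-summarizer"):
    with ThreadPoolExecutor(max_workers=4) as pool:
        summaries = list(pool.map(summarize, ["doc one", "doc two", "doc three"]))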

Best Practices

Initialize Early

Call tracium.trace() at the start of your application, before importing LLM clients, to ensure all calls are captured.
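
A minimal sketch of the recommended ordering:

import tracium
tracium.trace()  # initialize first, so patching happens before client imports

from openai import OpenAI  # imported after tracing is enabled

client = OpenAI()  # calls from this client are captured automatically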

Use Default Settings

Set default_agent_name and default_tags during initialization to automatically apply them to all traces.
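
For example (a sketch using the documented parameter names; the value types shown for default_tags are an assumption):

import tracium

client = tracium.init(
    api_key="sk_live_...",
    default_agent_name="support-bot",       # applied to traces by default
    default_tags=["production", "v2"],      # assumed to be a list of strings
)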

Combine with Manual Spans

Use manual spans to add context around auto-traced LLM calls, such as retrieval steps or business logic.