OpenAI Integration

Automatic tracing for OpenAI API calls.

Quick Start

import tracium
from openai import OpenAI

# Enable auto-instrumentation
tracium.trace()

# Use OpenAI normally - all calls are traced
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)

What Gets Captured

For each OpenAI call, Tracium automatically captures:

  • Input messages - The full messages array sent to OpenAI
  • Model - The model being used (gpt-4, gpt-3.5-turbo, etc.)
  • Output - The response content
  • Token usage - Input tokens, output tokens, cached tokens
  • Latency - Time taken for the API call
  • Errors - Any errors or exceptions
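
For illustration only, a single captured record could look roughly like this; the field names below are hypothetical and do not reflect Tracium's documented schema:

# Hypothetical shape of one captured record (field names illustrative,
# not Tracium's actual schema)
captured_call = {
    "model": "gpt-4",
    "input": [{"role": "user", "content": "Hello!"}],
    "output": "Hi there! How can I help?",
    "usage": {"input_tokens": 9, "output_tokens": 8, "cached_tokens": 0},
    "latency_ms": 412,
    "error": None,
}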

Streaming Support

Streaming responses are fully supported. Tracium captures each chunk and records the complete response when the stream finishes:

import tracium
from openai import OpenAI

tracium.trace()
client = OpenAI()

# Streaming is automatically traced
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a poem about coding."}],
    stream=True
)

# Chunks are captured, full response recorded at stream end
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
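
Conceptually, recording the complete response amounts to concatenating the streamed deltas. The sketch below illustrates the idea; it is not Tracium's actual implementation:

# Illustrative only - roughly how a tracer can rebuild the full text
# from streamed deltas (assumes a fresh, unconsumed stream)
parts = []
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        parts.append(delta)
full_response = "".join(parts)  # recorded when the stream finishes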

Async Support

Async OpenAI calls are automatically traced:

import tracium
import asyncio
from openai import AsyncOpenAI

tracium.trace()
client = AsyncOpenAI()

async def generate_response(prompt: str) -> str:
    # Async calls are traced automatically
    response = await client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content

# Run async function
result = asyncio.run(generate_response("Hello, world!"))

With Manual Traces

Combine auto-instrumentation with manual traces for full control:

import tracium
from openai import OpenAI

client = tracium.init()
openai = OpenAI()

user_query = "What is our refund policy?"

def get_relevant_context(query: str) -> str:
    # Placeholder retrieval step - substitute your own retriever
    return "Refunds are accepted within 30 days of purchase."

with client.agent_trace(agent_name="qa-bot") as trace:
    # Add custom context
    with trace.span(span_type="retrieval", name="fetch_context") as span:
        span.record_input({"query": user_query})
        context = get_relevant_context(user_query)
        span.record_output({"context": context})

    # OpenAI call is auto-traced as a span within this trace
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": f"Context: {context}"},
            {"role": "user", "content": user_query}
        ]
    )

Function Calling

Function calls (tools) are captured in the trace:

import tracium
from openai import OpenAI

tracium.trace()
client = OpenAI()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                }
            }
        }
    }
]

# Function calls are captured in the trace
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in NYC?"}],
    tools=tools,
    tool_choice="auto"
)
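
On the application side, the tool call the model returns can be read from the response as usual with the OpenAI client; this is the same data that shows up in the captured trace:

# Inspect the tool call the model returned (standard OpenAI client usage)
message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    print(call.function.name)       # e.g. "get_weather"
    print(call.function.arguments)  # JSON string, e.g. '{"location": "NYC"}'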

Completions API

The legacy Completions API is also supported:

import tracium
from openai import OpenAI

tracium.trace()
client = OpenAI()

# Completions API is traced
response = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="Write a haiku about programming:",
    max_tokens=50
)

Error Handling

Errors are captured and the span is marked as failed:

import tracium
from openai import OpenAI, RateLimitError

tracium.trace()
client = OpenAI()

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except RateLimitError as e:
    # The span is automatically marked as failed with error details
    # including error type, message, and traceback
    handle_rate_limit(e)