Traces

A trace represents a complete execution flow in your AI application.

What is a Trace?

A trace captures the entire lifecycle of a request or operation in your application. It contains one or more spans that represent individual steps or operations within that request.

Trace: support-bot (ID: abc123)
    Span: fetch_context (retrieval)
    Span: generate_response (llm)
    Span: format_output (tool)
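
The relationship between a trace and its spans can be pictured with a small data model. This is only an illustrative sketch in plain Python; the class names SketchTrace and SketchSpan are hypothetical and are not part of the Tracium SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SketchSpan:
    """One step inside a trace (hypothetical illustration)."""
    name: str
    span_type: str  # e.g. "retrieval", "llm", "tool"


@dataclass
class SketchTrace:
    """A container for the spans of one request (hypothetical illustration)."""
    agent_name: str
    trace_id: str
    spans: List[SketchSpan] = field(default_factory=list)

    def add_span(self, name: str, span_type: str) -> SketchSpan:
        span = SketchSpan(name=name, span_type=span_type)
        self.spans.append(span)
        return span


# Rebuild the example shown above: one trace holding three spans.
sketch = SketchTrace(agent_name="support-bot", trace_id="abc123")
sketch.add_span("fetch_context", "retrieval")
sketch.add_span("generate_response", "llm")
sketch.add_span("format_output", "tool")

for s in sketch.spans:
    print(f"{sketch.agent_name}/{s.name} ({s.span_type})")
```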

Creating a Trace

Use agent_trace as a context manager to create a trace:

import tracium

client = tracium.init()

# Using context manager (recommended)
with client.agent_trace(
    agent_name="support-bot",
    model_id="gpt-4",
    tags=["production", "support"],
    metadata={"customer_tier": "premium"}
) as trace:
    # Your code here
    # All spans created inside are part of this trace
    pass

Trace Properties

Property     Type               Description
agent_name   str                Required. Identifies your agent/service
model_id     str | None         LLM model being used (e.g., "gpt-4")
tags         list[str] | None   Tags for filtering and grouping traces
metadata     dict | None        Additional contextual data
trace_id     str | None         Custom trace ID (auto-generated if not provided)
version      str | None         Version of your application
lazy_start   bool               Delay trace creation until first span (default: False)

Trace Handle Methods

The trace handle returned by the context manager exposes the following properties and methods:

with client.agent_trace(agent_name="my-agent") as trace:
    # Access trace properties
    print(f"Trace ID: {trace.id}")
    print(f"Agent: {trace.agent_name}")

    # Add tags during execution
    trace.add_tags(["urgent", "high-priority"])

    # Set a summary when done
    trace.set_summary({
        "total_tokens": 500,
        "outcome": "success"
    })

    # Create spans (covered in Spans documentation)
    with trace.span(span_type="llm", name="completion") as span:
        pass

    # Mark as failed if needed
    # trace.mark_failed("Error message")

Automatic Context Propagation

Tracium automatically propagates trace context across threads and async boundaries. This happens transparently when you use ThreadPoolExecutor or async/await:

import tracium
from concurrent.futures import ThreadPoolExecutor

client = tracium.init()

def process_item(item):
    # This function has access to the parent trace context
    # Any spans created here are part of the parent trace
    with tracium.span(span_type="tool", name="process") as span:
        span.record_input({"item": item})
        result = do_processing(item)
        span.record_output({"result": result})
        return result

with client.agent_trace(agent_name="batch-processor") as trace:
    items = ["a", "b", "c"]

    # Context is automatically propagated to threads
    with ThreadPoolExecutor(max_workers=3) as executor:
        results = list(executor.map(process_item, items))
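
For readers curious how this kind of propagation can work in general, here is a minimal sketch using only the standard library's contextvars module. It is not Tracium's implementation. The variable current_trace_id and the helper map_with_context are hypothetical names chosen for illustration. The idea is to snapshot the caller's context for each task and run the worker function inside that snapshot.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for "the currently active trace ID".
current_trace_id = contextvars.ContextVar("current_trace_id", default=None)


def process_item(item):
    # Reads whatever trace ID is visible in the context this call runs in.
    return (item, current_trace_id.get())


def map_with_context(executor, fn, items):
    """Run fn over items in worker threads, each inside a copy of the caller's context."""
    futures = []
    for item in items:
        # One snapshot per task: a Context object cannot be entered
        # by two threads at the same time.
        ctx = contextvars.copy_context()
        futures.append(executor.submit(ctx.run, fn, item))
    return [f.result() for f in futures]


current_trace_id.set("abc123")
with ThreadPoolExecutor(max_workers=3) as executor:
    propagated = map_with_context(executor, process_item, ["a", "b", "c"])

print(propagated)
```

With asyncio, tasks receive a copy of the current context when they are created, so a context variable set before creating a task is visible inside it without extra work.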

Lazy Start

Use lazy_start=True to defer trace creation until the first span is recorded. This is useful when the decision to trace depends on something known only at runtime:

with client.agent_trace(
    agent_name="conditional-agent",
    lazy_start=True  # Trace won't be created until first span
) as trace:
    if should_process(request):
        # Only now is the trace actually created
        with trace.span(span_type="llm") as span:
            result = process_request(request)
    # If no spans were created, no trace is sent
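
The deferred-creation behavior described above can be sketched in plain Python. This is a conceptual illustration only, under the assumption that "creating" a trace means allocating a record on the first span. The LazyTrace class and its flush method are hypothetical and do not reflect the SDK's internals.

```python
class LazyTrace:
    """Hypothetical illustration: the trace record is created on the first span."""

    def __init__(self, agent_name):
        self.agent_name = agent_name
        self.started = False
        self.spans = []

    def span(self, name):
        if not self.started:
            # A real SDK would create the trace at this point.
            self.started = True
        self.spans.append(name)
        return name

    def flush(self):
        # Nothing is sent if no span ever started the trace.
        if not self.started:
            return None
        return {"agent_name": self.agent_name, "spans": list(self.spans)}


# No span recorded: nothing to send.
skipped = LazyTrace("conditional-agent")
skipped_payload = skipped.flush()

# One span recorded: the trace exists and carries that span.
used = LazyTrace("conditional-agent")
used.span("completion")
used_payload = used.flush()
```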

Error Handling

Traces automatically capture exceptions and mark themselves as failed:

with client.agent_trace(agent_name="error-handler") as trace:
    try:
        # If an exception occurs, the trace is automatically marked failed
        result = risky_operation()
    except SpecificError as e:
        # You can also manually mark as failed
        trace.mark_failed(f"Operation failed: {e}")
        # Handle the error
        raise
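
As a rough mental model, automatic failure marking is what a context manager's __exit__ hook makes possible: it sees any exception leaving the with block and can record it before letting it propagate. The sketch below illustrates that pattern with a hypothetical FailureAwareTrace class. The status values and the message format are assumptions, not the SDK's actual behavior.

```python
class FailureAwareTrace:
    """Hypothetical illustration of marking a trace failed when the block raises."""

    def __init__(self, agent_name):
        self.agent_name = agent_name
        self.status = "running"
        self.error = None

    def mark_failed(self, message):
        self.status = "failed"
        self.error = message

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # Keep a message set manually inside the block, if there is one.
            if self.status != "failed":
                self.mark_failed(f"{exc_type.__name__}: {exc}")
        elif self.status == "running":
            self.status = "completed"
        return False  # never swallow the application's exception


try:
    with FailureAwareTrace("error-handler") as failed_trace:
        raise ValueError("boom")
except ValueError:
    pass  # the exception still reaches the caller

with FailureAwareTrace("error-handler") as ok_trace:
    pass
```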

Fail-Safe Design

Important: All Tracium operations are designed to be fail-safe. If tracing fails for any reason, your application code continues to run normally. SDK errors are logged but never propagate to your application.

# Even if Tracium has issues, your code runs normally
with client.agent_trace(agent_name="resilient-agent") as trace:
    # If the API is unreachable, this still executes
    result = your_important_function()
    # Tracing failures are logged but don't break your app
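
One common way to get this kind of isolation is to wrap every telemetry call so that exceptions are logged rather than raised. The following is a minimal sketch of that pattern using only the standard library. The fail_safe decorator and the send_trace function are hypothetical names, and the sketch makes no claim about how the SDK implements its guarantee.

```python
import functools
import logging

logger = logging.getLogger("tracing_sketch")


def fail_safe(fn):
    """Log any exception raised by a telemetry call instead of propagating it."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return fn(*args, **kwargs)
        except Exception:
            logger.exception("tracing call %s failed", fn.__name__)
            return None
    return wrapper


@fail_safe
def send_trace(payload):
    # Hypothetical export step that fails, e.g. because the API is unreachable.
    raise ConnectionError("API unreachable")


def your_important_function():
    return 42


# The export fails and is logged, but the application code still runs.
send_result = send_trace({"agent_name": "resilient-agent"})
result = your_important_function()
```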