AI Agent Observability Strategy for Google ADK

When a Google ADK agent fails in production, you have three minutes to answer "what happened?" before the user moves on or the on-call escalates. Without proper observability, those three minutes turn into three days of print() archaeology.

This page maps the four observability primitives in the official ADK docs — logging, traces, metrics, and callbacks — to TraptureIQ modules.

The four ADK observability primitives

Primitive	What it tells you	ADK reference
Logging	What the agent did, in chronological order	adk docs / logging
Traces	The decision path: which sub-agents, tools, and LLM calls fired, with timing	adk docs / traces
Metrics	Aggregates: error rate, latency p95, token usage	adk docs / metrics
Callbacks	Hook into lifecycle events for custom instrumentation	adk docs / callbacks

ADK emits structured OpenTelemetry GenAI signals by default. TraptureIQ consumes these and presents them in dedicated modules.

Checklist

1. Set logging levels intentionally

In ADK, use:

DEBUG — full prompts and responses (development only — bloats logs and risks logging PII)
INFO — lifecycle events (recommended default for production)
WARNING — something went wrong but the agent recovered

View structured logs in TraptureIQ's Logs module. Use the filter sidebar to scope by agent, session, severity, and timestamp.
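The level policy above can be sketched with Python's standard logging module. This is a minimal sketch, not ADK's own configuration API: it assumes ADK's loggers live under the "google_adk" namespace, and the APP_ENV environment variable and the configure_adk_logging helper are hypothetical names introduced for illustration.

```python
import logging
import os

# Assumption: ADK's loggers are children of the "google_adk" logger.
ADK_LOGGER_NAME = "google_adk"


def configure_adk_logging(environment: str) -> logging.Logger:
    """Use DEBUG only in development; everything else gets INFO."""
    level = logging.DEBUG if environment == "development" else logging.INFO
    # Attach a root handler once so records are actually written somewhere.
    logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s %(message)s")
    adk_logger = logging.getLogger(ADK_LOGGER_NAME)
    adk_logger.setLevel(level)
    return adk_logger


# APP_ENV is a hypothetical variable; an unset value falls back to production.
logger = configure_adk_logging(os.environ.get("APP_ENV", "production"))
```

Defaulting to INFO when the environment is unknown means a misconfigured deployment fails safe instead of logging full prompts.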

2. Use traces, not logs, for "which path did it take?"

Logs are a flat stream of events. Traces are a tree of spans showing causality: which sub-agent invoked which tool, and which tool called which LLM. Use Traces for any incident involving multi-step reasoning or sub-agents.
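To make the tree idea concrete, here is a small sketch of walking a span tree to find the failing path. The Span class is a hypothetical, simplified stand-in and not ADK's or OpenTelemetry's span type; it also assumes a failure is marked on every ancestor of the span that failed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical simplified span: name, duration, status, child spans."""
    name: str
    duration_ms: float
    ok: bool = True
    children: List["Span"] = field(default_factory=list)


def failing_path(span: Span) -> Optional[List[str]]:
    """Return span names from this span down to the deepest failed span."""
    if span.ok:
        return None
    for child in span.children:
        path = failing_path(child)
        if path is not None:
            return [span.name] + path
    # No failed child: this span is where the failure originated.
    return [span.name]


trace = Span("root_agent", 4200, ok=False, children=[
    Span("billing_sub_agent", 3900, ok=False, children=[
        Span("tool:lookup_invoice", 3500, ok=False),
        Span("llm:generate", 350),
    ]),
    Span("llm:plan", 250),
])

print(" -> ".join(failing_path(trace)))
# prints: root_agent -> billing_sub_agent -> tool:lookup_invoice
```

A flat log of the same run would contain all five events, but the parent-child links are what identify the tool call as the origin of the failure.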

3. Wire lifecycle callbacks for cross-cutting concerns

ADK's callbacks let you hook the before/after of model, tool, and agent invocations. Use them — not the agent's main code — for:

PII redaction before sending to Gemini — before_model_callback
Caching expensive tool results — before_tool_callback
Custom metrics emission — after_model_callback
Authorization gates — before_agent_callback

Code that lives in callbacks stays out of your agent's prompt and decision-making logic.
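As an illustration of the first item, here is a minimal sketch of a redaction hook in the shape of a before_model_callback. It assumes the callback receives (callback_context, llm_request), that the request exposes contents[].parts[].text, and that returning None lets the model call proceed. The email-only pattern and the helper names are hypothetical, and the objects are duck-typed so the sketch runs without ADK installed.

```python
import re
from types import SimpleNamespace
from typing import Any, Optional

# Illustrative pattern only; real PII redaction needs far broader coverage.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub("[REDACTED_EMAIL]", text)


def redact_pii_before_model(callback_context: Any, llm_request: Any) -> Optional[Any]:
    """Scrub text parts in place before the request reaches the model."""
    for content in llm_request.contents or []:
        for part in content.parts or []:
            if getattr(part, "text", None):
                part.text = redact_emails(part.text)
    # Assumption: None means "continue with the (now modified) request".
    return None


# Stand-in request for a local check; a real run would receive ADK's object.
fake_request = SimpleNamespace(
    contents=[SimpleNamespace(parts=[SimpleNamespace(text="Reach me at jane@example.com")])]
)
redact_pii_before_model(None, fake_request)
print(fake_request.contents[0].parts[0].text)
# prints: Reach me at [REDACTED_EMAIL]
```

Under those assumptions, the function would be passed as before_model_callback when the agent is constructed, so the agent's instructions and tools never need to mention redaction.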

4. Build a per-agent observability dashboard

For every production agent, pin these in your team's view:

Error rate — should sit under 1 % steady-state
Latency p95 — Gemini Pro typically 2-6 s; Flash 0.5-2 s
Token usage trend — a sustained rise usually means context compaction is not enabled or the prompt has grown
Session abandonment — % of sessions ending mid-conversation

The Analytics Dashboard and Agent Intelligence modules give you all four out of the box.
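For teams that want to sanity-check these numbers outside the dashboard, three of the four can be computed from per-session summaries. This is a rough sketch: SessionRecord and its field names are hypothetical, not a TraptureIQ or ADK export format, and the percentile uses the nearest-rank method.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SessionRecord:
    """Hypothetical per-session summary; field names are illustrative."""
    latency_s: float
    errored: bool
    abandoned: bool


def p95(values: List[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty list."""
    if not values:
        raise ValueError("p95 requires at least one value")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def dashboard_summary(records: List[SessionRecord]) -> Dict[str, float]:
    """Compute error rate, latency p95, and abandonment rate for one agent."""
    if not records:
        raise ValueError("dashboard_summary requires at least one record")
    total = len(records)
    return {
        "error_rate": sum(r.errored for r in records) / total,
        "latency_p95_s": p95([r.latency_s for r in records]),
        "abandonment_rate": sum(r.abandoned for r in records) / total,
    }


# Twenty synthetic sessions with latencies of 1..20 seconds.
records = [
    SessionRecord(latency_s=float(i), errored=(i == 20), abandoned=(i % 10 == 0))
    for i in range(1, 21)
]
summary = dashboard_summary(records)
print(summary)
```

Token usage trend is left out because it needs a time series rather than a single aggregate.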

5. Drill from analytics → sessions → traces

The investigation flow:

Spot an anomaly in the Analytics dashboard (latency spike, error spike)
Filter Sessions by the same time window and agent → find affected sessions
Open a Trace for one bad session → see exactly which span failed
Open Logs scoped to that session → read the structured payload

This four-click path is the difference between a five-minute incident review and a two-hour one.

Anti-patterns

Logging at DEBUG in production — Floods storage, slows ingest, and risks leaking PII into logs. Use INFO.
print() instead of structured logging — Loses the ability to filter by agent, session, severity.
Relying on traces alone with no metrics — Traces tell you about one request; metrics tell you about your fleet.
No callbacks — Means PII filtering, auth, and metrics live in the agent's main code, polluting prompts and complicating debugging.
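For the print() anti-pattern, a minimal sketch of the structured alternative using only the standard library. The logger name, the agent and session_id field names, and the JSON layout are assumptions for illustration, not a format required by ADK or TraptureIQ.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with filterable fields."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": record.created,
            "severity": record.levelname,
            "message": record.getMessage(),
            # Populated via the `extra` argument on each logging call.
            "agent": getattr(record, "agent", None),
            "session_id": getattr(record, "session_id", None),
        })


# A StringIO stream keeps the example self-contained; use stderr or a file in practice.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

log = logging.getLogger("support_agent_app")  # hypothetical application logger
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False  # avoid duplicate output through the root logger

log.info("tool call finished", extra={"agent": "billing_agent", "session_id": "s-123"})
entry = json.loads(stream.getvalue())
print(entry["agent"], entry["session_id"], entry["severity"])
# prints: billing_agent s-123 INFO
```

Because agent, session, and severity are separate fields rather than text inside a message, a log viewer can filter on them directly.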

Where to configure

Live event stream → Logs
Per-session trace tree → Traces
Conversation playback → Sessions
Fleet-wide aggregates → Analytics
Agent-specific drill-in → Agent Intelligence
