Python: Capture context provider instructions in agent telemetry#6515
Python: Capture context provider instructions in agent telemetry#6515eavanvalkenburg wants to merge 3 commits into
Conversation
Python Test Coverage Report •
Python Unit Test Overview
|
||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Automated Code Review
Reviewers: 5 | Confidence: 89%
✓ Correctness
The PR correctly captures final system instructions (including context provider additions) on the agent span by tracking the active agent span via a ContextVar and updating it from the chat telemetry layer when the chat span is agent-owned. The parent-span guard and session-based heuristic correctly prevent unrelated nested chat calls from overwriting agent instructions. Token resets are placed in all critical paths (exception handlers and finally blocks). No correctness issues found.
✓ Security Reliability
The PR correctly implements context-provider instruction capture in agent telemetry. The parent span guard in
_capture_active_agent_system_instructionsproperly prevents unrelated nested chat calls from overwriting agent span instructions by verifying both span_id and trace_id lineage. Context var lifecycle management (set/reset) is consistent with the existing patterns forINNER_RESPONSE_TELEMETRY_CAPTURED_FIELDSandINNER_ACCUMULATED_USAGEacross both streaming and non-streaming paths. Instructions are only captured under theSENSITIVE_DATA_ENABLEDguard, and theis_agent_chat_callcheck using session presence provides an appropriate trust boundary. No security or reliability issues found.
✓ Test Coverage
The PR adds three test scenarios (parametrized streaming/non-streaming) that validate context-provider instructions are captured on the agent span and that unrelated nested chat calls don't overwrite them. The assertions are meaningful and specific. However, the
test_agent_instructions_not_overwritten_by_unrelated_nested_chattest only exercises theis_agent_chat_call=Falseearly-return guard because the nested client call doesn't pass asessionkwarg. The parent-span-ID comparison logic at lines 2318-2325 of observability.py (which the PR author specifically flagged for reviewer attention) has no test coverage. This is noteworthy because that guard protects against a realistic scenario (e.g., a sub-agent with its own session running inside an outer agent).
✓ Failure Modes
The PR correctly manages the ACTIVE_AGENT_SPAN context variable lifecycle across both streaming and non-streaming agent invocation paths. The streaming path properly resets in both the early-failure except block (line 1818) and the _finalize_stream finally block (line 1860). The non-streaming path uses a clean try/finally at line 1920-1921. The _capture_active_agent_system_instructions function has robust guarding: is_agent_chat_call check, is_recording() check (defends against stale/ended spans), and parent span ID/trace ID validation to prevent unrelated nested chat calls from overwriting. The weakref.finalize GC path only calls _close_span without resetting the context var, but is_recording() returns False after span.end(), preventing any incorrect updates in that edge case.
✗ Design Approach
I found one design issue in the new agent-span update guard: it infers “this chat belongs to the agent’s primary model call” from the presence of a session, which is broader than the framework’s actual session usage and leaves a real overwrite hole for session-aware nested chats launched from context providers.
Flagged Issues
- observability.py lines 1502 and 2308 treat any session-bearing direct-child chat span as agent-owned. Because ContextProvider.after_run() receives the live session (_agents.py:473-481), a nested chat call from after_run that forwards that session will satisfy is_agent_chat_call and can overwrite the agent span's instructions—contradicting the intent validated by test_agent_instructions_not_overwritten_by_unrelated_nested_chat. An explicit provenance marker on the agent's primary chat invocation is needed rather than inferring ownership from session presence.
Automated review by eavanvalkenburg's agents
|
Flagged issue observability.py lines 1502 and 2308 treat any session-bearing direct-child chat span as agent-owned. Because ContextProvider.after_run() receives the live session (_agents.py:473-481), a nested chat call from after_run that forwards that session will satisfy is_agent_chat_call and can overwrite the agent span's instructions—contradicting the intent validated by test_agent_instructions_not_overwritten_by_unrelated_nested_chat. An explicit provenance marker on the agent's primary chat invocation is needed rather than inferring ownership from session presence. Source: automated DevFlow PR review |
|
Updated based on the review feedback. The session-based ownership heuristic was too broad, so I removed both the explicit active-span tracking and the attempted provenance marker. The updated implementation now uses |
5352372 to
9141d16
Compare
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
9141d16 to
67e68cc
Compare
Motivation & Context
AgentTelemetryLayercapturedgen_ai.system_instructionsbefore context providers ran, so theinvoke_agentspan only reported base instructions and missed instructions added throughContextProvider.before_run().Fixes #5291
Description & Review Guide
invoke_agentspan while agent telemetry is running.observability.py, especially the parent/session checks that limit updates to agent-owned chat spans.Related Issue
Fixes #5291
Contribution Checklist
breaking changelabel (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.