Skip to content

Python: Capture context provider instructions in agent telemetry#6515

Open
eavanvalkenburg wants to merge 3 commits into
microsoft:mainfrom
eavanvalkenburg:fix_instructions_otel
Open

Python: Capture context provider instructions in agent telemetry#6515
eavanvalkenburg wants to merge 3 commits into
microsoft:mainfrom
eavanvalkenburg:fix_instructions_otel

Conversation

@eavanvalkenburg

Copy link
Copy Markdown
Member

Motivation & Context

AgentTelemetryLayer captured gen_ai.system_instructions before context providers ran, so the invoke_agent span only reported base instructions and missed instructions added through ContextProvider.before_run().

Fixes #5291

Description & Review Guide

  • What are the major changes?
    • Track the active invoke_agent span while agent telemetry is running.
    • Let agent-owned chat telemetry update that span with the final system instructions once context providers have contributed instructions.
    • Add regression coverage for non-streaming, streaming, and unrelated nested chat calls.
  • What is the impact of these changes?
    • Agent telemetry now reports the same final system instructions sent to the chat client, including context-provider additions.
    • Unrelated nested chat telemetry is guarded from overwriting the agent span instructions.
  • What do you want reviewers to focus on?
    • The active agent span guard in observability.py, especially the parent/session checks that limit updates to agent-owned chat spans.

Related Issue

Fixes #5291

Contribution Checklist

  • The code builds clean without any errors or warnings
  • All unit tests pass, and I have added new tests where possible
  • The PR follows the Contribution Guidelines
  • This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
  • This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

Copilot AI review requested due to automatic review settings June 15, 2026 11:06
@github-actions

github-actions Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/core/agent_framework
   observability.py8996792%408, 410–411, 414, 417, 420–421, 426–427, 433–434, 440–441, 448, 450–452, 455–457, 462–463, 469–470, 476–477, 484, 661–662, 861, 865–867, 869, 873–874, 878, 916, 918, 929–931, 933–935, 939, 947, 1071–1072, 1307, 1574–1575, 1809, 1850–1851, 2000, 2306, 2309, 2321, 2338, 2342–2343, 2346, 2352, 2461, 2678, 2680
TOTAL39998451588% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
7979 34 💤 0 ❌ 0 🔥 2m 4s ⏱️

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated Code Review

Reviewers: 5 | Confidence: 89%

✓ Correctness

The PR correctly captures final system instructions (including context provider additions) on the agent span by tracking the active agent span via a ContextVar and updating it from the chat telemetry layer when the chat span is agent-owned. The parent-span guard and session-based heuristic correctly prevent unrelated nested chat calls from overwriting agent instructions. Token resets are placed in all critical paths (exception handlers and finally blocks). No correctness issues found.

✓ Security Reliability

The PR correctly implements context-provider instruction capture in agent telemetry. The parent span guard in _capture_active_agent_system_instructions properly prevents unrelated nested chat calls from overwriting agent span instructions by verifying both span_id and trace_id lineage. Context var lifecycle management (set/reset) is consistent with the existing patterns for INNER_RESPONSE_TELEMETRY_CAPTURED_FIELDS and INNER_ACCUMULATED_USAGE across both streaming and non-streaming paths. Instructions are only captured under the SENSITIVE_DATA_ENABLED guard, and the is_agent_chat_call check using session presence provides an appropriate trust boundary. No security or reliability issues found.

✓ Test Coverage

The PR adds three test scenarios (parametrized streaming/non-streaming) that validate context-provider instructions are captured on the agent span and that unrelated nested chat calls don't overwrite them. The assertions are meaningful and specific. However, the test_agent_instructions_not_overwritten_by_unrelated_nested_chat test only exercises the is_agent_chat_call=False early-return guard because the nested client call doesn't pass a session kwarg. The parent-span-ID comparison logic at lines 2318-2325 of observability.py (which the PR author specifically flagged for reviewer attention) has no test coverage. This is noteworthy because that guard protects against a realistic scenario (e.g., a sub-agent with its own session running inside an outer agent).

✓ Failure Modes

The PR correctly manages the ACTIVE_AGENT_SPAN context variable lifecycle across both streaming and non-streaming agent invocation paths. The streaming path properly resets in both the early-failure except block (line 1818) and the _finalize_stream finally block (line 1860). The non-streaming path uses a clean try/finally at line 1920-1921. The _capture_active_agent_system_instructions function has robust guarding: is_agent_chat_call check, is_recording() check (defends against stale/ended spans), and parent span ID/trace ID validation to prevent unrelated nested chat calls from overwriting. The weakref.finalize GC path only calls _close_span without resetting the context var, but is_recording() returns False after span.end(), preventing any incorrect updates in that edge case.

✗ Design Approach

I found one design issue in the new agent-span update guard: it infers “this chat belongs to the agent’s primary model call” from the presence of a session, which is broader than the framework’s actual session usage and leaves a real overwrite hole for session-aware nested chats launched from context providers.

Flagged Issues

  • observability.py lines 1502 and 2308 treat any session-bearing direct-child chat span as agent-owned. Because ContextProvider.after_run() receives the live session (_agents.py:473-481), a nested chat call from after_run that forwards that session will satisfy is_agent_chat_call and can overwrite the agent span's instructions—contradicting the intent validated by test_agent_instructions_not_overwritten_by_unrelated_nested_chat. An explicit provenance marker on the agent's primary chat invocation is needed rather than inferring ownership from session presence.

Automated review by eavanvalkenburg's agents

@github-actions

Copy link
Copy Markdown
Contributor

Flagged issue

observability.py lines 1502 and 2308 treat any session-bearing direct-child chat span as agent-owned. Because ContextProvider.after_run() receives the live session (_agents.py:473-481), a nested chat call from after_run that forwards that session will satisfy is_agent_chat_call and can overwrite the agent span's instructions—contradicting the intent validated by test_agent_instructions_not_overwritten_by_unrelated_nested_chat. An explicit provenance marker on the agent's primary chat invocation is needed rather than inferring ownership from session presence.


Source: automated DevFlow PR review

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Comment thread python/packages/core/agent_framework/observability.py Outdated
@eavanvalkenburg

Copy link
Copy Markdown
Member Author

Updated based on the review feedback. The session-based ownership heuristic was too broad, so I removed both the explicit active-span tracking and the attempted provenance marker. The updated implementation now uses trace.get_current_span() and only updates the current parent invoke_agent span when the chat instructions preserve/extend the instructions already captured on that agent span. I also strengthened the nested-chat regression to pass the live session from after_run, covering the overwrite case called out by the automated review.

@moonbox3 moonbox3 added this pull request to the merge queue Jun 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 16, 2026
@eavanvalkenburg eavanvalkenburg added this pull request to the merge queue Jun 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 16, 2026
@eavanvalkenburg eavanvalkenburg force-pushed the fix_instructions_otel branch from 5352372 to 9141d16 Compare June 16, 2026 15:07
eavanvalkenburg and others added 3 commits June 16, 2026 17:13
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@eavanvalkenburg eavanvalkenburg force-pushed the fix_instructions_otel branch from 9141d16 to 67e68cc Compare June 16, 2026 15:14
@eavanvalkenburg eavanvalkenburg added this pull request to the merge queue Jun 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 16, 2026
@eavanvalkenburg eavanvalkenburg added this pull request to the merge queue Jun 16, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Jun 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: AgentTelemetryLayer records gen_ai.system_instructions before ContextProvider.before_run() extends them

4 participants