Python: [BREAKING] Python: Fix orchestration outputs so as_agent() returns the final answer only. Align other orchestration outputs. by moonbox3 · Pull Request #5301 · microsoft/agent-framework

moonbox3 · 2026-04-16T06:28:42Z

Motivation and Context

Important

Breaking changes are scoped to the still-experimental agent-framework-orchestrations package. Core agent-framework workflow code is non-breaking — the new AgentExecutor.emit_data_events kwarg defaults to False (no behavior change), and WorkflowAgent additively consumes data events it previously ignored.

workflow.as_agent().run(prompt) for SequentialBuilder returned the user's input plus every agent's reply instead of the last agent's answer. The same class of bug affected Concurrent, GroupChat, Magentic, and Handoff: they all yielded list[Message] conversation dumps as output events, and intermediate agents emitted output events indistinguishable from the final answer when intermediate_outputs=True.

An ADR (#4799) proposed adding an is_run_completed flag on WorkflowEvent to discriminate final from intermediate outputs. This PR takes a simpler path that uses primitives the framework already has — no new event types or discriminator flags.

Design

Two layers of discrimination, each at the right boundary:

Event layer (workflow consumers): output events carry the workflow's final answer; data events carry intermediate participant activity. Orchestrations restrict output_executors to the terminator, so only the answer surfaces on the output channel. Intermediate participants surface (when intermediate_outputs=True) on the data channel via AgentExecutor(emit_data_events=True), which mirrors yield_output to WorkflowEvent.emit. Workflow consumers branch on event.type.
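
A minimal sketch of how a workflow consumer might branch on event.type under this contract. StubEvent and split_events are hypothetical stand-ins written for illustration, not framework classes; only the "output" and "data" type values come from the description above.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class StubEvent:
    """Hypothetical stand-in for a workflow event; only type and data are modeled."""
    type: str
    data: Any


def split_events(events):
    """Separate the final answer (output events) from intermediate activity (data events)."""
    answers, intermediate = [], []
    for event in events:
        if event.type == "output":
            answers.append(event.data)
        elif event.type == "data":
            intermediate.append(event.data)
    return answers, intermediate


stream = [
    StubEvent("data", "writer draft"),
    StubEvent("data", "reviewer notes"),
    StubEvent("output", "final answer"),
]
answers, intermediate = split_events(stream)
print(answers)
print(intermediate)
```

With intermediate_outputs left off, the same loop would simply see no data events, so the consumer code does not need a separate path.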

Content layer (as_agent() consumers): WorkflowAgent rewrites text content from data events as text_reasoning Content blocks before merging into the final AgentResponse. Consumers iterate msg.contents and branch on content.type — the same rendering path they already use for Claude thinking and OpenAI reasoning. No new field on Message/AgentResponse/WorkflowEvent.
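
The content-layer rule can be sketched as follows. StubContent, rewrite_text_to_reasoning, and render are illustrative assumptions, not the framework's implementation; the sketch only shows the idea of retagging intermediate text as text_reasoning and letting the consumer branch on content.type.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StubContent:
    """Hypothetical content block with a type tag and a text payload."""
    type: str
    text: str


def rewrite_text_to_reasoning(contents):
    """Return new blocks with 'text' retagged as 'text_reasoning'; other types pass through."""
    return [replace(c, type="text_reasoning") if c.type == "text" else c for c in contents]


def render(contents):
    """Consumer-side rendering that branches on content.type."""
    lines = []
    for c in contents:
        if c.type == "text_reasoning":
            lines.append(f"[reasoning] {c.text}")
        elif c.type == "text":
            lines.append(c.text)
    return lines


intermediate = rewrite_text_to_reasoning([StubContent("text", "draft")])
final = [StubContent("text", "answer")]
rendered = render(intermediate + final)
print(rendered)
```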

Per-orchestration contract

Orchestration	Terminal `output` event
Sequential (agent terminator)	Last agent's `AgentResponse` (streaming chunks — the last `AgentExecutor` itself is the workflow's output executor)
Sequential (custom-executor terminator)	The executor's `list[Message]` (unchanged)
Concurrent (default aggregator)	`AgentResponse` with one assistant message per participant (no user prompt prepended)
Concurrent (custom aggregator)	Whatever the aggregator yields
GroupChat	`AgentResponse` with the orchestrator's completion message
Magentic	`AgentResponse` with the manager's synthesized final answer
Handoff	No terminal yield — per-agent responses already surface as `output` events as agents speak

Behavior changes (orchestrations package)

workflow.as_agent().run(prompt) now returns only the meaningful answer for each orchestration, matching the agent contract.
ConcurrentBuilder default aggregator output changed from list[Message] (user + per-agent) to AgentResponse (per-agent only). Callers consuming output_events[0].data will see an AgentResponse instead of a list.
SequentialBuilder, GroupChat, and Magentic terminal output changed from list[Message] (full conversation) to AgentResponse (answer only).
Handoff no longer emits a synthetic terminal event. The last agent's output event is the end of the stream.
intermediate_outputs=True no longer floods the output channel — intermediate participants surface on the data channel and arrive at as_agent() consumers as text_reasoning content.
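
For callers affected by the ConcurrentBuilder change, one possible migration shim is sketched below. StubMessage and StubAgentResponse are hypothetical stand-ins, and the messages attribute on the response is an assumption made for illustration; check the real AgentResponse type before relying on it.

```python
from dataclasses import dataclass


@dataclass
class StubMessage:
    """Hypothetical message with a role and text."""
    role: str
    text: str


@dataclass
class StubAgentResponse:
    """Hypothetical response wrapper; the messages attribute is an assumption."""
    messages: list


def assistant_texts(payload):
    """Accept either the old list-of-messages shape or the new response shape."""
    messages = payload if isinstance(payload, list) else payload.messages
    return [m.text for m in messages if m.role == "assistant"]


old_shape = [StubMessage("user", "q"), StubMessage("assistant", "a1"), StubMessage("assistant", "a2")]
new_shape = StubAgentResponse([StubMessage("assistant", "a1"), StubMessage("assistant", "a2")])
print(assistant_texts(old_shape))
print(assistant_texts(new_shape))
```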

Review feedback addressed

Flag renamed emit_intermediate_data → emit_data_events (mechanically accurate; orchestration semantics live at the orchestration layer).
data events are not "re-introducing the removed AgentResponsesEvent" — they're the unified event model's documented intermediate-data channel (WorkflowEvent.emit). The new behavior is the content rewrite at the WorkflowAgent boundary, not a new event class.
Mutation safety: the streaming branch now constructs fresh AgentResponseUpdate instances instead of mutating shared payloads (regression test added).
Concurrent duplication: emit_data_events is suppressed when the default aggregator is in use; only enabled with custom aggregators where intermediate visibility is genuinely separate from the aggregator's transformed answer.
HIL terminator: SequentialBuilder now correctly recognizes AgentApprovalExecutor as a valid terminator and routes its inner output to the parent via allow_direct_output=True.
HIL data forwarding: AgentApprovalExecutor now forwards inner data events through to the parent so wrapped intermediate participants don't silently drop their reasoning.
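
The mutation-safety point above can be illustrated with a small sketch. StubUpdate and as_reasoning_update are hypothetical names; the technique shown is simply building a fresh instance with dataclasses.replace instead of editing a payload that other consumers may still hold.

```python
from dataclasses import dataclass, field, replace


@dataclass
class StubUpdate:
    """Hypothetical streaming update holding (kind, text) content pairs."""
    contents: list = field(default_factory=list)


def as_reasoning_update(update):
    """Build a fresh update with text retagged, leaving the shared one untouched."""
    retagged = [
        ("text_reasoning", text) if kind == "text" else (kind, text)
        for kind, text in update.contents
    ]
    return replace(update, contents=retagged)


shared = StubUpdate(contents=[("text", "partial")])
converted = as_reasoning_update(shared)
print(shared.contents)
print(converted.contents)
```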

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines
All unit tests pass, and I have added new tests where possible
Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

moonbox3 · 2026-04-16T06:31:17Z

Python Test Coverage Report

File	Stmts	Miss	Cover	Missing
packages/core/agent_framework/_workflows
_agent.py	380	73	80%	66, 74–80, 116–117, 210, 271, 284, 353, 364, 366, 426, 432, 442–443, 450, 452, 458, 520–521, 530, 544, 570, 572, 605–607, 609, 611, 613, 618, 623, 684, 714, 716, 733, 772–775, 781, 787, 791–792, 795–801, 805–806, 814, 921, 928, 934–935, 946, 978, 985, 1006, 1015, 1019, 1021–1023, 1030
_agent_executor.py	206	17	91%	171, 197, 238, 262, 282–283, 363–365, 367, 377–378, 425, 501–502, 574, 580
_workflow_executor.py	190	31	83%	94, 404, 464, 487, 489, 497–498, 503, 505, 510, 512, 565, 600–606, 610–612, 620, 625, 636, 646, 650, 656, 660, 670, 674
packages/orchestrations/agent_framework_orchestrations
_base_group_chat_orchestrator.py	168	12	92%	109, 277, 292, 326–328, 332, 351, 413, 459–461
_concurrent.py	144	26	81%	51, 60–61, 69–70, 89–90, 95, 113, 118, 123–124, 145, 155, 162, 230, 246, 249, 306, 336, 338–339, 341, 352, 365, 369
_group_chat.py	329	70	78%	173, 336, 343, 372, 383–384, 390, 395, 416, 420, 433–434, 447, 462–463, 465, 481, 508–513, 515, 549–552, 554, 559–563, 651, 654, 693, 696, 699, 702, 710, 722–723, 725–726, 728–729, 731, 736, 739, 748, 754, 798–799, 803–804, 818–819, 821–822, 853–854, 920, 939, 947, 952–954, 961, 971
_handoff.py	335	52	84%	104–105, 107, 160–170, 172, 174, 176, 181, 312, 337, 364, 390, 454, 497, 505, 509–510, 541–543, 548–550, 670, 673, 686, 748, 753, 760, 770, 772, 791, 793, 875–876, 908–909, 1010, 1017, 1089–1090, 1092
_magentic.py	592	91	84%	64–73, 78, 82–93, 258, 269, 273, 293, 354, 363, 365, 407, 424, 433–434, 436–438, 440, 451, 594, 596, 636, 686, 722–724, 726, 736, 744–745, 812–815, 906, 912, 918, 960, 998, 1030, 1047, 1058, 1113–1114, 1118–1120, 1144, 1168–1169, 1182, 1198, 1220, 1265–1266, 1304–1305, 1469, 1472, 1481, 1484, 1489, 1540–1541, 1582–1583, 1631, 1661, 1719, 1733, 1744
_orchestration_request_info.py	57	0	100%
_sequential.py	92	6	93%	52, 164, 175, 181, 228, 263
TOTAL	28351	3294	88%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
5672	30 💤	0 ❌	0 🔥	1m 32s ⏱️

Copilot

Pull request overview

This PR changes Python orchestration output semantics so workflow.as_agent().run(prompt) returns only the final “answer” (as an AgentResponse) while intermediate participant activity is surfaced via data events (instead of additional output events). It adds a single new knob (AgentExecutor(emit_intermediate_data=...)) and updates orchestrations, tests, and a sample to align with the clarified contract.

Changes:

Add AgentExecutor.emit_intermediate_data to emit parallel data events for AgentResponse / AgentResponseUpdate.
Update Sequential/Concurrent/GroupChat/Magentic/Handoff orchestrations to reserve output for terminal answers and use data for intermediate agent activity.
Update orchestration tests and a sample to validate/illustrate the new terminal vs intermediate event behavior.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.

File	Description
python/samples/03-workflows/agents/sequential_workflow_as_agent.py	Updates sample to reflect `as_agent()` returning only the final agent response and notes intermediate observation via `data`.
python/packages/orchestrations/tests/test_sequential.py	Rewrites sequential tests to assert terminal `output` is last agent only and intermediate activity is `data` when enabled.
python/packages/orchestrations/tests/test_magentic.py	Updates Magentic tests to expect `AgentResponse` terminal outputs and `data` events for intermediate updates.
python/packages/orchestrations/tests/test_handoff.py	Updates Handoff tests to reflect no synthetic terminal output and per-agent outputs as the stream result.
python/packages/orchestrations/tests/test_group_chat.py	Updates GroupChat tests to expect a single terminal `AgentResponse` from the orchestrator and intermediate agent updates as `data`.
python/packages/orchestrations/tests/test_concurrent.py	Updates Concurrent tests to expect default aggregator returns `AgentResponse` with assistant messages only (no user prompt).
python/packages/orchestrations/agent_framework_orchestrations/_sequential.py	Changes sequential wiring so the last agent executor is the workflow output executor; uses `data` events for intermediate participants.
python/packages/orchestrations/agent_framework_orchestrations/_orchestration_request_info.py	Plumbs `emit_intermediate_data` through `AgentApprovalExecutor` into the inner `AgentExecutor`.
python/packages/orchestrations/agent_framework_orchestrations/_magentic.py	Changes Magentic terminal output to `AgentResponse`; adds participant `emit_intermediate_data` plumbing; restricts output executor to orchestrator.
python/packages/orchestrations/agent_framework_orchestrations/_handoff.py	Removes terminal yield; relies on per-agent output events as the observable result.
python/packages/orchestrations/agent_framework_orchestrations/_group_chat.py	Changes GroupChat completion to yield an `AgentResponse`; uses participant `emit_intermediate_data`; restricts outputs to orchestrator.
python/packages/orchestrations/agent_framework_orchestrations/_concurrent.py	Changes default aggregator to yield `AgentResponse` (assistant replies only) and uses participant `emit_intermediate_data` for intermediate observation.
python/packages/orchestrations/agent_framework_orchestrations/_base_group_chat_orchestrator.py	Updates base orchestrator termination/max-round outputs to yield `AgentResponse` completion messages.
python/packages/core/tests/workflow/test_agent_executor.py	Adds core tests asserting `emit_intermediate_data` produces `data` events in both streaming and non-streaming modes.
python/packages/core/agent_framework/_workflows/_agent_executor.py	Implements `emit_intermediate_data` by emitting `WorkflowEvent.emit(...)/type='data'` alongside existing outputs.
python/packages/core/agent_framework/_workflows/_agent.py	Updates `WorkflowAgent` to consume `data` events carrying `AgentResponse` / `AgentResponseUpdate` as part of agent responses.

moonbox3

Automated Code Review

Reviewers: 3 | Confidence: 82%

✓ Correctness

This PR refactors orchestration output from list[Message] to AgentResponse and introduces emit_intermediate_data on AgentExecutor to surface intermediate participants via data events. The core mechanics are sound: output_executors is now always set (not conditioned on intermediate_outputs), intermediate agents emit data events, and the _agent.py adapter layer correctly processes both output and data events. The existing unresolved review comments remain valid (AgentApprovalExecutor as last participant in sequential, sample file stale imports/docstrings, data-event propagation through WorkflowExecutor, and docstring ordering in concurrent aggregator). I found no additional correctness bugs beyond those already raised.

✓ Security Reliability

This PR refactors orchestration outputs from list[Message] to AgentResponse and introduces emit_intermediate_data on AgentExecutor to surface intermediate participants as data events while reserving output events for the terminal answer. The design is clean and the output_executors / data-event split is well-implemented. The key security/reliability issues already identified in previous review comments (AgentApprovalExecutor as terminal participant producing no output, docstring ordering inaccuracy, data event forwarding in WorkflowExecutor) remain unresolved and still apply. One new concern: the handoff workflow removes all terminal yield_output calls, meaning a termination condition that fires before any agent speaks (e.g., from user input alone) will cause the workflow to go idle with zero output events — consumers relying on at least one output would see silent completion.
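
Given the zero-output concern raised above, a consumer of a handoff stream might guard against an empty result. This is a sketch under stated assumptions: events are modeled as plain dicts with type and data keys, and last_output is a hypothetical helper, not part of the framework.

```python
def last_output(events):
    """Return the data of the last output event, or None if no agent spoke."""
    outputs = [event["data"] for event in events if event["type"] == "output"]
    return outputs[-1] if outputs else None


empty_run = []
normal_run = [
    {"type": "output", "data": "triage reply"},
    {"type": "output", "data": "specialist reply"},
]
print(last_output(empty_run))
print(last_output(normal_run))
```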

✓ Test Coverage

The PR introduces a significant refactoring of orchestration output contracts (from list[Message] to AgentResponse) and adds emit_intermediate_data wiring to surface per-agent responses as data events. Test coverage for sequential, group-chat, and magentic workflows is solid, with good new tests for non-streaming, streaming, intermediate outputs, and as_agent scenarios. However, there are notable gaps: ConcurrentBuilder's intermediate_outputs=True path is completely untested despite new wiring, the handoff async-termination test was weakened to a bare IDLE-state check, and sequential intermediate_outputs is only tested in non-streaming mode.

Automated review by moonbox3's agents

1. Sample cleanup: Remove commented-out FoundryChatClient block and update prerequisites to reference OPENAI_CHAT_MODEL_ID instead of FOUNDRY_* vars.
2. Sequential approval output: Change _EndWithConversation.end_with_agent_executor_response from a no-op sink to yield response.agent_response. When the last participant is AgentApprovalExecutor (via with_request_info), _EndWithConversation is the output executor so the yield produces the terminal answer. When the last participant is a regular AgentExecutor, _EndWithConversation is not in output_executors so the yield is silently filtered out.
3. Forward data events through WorkflowExecutor: _process_workflow_result now also forwards 'data' events from sub-workflows so that emit_intermediate_data=True on AgentExecutor works correctly when wrapped in AgentApprovalExecutor.
4. Concurrent docstring: Update _AggregateAgentConversations docstring to say 'deterministic participant order' instead of 'completion order'.
5. Add test_concurrent_intermediate_outputs_emits_data_events verifying that ConcurrentBuilder(intermediate_outputs=True) emits per-participant data events alongside the single aggregated output event.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
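
The filtering behavior described in item 2 can be sketched as below. The function surfaced_outputs and the executor ids are hypothetical; the sketch assumes, as the commit message states, that a yield only surfaces when its executor is listed in output_executors.

```python
def surfaced_outputs(yields, output_executors):
    """Keep only payloads yielded by executors listed in output_executors."""
    allowed = set(output_executors)
    return [payload for executor_id, payload in yields if executor_id in allowed]


# Regular AgentExecutor as last participant: the end node also yields, but is filtered out.
regular = surfaced_outputs(
    [("last_agent", "answer"), ("end_with_conversation", "answer")],
    ["last_agent"],
)

# Approval-wrapped last participant: the end node is the output executor.
approval = surfaced_outputs(
    [("end_with_conversation", "answer")],
    ["end_with_conversation"],
)
print(regular, approval)
```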

…outputs (microsoft#5301)

Address PR review comments 2, 3, and 5:
- Add test_sequential_request_info_last_participant_emits_output: Verifies that when the last participant is wrapped via with_request_info() (AgentApprovalExecutor), the workflow still emits a terminal output after approval, exercising the _EndWithConversation.end_with_agent_executor_response fallback path.
- Add test_sequential_request_info_with_intermediate_outputs_emits_data_events: Verifies that emit_intermediate_data=True works correctly through AgentApprovalExecutor wrapping; WorkflowExecutor._process_result already forwards data events from sub-workflows, so intermediate agent responses surface as data events in the parent workflow.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…#5301) Update cast() calls in _group_chat.py and _magentic.py to use WorkflowContext[Never, AgentResponse] instead of the old WorkflowContext[Never, list[Message]], matching the updated method signatures in _base_group_chat_orchestrator.py. Fix _sequential.py _EndWithConversation.end_with_agent_executor_response to declare WorkflowContext[Any, AgentResponse] so yield_output accepts AgentResponse[None]. Fix _workflow_executor.py data event forwarding to handle nullable executor_id. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Extract event.data into a typed local variable before the isinstance check to avoid pyright narrowing it to AgentResponse[Unknown]. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…icrosoft#5301) Add pyright: ignore[reportMissingImports] to orjson imports that are already guarded by try/except ImportError, matching the existing pattern used elsewhere in the samples. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
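
A sketch of the guarded optional import this commit refers to. The pyright ignore comment and the try/except ImportError guard come from the commit message; the fallback to the standard json module and the dumps helper are assumptions added for illustration.

```python
import json

try:
    import orjson  # pyright: ignore[reportMissingImports]
except ImportError:
    orjson = None  # optional dependency is absent; fall back to the standard library


def dumps(obj) -> str:
    """Serialize with orjson when it is installed, otherwise with json (assumed fallback)."""
    if orjson is not None:
        return orjson.dumps(obj).decode("utf-8")
    return json.dumps(obj)


encoded = dumps({"a": 1})
print(encoded)
```

The two libraries format whitespace differently, so callers should compare parsed values rather than raw strings.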

eavanvalkenburg

small nit on the env var names, otherwise good to go

Reverts the mistaken switch from FoundryChatClient to OpenAIChatClient in the sequential workflow as agent sample. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

… reasoning conversion

Layered on top of the prior review-feedback work in this branch.

Renames:
- AgentExecutor.emit_intermediate_data -> emit_data_events (mechanical rename; orchestration semantics live at the orchestration layer, not the general-purpose executor). Forwarded through MagenticAgentExecutor, AgentApprovalExecutor, and all orchestration call sites.
- HandoffAgentExecutor._check_terminate_and_yield -> _should_terminate (pure predicate; no longer yields anything). HandoffBuilder docstring rewritten to describe the new per-agent AgentResponse output contract.

WorkflowAgent reasoning-content conversion:
- Add _rewrite_text_to_reasoning(contents) and _msg_as_reasoning(msg) helpers; the as_agent() path now reframes text content from data events as text_reasoning Content blocks before merging into the AgentResponse.
- Consumers iterate msg.contents and branch on content.type, the same path they already use for Claude thinking and OpenAI reasoning. No new field on Message/AgentResponse/WorkflowEvent.
- Streaming branch constructs fresh AgentResponseUpdate instances instead of mutating shared payloads (regression test added).
- Helper _msg_maybe_reasoning consolidates the conditional rewrite at three call sites in the non-streaming conversion.

Tests:
- TestWorkflowAgentReasoningHelpers + TestWorkflowAgentDataEventReasoningConversion add 9 new tests covering helpers, non-streaming, streaming, mixed content, already-reasoning passthrough, and mutation-safety regression.
- Updated test_sequential_as_agent_with_intermediate_outputs_includes_chain to assert text_reasoning content for intermediate agents.

The streaming conversion path narrowed event.data via isinstance against generic AgentResponse, producing AgentResponse[Unknown] and tripping reportUnknownVariableType/reportUnknownMemberType. Binding data: Any before the check keeps runtime behavior identical while restoring a fully known type for downstream access.
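
The typing workaround described above might look like the following sketch. StubResponse and extract_text are hypothetical; the pattern shown is binding the value to an Any-typed local before the isinstance check, which the commit message says avoids narrowing to a partially unknown generic type. Runtime behavior is the same with or without the extra binding.

```python
from typing import Any, Generic, Optional, TypeVar

T = TypeVar("T")


class StubResponse(Generic[T]):
    """Hypothetical generic response; stands in for a generic framework type."""

    def __init__(self, text: str, value: Optional[T] = None) -> None:
        self.text = text
        self.value = value


def extract_text(event_data: object) -> Optional[str]:
    # Bind to Any first so the isinstance check below does not narrow the
    # variable to StubResponse[Unknown] in the type checker's view.
    data: Any = event_data
    if isinstance(data, StubResponse):
        return data.text
    return None


print(extract_text(StubResponse("hi")))
print(extract_text("not a response"))
```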

Fix orchestration outputs so as_agent() returns the final answer only. Align other orchestration outputs

5a5133c

moonbox3 self-assigned this Apr 16, 2026

Copilot AI review requested due to automatic review settings April 16, 2026 06:28

moonbox3 added the python label Apr 16, 2026

Copilot started reviewing on behalf of moonbox3 April 16, 2026 06:29

Copilot AI reviewed Apr 16, 2026

moonbox3 commented Apr 16, 2026

Comment thread python/packages/orchestrations/tests/test_concurrent.py

Copilot and others added 7 commits April 16, 2026 08:39

Merge remote-tracking branch 'upstream/main' into improve-orchestration-outputs

240e307

Fix pyright reportUnknownVariableType in _agent.py (microsoft#5301)

675afe0

Address review feedback for microsoft#5301: review comment fixes

28cf71f

moonbox3 requested review from TaoChenOSU, eavanvalkenburg and giles17 April 16, 2026 10:21

eavanvalkenburg approved these changes Apr 16, 2026

Comment thread python/samples/03-workflows/agents/sequential_workflow_as_agent.py Outdated

Comment thread python/samples/03-workflows/agents/sequential_workflow_as_agent.py Outdated

Copilot added 2 commits April 16, 2026 10:46

Merge remote-tracking branch 'upstream/main' into improve-orchestration-outputs

e3057e1

Address review feedback for microsoft#5301: review comment fixes

09a12fe

moonbox3 commented Apr 16, 2026

Comment thread python/samples/03-workflows/agents/sequential_workflow_as_agent.py

Revert sequential_workflow_as_agent sample to FoundryChatClient

cec1993

lokitoth added the needs_port_to_dotnet label (Indicate this item needs to also be done for .Net) Apr 16, 2026

TaoChenOSU reviewed Apr 16, 2026

Comment thread python/packages/core/agent_framework/_workflows/_agent_executor.py Outdated

TaoChenOSU reviewed Apr 16, 2026

Comment thread python/packages/core/agent_framework/_workflows/_agent.py

TaoChenOSU reviewed Apr 16, 2026

Comment thread python/packages/core/agent_framework/_workflows/_agent.py

TaoChenOSU reviewed Apr 16, 2026

Comment thread python/packages/orchestrations/agent_framework_orchestrations/_sequential.py

moonbox3 requested a review from TaoChenOSU April 17, 2026 04:05

moonbox3 wants to merge 13 commits into microsoft:main from moonbox3:improve-orchestration-outputs


5 participants
