fix(opencode): ensure tool-call/tool-result pairing after normalizeMessages#22001
fix(opencode): ensure tool-call/tool-result pairing after normalizeMessages#22001bvironn wants to merge 1 commit intoanomalyco:devfrom
Conversation
|
The following comment was made by an LLM, it may be inaccurate: Great! I found several related PRs. Let me identify the most relevant ones: Related PRs Found
These are not exact duplicates of PR #22001, but they address related problems in the tool-call/tool-result pairing system. PR #22001 appears to be a comprehensive fix that consolidates lessons from these previous issues. |
|
Thanks for updating your PR! It now meets our contributing guidelines. 👍 |
There was a problem hiding this comment.
Pull request overview
This PR adds a post-normalization safety check to prevent Anthropic-style tool-call history corruption by ensuring every tool-call has a corresponding tool-result, even when upstream transforms drop/alter messages.
Changes:
- Added
ProviderTransform.ensureToolIntegrity()to detect orphanedtool-calls and inject synthetic errortool-results. - Wired
ensureToolIntegrity()intoProviderTransform.message()afternormalizeMessages()for Anthropic/Bedrock SDKs. - Expanded/updated provider transform tests to cover multiple orphan scenarios and the full normalize→repair pipeline.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/opencode/src/provider/transform.ts | Introduces ensureToolIntegrity() and applies it after normalizeMessages() for specific providers. |
| packages/opencode/test/provider/transform.test.ts | Updates an existing Anthropic normalization test expectation and adds a new test suite covering integrity repair cases. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| export function message(msgs: ModelMessage[], model: Provider.Model, options: Record<string, unknown>) { | ||
| msgs = unsupportedParts(msgs, model) | ||
| msgs = normalizeMessages(msgs, model, options) | ||
| if (model.api.npm === "@ai-sdk/anthropic" || model.api.npm === "@ai-sdk/amazon-bedrock") { |
There was a problem hiding this comment.
The tool integrity repair is only enabled for models using @ai-sdk/anthropic and @ai-sdk/amazon-bedrock. This omits the Vertex Anthropic provider (@ai-sdk/google-vertex/anthropic / providerID google-vertex-anthropic), which is treated as Anthropic elsewhere in this module and likely has the same tool-call/tool-result pairing requirement. Consider including that provider in the condition so Vertex Anthropic sessions can’t be corrupted the same way.
| if (model.api.npm === "@ai-sdk/anthropic" || model.api.npm === "@ai-sdk/amazon-bedrock") { | |
| if ( | |
| model.api.npm === "@ai-sdk/anthropic" || | |
| model.api.npm === "@ai-sdk/amazon-bedrock" || | |
| model.providerID === "google-vertex-anthropic" | |
| ) { |
There was a problem hiding this comment.
Already covered — the condition at line 332 includes model.providerID === "google-vertex-anthropic" alongside the npm checks for @ai-sdk/anthropic and @ai-sdk/amazon-bedrock. This matches the same triple-check pattern used by applyCaching below (lines 336-347).
The npm package for Vertex Anthropic is @ai-sdk/google-vertex/anthropic (a subpath export), so checking model.api.npm alone would miss it — that's why the providerID fallback is there.
a7b2dae to
8521eb6
Compare
CI Note: E2E timeouts are unrelated to this changeBoth This PR only modifies the message transform pipeline ( Happy to re-trigger if needed, but this appears to be a runner infrastructure issue rather than a regression. |
|
Here's a corrupted session: https://gist.github.com/bvironn/b513c1eafae54291890355ddc7fd1941 Claude Opus 4.6, no interruption — the corruption happens on its own during normal tool execution. The agent made 5 tool calls across 2 steps (mem_search, glob, read), all completed successfully, but 2 tool_results were missing from the ModelMessage[] by the time the next request was sent. |
Issue for this PR
Closes #21326
Type of change
What does this PR do?
Adds
ensureToolIntegrity()in the provider transform pipeline, running afternormalizeMessagesfor Anthropic/Bedrock/Vertex-Anthropic. It detects orphanedtool-callparts (no matchingtool-result) and injects synthetic error results to keep the session alive.Why here?
normalizeMessagescan drop messages with empty content, breaking tool-call/tool-result pairs. By the time the request leavesProviderTransform.message(), orphaned tool-calls cause a permanent 400 from Anthropic — the corrupted history replays from SQLite on every retry, making the session unrecoverable.The function is a no-op when pairs are intact (single pass, Map+Set lookups, zero allocation on happy path). It catches corruption from 6 independent vectors in the pipeline —
normalizeMessagesfiltering (V1), error-skip logic intoModelMessages(V2), lost step boundaries during retry (V3), tool-error race conditions (V4),filterCompactedcutting pairs (V5), and the AI SDK producing empty assistant messages (V6). Full analysis in #21326 comment.How did you verify your code works?
ProviderTransform.message()bun typecheckcleanChecklist