Skip to content

fix(auth): coerce empty-string optional URL fields to None in OAuthClientMetadata#2404

Merged
felixweinberger merged 3 commits intomainfrom
fweinberger/dcr-empty-url-coerce
Apr 13, 2026
Merged

fix(auth): coerce empty-string optional URL fields to None in OAuthClientMetadata#2404
felixweinberger merged 3 commits intomainfrom
fweinberger/dcr-empty-url-coerce

Conversation

@felixweinberger
Copy link
Copy Markdown
Contributor

@felixweinberger felixweinberger commented Apr 8, 2026

Coerce "" to None for the five OPTIONAL URL fields on OAuthClientMetadata so that DCR responses from servers that echo omitted metadata as empty strings parse successfully.

Motivation and Context

RFC 7591 §2 marks client_uri, logo_uri, tos_uri, policy_uri, and jwks_uri as OPTIONAL. Some authorization servers in the wild (observed: Udemy, deepsense.ai, Apify, windsor.ai, firecrawl, and others) echo the client's omitted metadata back as "" instead of dropping the keys:

{"client_id": "abc123", "client_uri": "", "logo_uri": "", "tos_uri": "", "policy_uri": "", ...}

AnyHttpUrl rejects "", so handle_registration_response raises ValidationError and callers discard an otherwise valid registration — the server returned 201 with a real client_id.

The servers are technically non-compliant (an empty string is not a valid URL), but the only reasonable interpretation of "" for an OPTIONAL field is "absent". This applies Postel's law at the deserialization boundary, matching the existing normalize_token_type validator on OAuthToken in the same file. Because OAuthClientInformationFull inherits from OAuthClientMetadata, the coercion applies to parsed DCR responses as well.

How Has This Been Tested?

New TestOAuthClientMetadataEmptyUrlCoercion covers per-field coercion (parametrized over all five fields), all-fields-together, valid-URL passthrough, inheritance through OAuthClientInformationFull (the DCR-response model), and a negative case confirming non-empty invalid URLs still raise.

Breaking Changes

None. This only widens accepted input — previously-rejected "" values now succeed as None. Valid URLs, None, and omitted keys are unchanged.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

RFC 7591 §2: https://datatracker.ietf.org/doc/html/rfc7591#section-2

v1.x backport: #2405

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean, well-tested Postel's-law fix that exactly mirrors the existing normalize_token_type pattern in the same file.

Extended reasoning...

Overview

The PR adds a single field_validator on OAuthClientMetadata that coerces empty-string values to None for five optional URL fields (client_uri, logo_uri, tos_uri, policy_uri, jwks_uri). Because OAuthClientInformationFull inherits from OAuthClientMetadata, the coercion automatically covers parsed DCR responses. The diff touches two files: the model definition and its test suite.

Security risks

None. The five fields affected are informational metadata; they are not used in any authentication or authorization decision. The critical redirect_uris field is intentionally excluded. The validator only widens accepted input for values that were previously rejected with a ValidationError (i.e., ""), and the negative-case test confirms that non-empty invalid URLs still raise.

Level of scrutiny

Low. This is a small, non-breaking compatibility shim consistent with the existing normalize_token_type validator already present in the same file. The pattern is idiomatic Pydantic v2 (mode="before", @classmethod). Test coverage is thorough: parametrized per-field, all-at-once, valid-URL passthrough, subclass inheritance, and a negative case.

Other factors

No outstanding reviewer comments; no prior reviews from any reviewer. The bug hunting system found no issues. This is a direct fix for a real-world interoperability problem observed with multiple production OAuth servers.

Copy link
Copy Markdown
Contributor

@maxisbey maxisbey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two test-organization asks:

  1. File location — these are pure model-validation tests for src/mcp/shared/auth.py, so per the tests-mirror-source convention they belong in tests/shared/test_auth.py rather than tests/client/test_auth.py. I know the existing OAuthClientMetadata tests are already in the client file, but since this is a new self-contained block it's a good chance to land it in the right place rather than grow the misplacement.

  2. Drop the Test... class — project convention is plain top-level functions instead of Test-prefixed classes. The parametrize and the docstring carry over fine as a module-level comment + standalone functions.

AI Disclaimer

maxisbey
maxisbey previously approved these changes Apr 13, 2026
Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean, well-tested Postel's law fix.

Extended reasoning...

Overview

This PR adds a single (mode=before) to that coerces empty-string values to for five optional URL fields (, , , , ). The change is tiny (18 lines of production code) and touches only the auth model and its tests.

Security Risks

None. The affected fields are informational metadata URLs that are currently unused by the SDK. The coercion only widens accepted input from rejected to None — it does not bypass any authentication, authorization, or cryptographic check.

Level of Scrutiny

Low. This is a straightforward compatibility shim at a deserialization boundary, matching the existing normalize_token_type validator pattern already present in the same file. The logic is a single equality check against an empty string.

Other Factors

Tests are comprehensive: per-field parametrized coverage, all-fields-together, valid URL passthrough, inheritance through OAuthClientInformationFull, and a negative case confirming non-empty invalid URLs still raise. No existing behavior changes; this is purely additive.

@felixweinberger felixweinberger force-pushed the fweinberger/dcr-empty-url-coerce branch from 438e176 to bfc7da3 Compare April 13, 2026 14:25
Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean, well-tested bug fix.

Extended reasoning...

Overview

The PR touches three files: a doc clarification in CLAUDE.md, a new field_validator on OAuthClientMetadata in src/mcp/shared/auth.py, and corresponding tests in tests/shared/test_auth.py. The validator coerces empty strings to None for five optional URL fields (client_uri, logo_uri, tos_uri, policy_uri, jwks_uri), matching the same pattern as the pre-existing normalize_token_type validator on OAuthToken in the same file.

Security risks

None. The change only widens accepted input at deserialization — previously-rejected empty strings now parse as None. Valid URLs, None, and omitted keys are unaffected. Non-empty invalid URLs still raise ValidationError (confirmed by test). No auth logic or permissions are modified.

Level of scrutiny

Low. This is a targeted, non-breaking compatibility fix at the deserialization boundary. It is self-contained, follows an established codebase pattern, and does not touch any security-sensitive logic paths (no auth decisions, token handling, or permission checks).

Other factors

Test coverage is thorough: parametrized per-field, all-fields-together, valid-URL passthrough, inheritance through OAuthClientInformationFull, and a negative case. No bugs were found by the automated bug hunting system. The CLAUDE.md update is a minor clarification with no behavioral impact.

Copy link
Copy Markdown
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — clean, well-tested Postel's-law fix following the existing normalize_token_type pattern.

Extended reasoning...

Overview

The PR modifies three files: src/mcp/shared/auth.py (adds a field_validator to OAuthClientMetadata), tests/shared/test_auth.py (adds 5 targeted test functions), and AGENTS.md (minor wording clarification on test-class naming). The scope is small and self-contained.

Security risks

None introduced. The change widens accepted input ("" → None) at a deserialization boundary, which is strictly less restrictive than before. It cannot cause auth bypass or data exposure — the coercion only applies to informational/display-URL fields that the SDK stores but doesn't act on for security decisions.

Level of scrutiny

Low. This is a compatibility shim at a parse boundary, matching an identical existing validator (normalize_token_type) in the same file. The logic is a single equality check. Test coverage is thorough: parametrized per-field, all-together, valid-URL passthrough, subclass inheritance, and a negative case for non-empty invalid URLs.

Other factors

The pre-existing gap in OAuthMetadata and ProtectedResourceMetadata flagged by the bug hunter is real but not introduced by this PR — it predates it. That gap warrants a follow-up, not a block here. No outstanding reviewer comments; no CODEOWNERS conflict; changes are non-breaking by construction.

Comment on lines +70 to +86
@field_validator(
"client_uri",
"logo_uri",
"tos_uri",
"policy_uri",
"jwks_uri",
mode="before",
)
@classmethod
def _empty_string_optional_url_to_none(cls, v: object) -> object:
# RFC 7591 §2 marks these URL fields OPTIONAL. Some authorization servers
# echo omitted metadata back as "" instead of dropping the keys, which
# AnyHttpUrl would otherwise reject — throwing away an otherwise valid
# registration response. Treat "" as absent.
if v == "":
return None
return v
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟣 This PR correctly fixes empty-string coercion for OAuthClientMetadata, but OAuthMetadata (RFC 8414) and ProtectedResourceMetadata (RFC 9728) in the same file have the identical failure mode for their optional URL fields — a pre-existing gap now made visible by the inconsistency. The same non-compliant servers that echo "" in DCR responses could equally send "" in their /.well-known/oauth-authorization-server or protected resource metadata responses, causing ValidationError when parsing those server-returned payloads.

Extended reasoning...

What the bug is and how it manifests

The PR correctly identifies that some real-world OAuth servers echo omitted optional URL fields as empty strings rather than dropping the keys, and that Pydantic's AnyHttpUrl rejects "" — causing ValidationError at parse time. The fix (a before-mode field_validator that maps "" → None) is applied to OAuthClientMetadata, but two sibling models in the same file are left unpatched:

  • OAuthMetadata (RFC 8414 AS metadata): optional URL fields registration_endpoint, service_documentation, op_policy_uri, op_tos_uri, revocation_endpoint, introspection_endpoint
  • ProtectedResourceMetadata (RFC 9728): optional URL fields jwks_uri, resource_documentation, resource_policy_uri, resource_tos_uri

The specific code path that triggers it

When the MCP client fetches /.well-known/oauth-authorization-server (or equivalent), the response JSON is parsed via OAuthMetadata.model_validate(). If the server returns {"issuer": "https://example.com", "authorization_endpoint": "...", "token_endpoint": "...", "registration_endpoint": ""}, Pydantic's AnyHttpUrl validator fires on registration_endpoint before any field_validator runs (there is none), sees an empty string, and raises ValidationError — even though registration_endpoint is marked OPTIONAL in RFC 8414 §2. The same applies to the ProtectedResourceMetadata parse at the protected resource metadata endpoint.

Why existing code doesn't prevent it

OAuthMetadata and ProtectedResourceMetadata have no field_validator at all for their optional AnyHttpUrl fields. The PR adds one only to OAuthClientMetadata. Pydantic's default behavior for AnyHttpUrl | None fields is to accept None or a valid URL; an empty string is neither, so it raises. There is no upstream fallback.

Impact

Any MCP client talking to a non-compliant-but-widely-deployed authorization server that echoes "" for omitted AS-metadata or resource-metadata fields will fail during the discovery phase — before client registration even begins. This is a harder failure than the DCR failure fixed by the PR, because it blocks all OAuth flows, not just re-registration.

How to fix it

Apply the same field_validator pattern used for OAuthClientMetadata to OAuthMetadata (covering registration_endpoint, service_documentation, op_policy_uri, op_tos_uri, revocation_endpoint, introspection_endpoint) and to ProtectedResourceMetadata (covering jwks_uri, resource_documentation, resource_policy_uri, resource_tos_uri).

Step-by-step proof

  1. A server returns HTTP 200 with body: {"issuer": "https://as.example.com", "authorization_endpoint": "https://as.example.com/auth", "token_endpoint": "https://as.example.com/token", "registration_endpoint": ""}
  2. Client calls OAuthMetadata.model_validate(response_json)
  3. Pydantic processes field registration_endpoint with value ""
  4. Field type is AnyHttpUrl | None; "" is not None and not a valid URL
  5. ValidationError is raised: registration_endpoint → URL input should be a string or URL
  6. The MCP client's OAuth discovery flow fails, and no connection can be established

Addressing the refutation

The refuting verifier correctly notes the PR does not introduce this bug — it pre-dates the PR. However, this PR directly modifies auth.py and adds the exact fix pattern needed. The PR's description explicitly cites Postel's law and real-world non-compliant servers as motivation. Those same servers' AS metadata and resource metadata endpoints are equally likely to echo "" for optional fields. The code review is the appropriate venue to flag that the fix was applied incompletely across the three related models in the same file.

@felixweinberger felixweinberger merged commit 2dfb51a into main Apr 13, 2026
55 of 56 checks passed
@felixweinberger felixweinberger deleted the fweinberger/dcr-empty-url-coerce branch April 13, 2026 14:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants