fix(auth): coerce empty-string optional URL fields to None in OAuthClientMetadata#2404
Conversation
There was a problem hiding this comment.
LGTM — clean, well-tested Postel's-law fix that exactly mirrors the existing normalize_token_type pattern in the same file.
Extended reasoning...
Overview
The PR adds a single field_validator on OAuthClientMetadata that coerces empty-string values to None for five optional URL fields (client_uri, logo_uri, tos_uri, policy_uri, jwks_uri). Because OAuthClientInformationFull inherits from OAuthClientMetadata, the coercion automatically covers parsed DCR responses. The diff touches two files: the model definition and its test suite.
Security risks
None. The five fields affected are informational metadata; they are not used in any authentication or authorization decision. The critical redirect_uris field is intentionally excluded. The validator only widens accepted input for values that were previously rejected with a ValidationError (i.e., ""), and the negative-case test confirms that non-empty invalid URLs still raise.
Level of scrutiny
Low. This is a small, non-breaking compatibility shim consistent with the existing normalize_token_type validator already present in the same file. The pattern is idiomatic Pydantic v2 (mode="before", @classmethod). Test coverage is thorough: parametrized per-field, all-at-once, valid-URL passthrough, subclass inheritance, and a negative case.
Other factors
No outstanding reviewer comments; no prior reviews from any reviewer. The bug hunting system found no issues. This is a direct fix for a real-world interoperability problem observed with multiple production OAuth servers.
maxisbey
left a comment
There was a problem hiding this comment.
Two test-organization asks:
-
File location — these are pure model-validation tests for
src/mcp/shared/auth.py, so per the tests-mirror-source convention they belong intests/shared/test_auth.pyrather thantests/client/test_auth.py. I know the existingOAuthClientMetadatatests are already in the client file, but since this is a new self-contained block it's a good chance to land it in the right place rather than grow the misplacement. -
Drop the
Test...class — project convention is plain top-level functions instead ofTest-prefixed classes. The parametrize and the docstring carry over fine as a module-level comment + standalone functions.
There was a problem hiding this comment.
LGTM — clean, well-tested Postel's law fix.
Extended reasoning...
Overview
This PR adds a single (mode=before) to that coerces empty-string values to for five optional URL fields (, , , , ). The change is tiny (18 lines of production code) and touches only the auth model and its tests.
Security Risks
None. The affected fields are informational metadata URLs that are currently unused by the SDK. The coercion only widens accepted input from rejected to None — it does not bypass any authentication, authorization, or cryptographic check.
Level of Scrutiny
Low. This is a straightforward compatibility shim at a deserialization boundary, matching the existing normalize_token_type validator pattern already present in the same file. The logic is a single equality check against an empty string.
Other Factors
Tests are comprehensive: per-field parametrized coverage, all-fields-together, valid URL passthrough, inheritance through OAuthClientInformationFull, and a negative case confirming non-empty invalid URLs still raise. No existing behavior changes; this is purely additive.
438e176 to
bfc7da3
Compare
There was a problem hiding this comment.
LGTM — clean, well-tested bug fix.
Extended reasoning...
Overview
The PR touches three files: a doc clarification in CLAUDE.md, a new field_validator on OAuthClientMetadata in src/mcp/shared/auth.py, and corresponding tests in tests/shared/test_auth.py. The validator coerces empty strings to None for five optional URL fields (client_uri, logo_uri, tos_uri, policy_uri, jwks_uri), matching the same pattern as the pre-existing normalize_token_type validator on OAuthToken in the same file.
Security risks
None. The change only widens accepted input at deserialization — previously-rejected empty strings now parse as None. Valid URLs, None, and omitted keys are unaffected. Non-empty invalid URLs still raise ValidationError (confirmed by test). No auth logic or permissions are modified.
Level of scrutiny
Low. This is a targeted, non-breaking compatibility fix at the deserialization boundary. It is self-contained, follows an established codebase pattern, and does not touch any security-sensitive logic paths (no auth decisions, token handling, or permission checks).
Other factors
Test coverage is thorough: parametrized per-field, all-fields-together, valid-URL passthrough, inheritance through OAuthClientInformationFull, and a negative case. No bugs were found by the automated bug hunting system. The CLAUDE.md update is a minor clarification with no behavioral impact.
There was a problem hiding this comment.
LGTM — clean, well-tested Postel's-law fix following the existing normalize_token_type pattern.
Extended reasoning...
Overview
The PR modifies three files: src/mcp/shared/auth.py (adds a field_validator to OAuthClientMetadata), tests/shared/test_auth.py (adds 5 targeted test functions), and AGENTS.md (minor wording clarification on test-class naming). The scope is small and self-contained.
Security risks
None introduced. The change widens accepted input ("" → None) at a deserialization boundary, which is strictly less restrictive than before. It cannot cause auth bypass or data exposure — the coercion only applies to informational/display-URL fields that the SDK stores but doesn't act on for security decisions.
Level of scrutiny
Low. This is a compatibility shim at a parse boundary, matching an identical existing validator (normalize_token_type) in the same file. The logic is a single equality check. Test coverage is thorough: parametrized per-field, all-together, valid-URL passthrough, subclass inheritance, and a negative case for non-empty invalid URLs.
Other factors
The pre-existing gap in OAuthMetadata and ProtectedResourceMetadata flagged by the bug hunter is real but not introduced by this PR — it predates it. That gap warrants a follow-up, not a block here. No outstanding reviewer comments; no CODEOWNERS conflict; changes are non-breaking by construction.
| @field_validator( | ||
| "client_uri", | ||
| "logo_uri", | ||
| "tos_uri", | ||
| "policy_uri", | ||
| "jwks_uri", | ||
| mode="before", | ||
| ) | ||
| @classmethod | ||
| def _empty_string_optional_url_to_none(cls, v: object) -> object: | ||
| # RFC 7591 §2 marks these URL fields OPTIONAL. Some authorization servers | ||
| # echo omitted metadata back as "" instead of dropping the keys, which | ||
| # AnyHttpUrl would otherwise reject — throwing away an otherwise valid | ||
| # registration response. Treat "" as absent. | ||
| if v == "": | ||
| return None | ||
| return v |
There was a problem hiding this comment.
🟣 This PR correctly fixes empty-string coercion for OAuthClientMetadata, but OAuthMetadata (RFC 8414) and ProtectedResourceMetadata (RFC 9728) in the same file have the identical failure mode for their optional URL fields — a pre-existing gap now made visible by the inconsistency. The same non-compliant servers that echo "" in DCR responses could equally send "" in their /.well-known/oauth-authorization-server or protected resource metadata responses, causing ValidationError when parsing those server-returned payloads.
Extended reasoning...
What the bug is and how it manifests
The PR correctly identifies that some real-world OAuth servers echo omitted optional URL fields as empty strings rather than dropping the keys, and that Pydantic's AnyHttpUrl rejects "" — causing ValidationError at parse time. The fix (a before-mode field_validator that maps "" → None) is applied to OAuthClientMetadata, but two sibling models in the same file are left unpatched:
- OAuthMetadata (RFC 8414 AS metadata): optional URL fields registration_endpoint, service_documentation, op_policy_uri, op_tos_uri, revocation_endpoint, introspection_endpoint
- ProtectedResourceMetadata (RFC 9728): optional URL fields jwks_uri, resource_documentation, resource_policy_uri, resource_tos_uri
The specific code path that triggers it
When the MCP client fetches /.well-known/oauth-authorization-server (or equivalent), the response JSON is parsed via OAuthMetadata.model_validate(). If the server returns {"issuer": "https://example.com", "authorization_endpoint": "...", "token_endpoint": "...", "registration_endpoint": ""}, Pydantic's AnyHttpUrl validator fires on registration_endpoint before any field_validator runs (there is none), sees an empty string, and raises ValidationError — even though registration_endpoint is marked OPTIONAL in RFC 8414 §2. The same applies to the ProtectedResourceMetadata parse at the protected resource metadata endpoint.
Why existing code doesn't prevent it
OAuthMetadata and ProtectedResourceMetadata have no field_validator at all for their optional AnyHttpUrl fields. The PR adds one only to OAuthClientMetadata. Pydantic's default behavior for AnyHttpUrl | None fields is to accept None or a valid URL; an empty string is neither, so it raises. There is no upstream fallback.
Impact
Any MCP client talking to a non-compliant-but-widely-deployed authorization server that echoes "" for omitted AS-metadata or resource-metadata fields will fail during the discovery phase — before client registration even begins. This is a harder failure than the DCR failure fixed by the PR, because it blocks all OAuth flows, not just re-registration.
How to fix it
Apply the same field_validator pattern used for OAuthClientMetadata to OAuthMetadata (covering registration_endpoint, service_documentation, op_policy_uri, op_tos_uri, revocation_endpoint, introspection_endpoint) and to ProtectedResourceMetadata (covering jwks_uri, resource_documentation, resource_policy_uri, resource_tos_uri).
Step-by-step proof
- A server returns HTTP 200 with body: {"issuer": "https://as.example.com", "authorization_endpoint": "https://as.example.com/auth", "token_endpoint": "https://as.example.com/token", "registration_endpoint": ""}
- Client calls OAuthMetadata.model_validate(response_json)
- Pydantic processes field registration_endpoint with value ""
- Field type is AnyHttpUrl | None; "" is not None and not a valid URL
- ValidationError is raised: registration_endpoint → URL input should be a string or URL
- The MCP client's OAuth discovery flow fails, and no connection can be established
Addressing the refutation
The refuting verifier correctly notes the PR does not introduce this bug — it pre-dates the PR. However, this PR directly modifies auth.py and adds the exact fix pattern needed. The PR's description explicitly cites Postel's law and real-world non-compliant servers as motivation. Those same servers' AS metadata and resource metadata endpoints are equally likely to echo "" for optional fields. The code review is the appropriate venue to flag that the fix was applied incompletely across the three related models in the same file.
Coerce
""toNonefor the five OPTIONAL URL fields onOAuthClientMetadataso that DCR responses from servers that echo omitted metadata as empty strings parse successfully.Motivation and Context
RFC 7591 §2 marks
client_uri,logo_uri,tos_uri,policy_uri, andjwks_urias OPTIONAL. Some authorization servers in the wild (observed: Udemy, deepsense.ai, Apify, windsor.ai, firecrawl, and others) echo the client's omitted metadata back as""instead of dropping the keys:{"client_id": "abc123", "client_uri": "", "logo_uri": "", "tos_uri": "", "policy_uri": "", ...}AnyHttpUrlrejects"", sohandle_registration_responseraisesValidationErrorand callers discard an otherwise valid registration — the server returned 201 with a realclient_id.The servers are technically non-compliant (an empty string is not a valid URL), but the only reasonable interpretation of
""for an OPTIONAL field is "absent". This applies Postel's law at the deserialization boundary, matching the existingnormalize_token_typevalidator onOAuthTokenin the same file. BecauseOAuthClientInformationFullinherits fromOAuthClientMetadata, the coercion applies to parsed DCR responses as well.How Has This Been Tested?
New
TestOAuthClientMetadataEmptyUrlCoercioncovers per-field coercion (parametrized over all five fields), all-fields-together, valid-URL passthrough, inheritance throughOAuthClientInformationFull(the DCR-response model), and a negative case confirming non-empty invalid URLs still raise.Breaking Changes
None. This only widens accepted input — previously-rejected
""values now succeed asNone. Valid URLs,None, and omitted keys are unchanged.Types of changes
Checklist
Additional context
RFC 7591 §2: https://datatracker.ietf.org/doc/html/rfc7591#section-2
v1.x backport: #2405