Security: Hardening against RCE via Signed Msgpack Migration and Path Sanitization #6589

Open
JoshuaProvoste wants to merge 6 commits into googleapis:main from JoshuaProvoste:security/fix-rce-msgpack-migration

@JoshuaProvoste

Compliance with CONTRIBUTING.md:

Consistent with the project guidelines, this PR includes:

  • Security Hardening: Implemented a "Double Lock" architecture (Signed Msgpack + Path Sanitization) to eliminate multiple RCE vectors.
  • Verified Fix: Successfully reproduced the RCE vulnerabilities (Vector #2: Staging Bucket Spec Injection and Vector #3: UNC/SMB Path Injection) and verified their neutralization by the new security infrastructure.
  • Code Hygiene: Enforced Google-specific code style using isort and pyink across all modified files.
  • Dependency Management: Updated setup.py to include msgpack as a core security requirement.

Description

This PR addresses critical security vulnerabilities in the google-cloud-aiplatform SDK involving insecure deserialization and path injection. The current implementation relies on pickle and cloudpickle for loading model artifacts and agent states, which allows for Remote Code Execution (RCE) via malicious payloads or SMB/UNC path redirection.
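
To illustrate why pickle counts as an execution-capable format, here is a minimal and deliberately harmless sketch. An object's `__reduce__` can name any importable callable, and `pickle.loads` invokes it during deserialization. The class name `Payload` is hypothetical, the callable is the built-in `len`, and JSON stands in for a data-only format so the example needs only the standard library.

```python
import json
import pickle


class Payload:
    """Hypothetical object whose pickled form means: call len("abc")."""

    def __reduce__(self):
        # pickle stores (callable, args); loads() later calls callable(*args).
        return (len, ("abc",))


blob = pickle.dumps(Payload())

# The call to len("abc") happens here, inside loads(), not in our own code.
result = pickle.loads(blob)

# A data-only parser returns plain values and never calls anything that the
# input happens to name.
data = json.loads('{"callable": "len", "args": ["abc"]}')
```

Here `result` is the return value of the call rather than a `Payload` instance, while `data` stays an inert dictionary.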

To remediate this, I have transitioned the SDK to a "Secure by Default" architecture using Signed Msgpack. All artifacts are now verified via HMAC-SHA256 signatures before processing, and networked paths are strictly sanitized to block external injection vectors.

Key Benefits:

  1. System Integrity: HMAC-SHA256 signatures ensure that only artifacts generated by the SDK (or authorized keys) can be loaded.
  2. RCE Neutralization: Passive data-only parsing via msgpack replaces execution-capable formats like pickle.
  3. Path Security: Strict URI validation prevents NTLM theft and involuntary loading from remote malicious shares.
  4. Modern Standards: Aligns Vertex AI with industry best practices for artifact security and integrity.
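
The signing step behind the first two benefits can be sketched with the standard library alone. This is an illustration under assumptions, not the code in this PR: the names `sign_blob` and `verify_blob` come from the description, but their signatures, the `signature || payload` envelope layout, and the key handling shown here are hypothetical.

```python
import hashlib
import hmac

SIG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign_blob(payload: bytes, key: bytes) -> bytes:
    """Return signature || payload (assumed envelope layout)."""
    sig = hmac.new(key, payload, hashlib.sha256).digest()
    return sig + payload


def verify_blob(blob: bytes, key: bytes) -> bytes:
    """Return the payload if its signature matches, else raise RuntimeError."""
    if len(blob) < SIG_LEN:
        raise RuntimeError("blob too short to contain a signature")
    sig, payload = blob[:SIG_LEN], blob[SIG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(sig, expected):
        raise RuntimeError("signature mismatch; refusing to parse payload")
    return payload


key = b"demo-key-not-for-production"  # placeholder key for the example
raw = b"\x81\xa1k\xa1v"  # msgpack encoding of {"k": "v"}
signed = sign_blob(raw, key)
```

In this layout, a parser such as `msgpack.unpackb` would only ever see the bytes returned by `verify_blob`, so unsigned or tampered input is rejected before any decoding happens.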

Technical Implementation Details

  • google/cloud/aiplatform/utils/security_utils.py:
    • Created a new core security module for sign_blob and verify_blob using HMAC-SHA256.
    • Implemented validate_uri to block UNC/SMB paths (e.g., gs:////attacker/share).
  • google/cloud/aiplatform/prediction/:
    • Refactored SklearnPredictor and XgboostPredictor to prioritize model.msgpack and explicitly block insecure .pkl and .joblib sinks.
  • vertexai/agent_engines/ & vertexai/reasoning_engines/:
    • Removed cloudpickle dependency and replaced it with a signed msgpack manifest for state persistence.
  • google/cloud/aiplatform/utils/gcs_utils.py:
    • Integrated validate_uri into the central GCS path validation logic to protect all download operations.
  • setup.py:
    • Added msgpack >= 1.0.0 to install_requires.
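
A path check in the spirit of `validate_uri` might look like the sketch below. The function name is taken from the list above; the accepted scheme, the individual rules, and the use of `ValueError` are assumptions made for illustration and may differ from the module in this PR.

```python
from urllib.parse import urlsplit


def validate_uri(uri: str) -> str:
    """Return uri unchanged if it looks like a plain gs://bucket/object path."""
    # Backslashes suggest a Windows UNC path such as \\host\share.
    if "\\" in uri:
        raise ValueError(f"backslashes are not allowed: {uri!r}")
    parts = urlsplit(uri)
    # Only the Cloud Storage scheme is accepted in this sketch.
    if parts.scheme != "gs":
        raise ValueError(f"unsupported scheme in {uri!r}")
    # gs:////attacker/share parses with an empty bucket name.
    if not parts.netloc:
        raise ValueError(f"missing bucket name in {uri!r}")
    # An empty path segment could be reinterpreted as //host/share downstream.
    if "//" in parts.path:
        raise ValueError(f"empty path segment in {uri!r}")
    return uri


ok = validate_uri("gs://my-bucket/models/model.msgpack")
```

Rejecting by allowlist (one scheme, a non-empty bucket, no empty segments) rather than by matching known bad prefixes keeps the check small and means the `gs:////attacker/share` form fails before any network request is issued.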

Verification Performed

  • Reproduced Vulnerabilities:
    • Confirmed that a malicious staging_bucket parameter could trigger RCE via pickle injection (Vector #2).
    • Confirmed that a crafted UNC path in AIP_STORAGE_URI could leak NTLM hashes (Vector #3).
  • Validated Fix:
    • Verified that the new gate rejects unsigned or malformed msgpack payloads with a RuntimeError.
    • Confirmed that model precision and agent state reconstruction are preserved with 100% fidelity.
    • Verified that UNC/SMB paths are blocked at the SDK level before any network request is made.
  • Code Quality: Successfully ran isort and pyink on all reformatted files.

Checklist

  • Followed project guidelines for code hygiene.
  • Includes security-focused refactoring of critical sinks.
  • Neutralizes RCE Vector #2 and Vector #3.
  • Verified backward compatibility implications (intentional breaking change for insecure formats).
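
The intentional breaking change for insecure formats can be pictured as an artifact-selection step like the one below. This is a hypothetical sketch: the helper name `select_model_artifact`, its signature, and the choice of exceptions are assumptions, and only the `model.msgpack`, `.pkl`, and `.joblib` names come from the description.

```python
from typing import Iterable

SAFE_ARTIFACT = "model.msgpack"
BLOCKED_SUFFIXES = (".pkl", ".joblib")


def select_model_artifact(filenames: Iterable[str]) -> str:
    """Pick model.msgpack if present; refuse to fall back to pickle formats."""
    names = list(filenames)
    if SAFE_ARTIFACT in names:
        return SAFE_ARTIFACT
    # str.endswith accepts a tuple of suffixes.
    blocked = [n for n in names if n.lower().endswith(BLOCKED_SUFFIXES)]
    if blocked:
        raise RuntimeError(f"refusing to load insecure artifact(s): {blocked}")
    raise FileNotFoundError(f"{SAFE_ARTIFACT} not found")


chosen = select_model_artifact(["model.pkl", "model.msgpack"])
```

With this shape, a directory that contains only `model.pkl` or `model.joblib` fails loudly instead of being loaded, which is the compatibility break the checklist refers to.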

@JoshuaProvoste JoshuaProvoste requested a review from a team as a code owner April 14, 2026 17:35
@product-auto-label product-auto-label bot added the labels "size: l" (Pull request size is large.) and "api: vertex-ai" (Issues related to the googleapis/python-aiplatform API.) Apr 14, 2026