17 changes: 16 additions & 1 deletion extensions/EXTENSION-USER-GUIDE.md
@@ -421,7 +421,7 @@ In addition to extension-specific environment variables (`SPECKIT_{EXT_ID}_*`),
| Variable | Description | Default |
|----------|-------------|---------|
| `SPECKIT_CATALOG_URL` | Override the full catalog stack with a single URL (backward compat) | Built-in default stack |
| `GH_TOKEN` / `GITHUB_TOKEN` | GitHub API token for downloads | None |
| `GH_TOKEN` / `GITHUB_TOKEN` | GitHub token for authenticated requests to GitHub-hosted URLs (`raw.githubusercontent.com`, `github.com`, `api.github.com`). Required when your catalog JSON or extension ZIPs are hosted in a private GitHub repository. | None |
Copilot AI Apr 13, 2026

Docs enumerate GitHub hosts as raw.githubusercontent.com, github.com, and api.github.com, but the implementation also treats codeload.github.com as GitHub-owned (and tests rely on it). Please either include codeload.github.com in this list or adjust the wording so the parenthetical isn’t interpreted as exhaustive.

Suggested change
- | `GH_TOKEN` / `GITHUB_TOKEN` | GitHub token for authenticated requests to GitHub-hosted URLs (`raw.githubusercontent.com`, `github.com`, `api.github.com`). Required when your catalog JSON or extension ZIPs are hosted in a private GitHub repository. | None |
+ | `GH_TOKEN` / `GITHUB_TOKEN` | GitHub token for authenticated requests to GitHub-hosted URLs (`raw.githubusercontent.com`, `github.com`, `api.github.com`, `codeload.github.com`). Required when your catalog JSON or extension ZIPs are hosted in a private GitHub repository. | None |

#### Example: Using a custom catalog for testing

@@ -433,6 +433,21 @@ export SPECKIT_CATALOG_URL="http://localhost:8000/catalog.json"
export SPECKIT_CATALOG_URL="https://example.com/staging/catalog.json"
```

#### Example: Using a private GitHub-hosted catalog

```bash
# Authenticate with a token (gh CLI, PAT, or GITHUB_TOKEN in CI)
export GITHUB_TOKEN=$(gh auth token)

# Search a private catalog added via `specify extension catalog add`
specify extension search jira

# Install from a private catalog
specify extension add jira-sync
```

The token is attached automatically to requests targeting GitHub domains. Non-GitHub catalog URLs are always fetched without credentials.
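
The host check behind that sentence can be sketched as follows. This is a minimal illustration rather than the shipped code: `would_send_token` is a hypothetical helper name, and the host set is copied from `GITHUB_HOSTS` in `src/specify_cli/_github_http.py` as added in this PR. It shows that the match is on the exact hostname, so look-alike domains and GitHub-related hosts outside the set receive no credentials.

```python
from urllib.parse import urlparse

# Copied from GITHUB_HOSTS in src/specify_cli/_github_http.py (this PR).
GITHUB_HOSTS = frozenset({
    "raw.githubusercontent.com",
    "github.com",
    "api.github.com",
    "codeload.github.com",
})


def would_send_token(url: str) -> bool:
    """Hypothetical helper: True when the URL's hostname is in GITHUB_HOSTS."""
    # urlparse().hostname drops any port and is None for strings without a host.
    hostname = (urlparse(url).hostname or "").lower()
    return hostname in GITHUB_HOSTS


if __name__ == "__main__":
    for url in (
        "https://raw.githubusercontent.com/org/repo/main/catalog.json",
        "https://github.com.evil.example/catalog.json",
        "https://example.com/staging/catalog.json",
    ):
        print(would_send_token(url), url)
```

Because the comparison is an exact set membership test, a catalog hosted on any other domain (including a GitHub Enterprise hostname) would be fetched without the token under this logic.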

---

## Extension Catalogs
22 changes: 19 additions & 3 deletions presets/README.md
@@ -93,9 +93,25 @@ See [scaffold/](scaffold/) for a scaffold you can copy to create your own preset

## Environment Variables

| Variable | Description |
|----------|-------------|
| `SPECKIT_PRESET_CATALOG_URL` | Override the catalog URL (replaces all defaults) |
| Variable | Description | Default |
|----------|-------------|---------|
| `SPECKIT_PRESET_CATALOG_URL` | Override the full catalog stack with a single URL (replaces all defaults) | Built-in default stack |
| `GH_TOKEN` / `GITHUB_TOKEN` | GitHub token for authenticated requests to GitHub-hosted URLs (`raw.githubusercontent.com`, `github.com`, `api.github.com`). Required when your catalog JSON or preset ZIPs are hosted in a private GitHub repository. | None |
Copilot AI Apr 13, 2026

Docs list GitHub-hosted URLs as raw.githubusercontent.com, github.com, and api.github.com, but the code also supports codeload.github.com (GitHub archive redirect target). Please add codeload.github.com here or rephrase to avoid implying the list is complete.

Suggested change
- | `GH_TOKEN` / `GITHUB_TOKEN` | GitHub token for authenticated requests to GitHub-hosted URLs (`raw.githubusercontent.com`, `github.com`, `api.github.com`). Required when your catalog JSON or preset ZIPs are hosted in a private GitHub repository. | None |
+ | `GH_TOKEN` / `GITHUB_TOKEN` | GitHub token for authenticated requests to GitHub-hosted URLs (`raw.githubusercontent.com`, `github.com`, `api.github.com`, `codeload.github.com`). Required when your catalog JSON or preset ZIPs are hosted in a private GitHub repository. | None |

#### Example: Using a private GitHub-hosted catalog

```bash
# Authenticate with a token (gh CLI, PAT, or GITHUB_TOKEN in CI)
export GITHUB_TOKEN=$(gh auth token)

# Search a private catalog added via `specify preset catalog add`
specify preset search my-template

# Install from a private catalog
specify preset add my-template
```

The token is attached automatically to requests targeting GitHub domains. Non-GitHub catalog URLs are always fetched without credentials.
Comment on lines +96 to +114
Copilot AI Apr 9, 2026

The PR title/description focus on ExtensionCatalog, but this change also introduces the same GitHub-token behavior for PresetCatalog and documents it here. Please update the PR title and/or description to reflect that presets are included as well, so reviewers and release notes capture the full scope.

## Configuration Files

75 changes: 75 additions & 0 deletions src/specify_cli/_github_http.py
@@ -0,0 +1,75 @@
"""Shared GitHub-authenticated HTTP helpers.

Used by both ExtensionCatalog and PresetCatalog to attach
GITHUB_TOKEN / GH_TOKEN credentials to requests targeting
GitHub-hosted domains, while preventing token leakage to
third-party hosts on redirects.
"""

import os
import urllib.request
from urllib.parse import urlparse
from typing import Dict

# GitHub-owned hostnames that should receive the Authorization header.
# Includes codeload.github.com because GitHub archive URL downloads
# (e.g. /archive/refs/tags/<tag>.zip) redirect there and require auth
# for private repositories.
GITHUB_HOSTS = frozenset({
    "raw.githubusercontent.com",
    "github.com",
    "api.github.com",
    "codeload.github.com",
})


def build_github_request(url: str) -> urllib.request.Request:
    """Build a urllib Request, adding a GitHub auth header when available.

    Reads GITHUB_TOKEN or GH_TOKEN from the environment and attaches an
    ``Authorization: token <value>`` header when the target hostname is one
    of the known GitHub-owned domains. Non-GitHub URLs are returned as plain
    requests so credentials are never leaked to third-party hosts.
    """
    headers: Dict[str, str] = {}
    github_token = (os.environ.get("GITHUB_TOKEN") or "").strip()
    gh_token = (os.environ.get("GH_TOKEN") or "").strip()
    token = github_token or gh_token or None
    hostname = (urlparse(url).hostname or "").lower()
    if token and hostname in GITHUB_HOSTS:
        headers["Authorization"] = f"token {token}"
    return urllib.request.Request(url, headers=headers)
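
The token lookup above has two properties that are easy to miss: `GITHUB_TOKEN` wins when both variables are set, and a value that is empty or only whitespace counts as unset. A minimal sketch, assuming a hypothetical `resolve_token` helper that takes the environment as a mapping so it can be exercised without touching `os.environ`:

```python
from typing import Mapping, Optional


def resolve_token(env: Mapping[str, str]) -> Optional[str]:
    """Hypothetical helper mirroring the lookup in build_github_request."""
    # Strip whitespace so a blank or newline-padded value is handled cleanly.
    github_token = (env.get("GITHUB_TOKEN") or "").strip()
    gh_token = (env.get("GH_TOKEN") or "").strip()
    # GITHUB_TOKEN takes precedence; fall back to GH_TOKEN, else no token.
    return github_token or gh_token or None


if __name__ == "__main__":
    print(resolve_token({"GITHUB_TOKEN": "a", "GH_TOKEN": "b"}))
    print(resolve_token({"GITHUB_TOKEN": "   ", "GH_TOKEN": "b"}))
    print(resolve_token({}))
```

Passing the mapping explicitly is only a convenience for the sketch; the function in this diff reads `os.environ` directly.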


class _StripAuthOnRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that drops the Authorization header when leaving GitHub.

    Prevents token leakage to CDNs or other third-party hosts that GitHub
    may redirect to (e.g. S3 for release asset downloads, objects.githubusercontent.com).
    Auth is preserved as long as the redirect target remains within GITHUB_HOSTS.
    """

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is not None:
            hostname = (urlparse(newurl).hostname or "").lower()
            if hostname not in GITHUB_HOSTS:
                new_req.headers.pop("Authorization", None)
Comment on lines +53 to +57
Copilot AI Apr 13, 2026

_StripAuthOnRedirect.redirect_request() calls super().redirect_request() first; urllib’s default redirect implementation strips the Authorization header on cross-host redirects. That means redirects between allowed GitHub hosts (e.g. github.com → codeload.github.com) will lose the token, contradicting the docstring and likely breaking private repo archive downloads. Consider capturing the original Authorization header, calling super(), then re-attaching it when the redirect target hostname is still in GITHUB_HOSTS (and explicitly removing it otherwise). Adding a focused unit test for the github.com → codeload.github.com redirect case would prevent regressions.

Suggested change
-         new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
-         if new_req is not None:
-             hostname = (urlparse(newurl).hostname or "").lower()
-             if hostname not in GITHUB_HOSTS:
-                 new_req.headers.pop("Authorization", None)
+         original_auth = req.get_header("Authorization")
+         new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
+         if new_req is not None:
+             hostname = (urlparse(newurl).hostname or "").lower()
+             if hostname in GITHUB_HOSTS:
+                 if original_auth:
+                     new_req.add_unredirected_header("Authorization", original_auth)
+             else:
+                 new_req.headers.pop("Authorization", None)
+                 new_req.unredirected_hdrs.pop("Authorization", None)
        return new_req
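
The focused test requested in the review comment can be written without network access by calling `redirect_request` directly. The sketch below repeats the handler from this diff under the name `StripAuthOnRedirect` so that it is self-contained; `auth_after_redirect`, the repository path, and the token value are illustrative assumptions. It also makes the reviewer's premise checkable: as far as I can tell from CPython's `urllib.request` source, the stock `HTTPRedirectHandler.redirect_request` copies the original request's headers (other than `Content-Length` and `Content-Type`) onto the redirected request, in which case the header survives a github.com → codeload.github.com hop with the handler as written.

```python
import urllib.request
from typing import Optional
from urllib.parse import urlparse

# Copied from GITHUB_HOSTS in this diff.
GITHUB_HOSTS = frozenset({
    "raw.githubusercontent.com",
    "github.com",
    "api.github.com",
    "codeload.github.com",
})


class StripAuthOnRedirect(urllib.request.HTTPRedirectHandler):
    """Copy of the handler in this diff, repeated to keep the sketch standalone."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is not None:
            hostname = (urlparse(newurl).hostname or "").lower()
            if hostname not in GITHUB_HOSTS:
                new_req.headers.pop("Authorization", None)
        return new_req


def auth_after_redirect(newurl: str) -> Optional[str]:
    """Hypothetical helper: simulate a 302 from github.com to newurl.

    Returns the Authorization header carried by the redirected request,
    or None when the header was dropped.
    """
    req = urllib.request.Request(
        "https://github.com/org/repo/archive/refs/tags/v1.0.0.zip",
        headers={"Authorization": "token example-token"},
    )
    # fp is only used when the handler raises HTTPError, so None is enough here.
    new_req = StripAuthOnRedirect().redirect_request(
        req, None, 302, "Found", {}, newurl
    )
    return new_req.get_header("Authorization")


if __name__ == "__main__":
    # Redirect that stays within GITHUB_HOSTS.
    print(auth_after_redirect("https://codeload.github.com/org/repo/zip/refs/tags/v1.0.0"))
    # Redirect that leaves GITHUB_HOSTS.
    print(auth_after_redirect("https://objects.githubusercontent.com/some-asset"))
```

If the first call were to return `None` on a given interpreter, the re-attach approach in the suggested change would be needed; otherwise the original handler already behaves as its docstring describes.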


def open_github_url(url: str, timeout: int = 10):
    """Open a URL with GitHub auth, stripping the header on cross-host redirects.

    When the request carries an Authorization header, a custom redirect
    handler drops that header if the redirect target is not a GitHub-owned
    domain, preventing token leakage to CDNs or other third-party hosts
    that GitHub may redirect to (e.g. S3 for release asset downloads).
    """
    req = build_github_request(url)

    if not req.get_header("Authorization"):
        return urllib.request.urlopen(req, timeout=timeout)

    opener = urllib.request.build_opener(_StripAuthOnRedirect)
    return opener.open(req, timeout=timeout)
25 changes: 19 additions & 6 deletions src/specify_cli/extensions.py
@@ -1411,6 +1411,22 @@ def _validate_catalog_url(self, url: str) -> None:
        if not parsed.netloc:
            raise ValidationError("Catalog URL must be a valid URL with a host.")

    def _make_request(self, url: str):
        """Build a urllib Request, adding a GitHub auth header when available.

        Delegates to :func:`specify_cli._github_http.build_github_request`.
        """
        from specify_cli._github_http import build_github_request
        return build_github_request(url)

    def _open_url(self, url: str, timeout: int = 10):
        """Open a URL with GitHub auth, stripping the header on cross-host redirects.

        Delegates to :func:`specify_cli._github_http.open_github_url`.
        """
        from specify_cli._github_http import open_github_url
        return open_github_url(url, timeout)

    def _load_catalog_config(self, config_path: Path) -> Optional[List[CatalogEntry]]:
        """Load catalog stack configuration from a YAML file.

@@ -1567,7 +1583,6 @@ def _fetch_single_catalog(self, entry: CatalogEntry, force_refresh: bool = False
Raises:
ExtensionError: If catalog cannot be fetched or has invalid format
"""
import urllib.request
import urllib.error

# Determine cache file paths (backward compat for default catalog)
@@ -1601,7 +1616,7 @@ def _fetch_single_catalog(self, entry: CatalogEntry, force_refresh: bool = False

# Fetch from network
try:
with urllib.request.urlopen(entry.url, timeout=10) as response:
with self._open_url(entry.url, timeout=10) as response:
catalog_data = json.loads(response.read())

if "schema_version" not in catalog_data or "extensions" not in catalog_data:
@@ -1715,10 +1730,9 @@ def fetch_catalog(self, force_refresh: bool = False) -> Dict[str, Any]:
catalog_url = self.get_catalog_url()

try:
import urllib.request
import urllib.error

with urllib.request.urlopen(catalog_url, timeout=10) as response:
with self._open_url(catalog_url, timeout=10) as response:
catalog_data = json.loads(response.read())

# Validate catalog structure
@@ -1829,7 +1843,6 @@ def download_extension(self, extension_id: str, target_dir: Optional[Path] = Non
Raises:
ExtensionError: If extension not found or download fails
"""
import urllib.request
import urllib.error

# Get extension info from catalog
@@ -1861,7 +1874,7 @@ def download_extension(self, extension_id: str, target_dir: Optional[Path] = Non

# Download the ZIP file
try:
with urllib.request.urlopen(download_url, timeout=60) as response:
with self._open_url(download_url, timeout=60) as response:
zip_data = response.read()

zip_path.write_bytes(zip_data)
29 changes: 19 additions & 10 deletions src/specify_cli/presets.py
@@ -1178,6 +1178,22 @@ def _validate_catalog_url(self, url: str) -> None:
                "Catalog URL must be a valid URL with a host."
            )

    def _make_request(self, url: str):
        """Build a urllib Request, adding a GitHub auth header when available.

        Delegates to :func:`specify_cli._github_http.build_github_request`.
        """
        from specify_cli._github_http import build_github_request
        return build_github_request(url)

    def _open_url(self, url: str, timeout: int = 10):
        """Open a URL with GitHub auth, stripping the header on cross-host redirects.

        Delegates to :func:`specify_cli._github_http.open_github_url`.
        """
        from specify_cli._github_http import open_github_url
        return open_github_url(url, timeout)

    def _load_catalog_config(self, config_path: Path) -> Optional[List[PresetCatalogEntry]]:
        """Load catalog stack configuration from a YAML file.

@@ -1360,10 +1376,7 @@ def _fetch_single_catalog(self, entry: PresetCatalogEntry, force_refresh: bool =
pass

try:
import urllib.request
import urllib.error

with urllib.request.urlopen(entry.url, timeout=10) as response:
with self._open_url(entry.url, timeout=10) as response:
catalog_data = json.loads(response.read())

if (
@@ -1456,10 +1469,7 @@ def fetch_catalog(self, force_refresh: bool = False) -> Dict[str, Any]:
pass

try:
import urllib.request
import urllib.error

with urllib.request.urlopen(catalog_url, timeout=10) as response:
with self._open_url(catalog_url, timeout=10) as response:
catalog_data = json.loads(response.read())

if (
@@ -1578,7 +1588,6 @@ def download_pack(
Raises:
PresetError: If pack not found or download fails
"""
import urllib.request
import urllib.error

pack_info = self.get_pack_info(pack_id)
@@ -1620,7 +1629,7 @@ def download_pack(
zip_path = target_dir / zip_filename

try:
with urllib.request.urlopen(download_url, timeout=60) as response:
with self._open_url(download_url, timeout=60) as response:
zip_data = response.read()

zip_path.write_bytes(zip_data)