This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
tariff-everywhere is a Harmonized Tariff Schedule (HTS) lookup service built in Python. It downloads tariff classification data from the US International Trade Commission's public API, stores it in SQLite, and exposes it through two interfaces:
- **CLI** (`hts.py`) — terminal-based lookups for developers
- **MCP Server** (`mcp_server.py`) — Model Context Protocol tools for AI agents
All development runs in Docker. No Python, pip, or virtualenv required on the host.
Before beginning any work, build the Docker image and verify the environment:
```bash
# 1. Build the image
docker build -t hts-local .

# 2. Run the test suite (all tests should pass)
docker run --rm hts-local -m pytest tests/ -v

# 3. Smoke test the CLI
docker run --rm -v "$(pwd)/data:/app/data" hts-local hts.py --help
```

If `data/hts.db` does not exist yet, run the ingest first:

```bash
docker run --rm -v "$(pwd)/data:/app/data" hts-local scripts/ingest.py
```

After building, verify the entire system works end-to-end:
```bash
# Run all tests
docker run --rm hts-local -m pytest tests/ -v
# Expected: 114 passed, 5 skipped

# Ingest data (if data/hts.db doesn't exist)
docker run --rm -v "$(pwd)/data:/app/data" hts-local scripts/ingest.py
# Expected: Database created with ~134K entries across 99 chapters

# Verify CLI works
docker run --rm -v "$(pwd)/data:/app/data" hts-local hts.py chapters
# Expected: All 99 chapters listed with descriptions and entry counts

# Check refresh (should be fast if data is current)
docker run --rm -v "$(pwd)/data:/app/data" hts-local scripts/refresh.py
# Expected: "Already up to date" or new chapters ingested if data changed
```

Note on ingest idempotency: the ingest script skips duplicate HTS codes. When re-running after data already exists, you'll see `Loaded 0 entries` but also `Skipped 134019 duplicate HTS codes` — this is normal and expected. The script won't duplicate data on subsequent runs.
- Source: https://hts.usitc.gov/reststop/ (public government API, no auth required)
- Schema: three tables in `data/hts.db`:
  - `chapters` — HTS chapters (01-99), with descriptions, content hashes, and freshness timestamps (`last_checked_at`, `last_changed_at`)
  - `hts_entries` — ~134K tariff entries with rates, units, indent level, footnotes
  - `data_freshness` — records of each refresh run (timestamp, duration, chapters changed)
- Indexes: `hts_code` (exact lookups), `description` (substring search)
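A minimal in-memory sketch of the two documented lookup patterns. The table is trimmed to the two indexed columns; the real schema has more, and the index names here are assumptions for illustration:

```python
import sqlite3

# Reduced in-memory stand-in for data/hts.db: just the two indexed columns
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hts_entries (hts_code TEXT, description TEXT)")
db.execute("CREATE INDEX idx_hts_code ON hts_entries(hts_code)")
db.execute("CREATE INDEX idx_description ON hts_entries(description)")
db.execute("INSERT INTO hts_entries VALUES ('7408.11.30', 'Copper wire, refined')")

# Exact lookup -- served by the hts_code index
exact = db.execute(
    "SELECT description FROM hts_entries WHERE hts_code = ?", ("7408.11.30",)
).fetchone()

# Substring search -- LIKE with a leading wildcard scans rather than using the index
fuzzy = db.execute(
    "SELECT hts_code FROM hts_entries WHERE description LIKE ?", ("%copper%",)
).fetchall()
```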
| File | Purpose |
|---|---|
| `scripts/ingest.py` | Download HTS data from the API, iterate chapters 01-99, parse JSON, insert into SQLite |
| `scripts/refresh.py` | Detect HTS data changes by hashing all 99 chapters in parallel, re-ingest if changed, track per-chapter freshness |
| `hts.py` | CLI entrypoint (typer) with `search`, `code`, `chapter`, `chapters`, `info` commands |
| `mcp_server.py` | Expose five tools over MCP stdio: `search_hts`, `get_code`, `list_chapter`, `get_chapters`, `get_data_freshness` |
| `tariff_everywhere.py` | Public Python API with connection-managing wrappers for programmatic access |
- Database connections: each command opens/closes a connection in a try/finally block. No connection pooling is needed for CLI/MCP (low concurrency).
- Formatting: two helper functions in `hts.py` (`format_entry_as_dict`, `format_entry_for_table`) standardize output across CLI table views and JSON responses.
- JSON output: the CLI uses `print()` (not Rich's `console.print()`) for all JSON output to avoid ANSI control character injection. Rich is only used for table display.
- MCP tools: return JSON strings (not objects), matching MCP SDK conventions. Tool docstrings are exposed as help text to Claude.
- Revision detection: `scripts/refresh.py` hashes all 99 chapters in parallel (`ThreadPoolExecutor`) and compares against stored hashes in the `chapters` table. Since `/reststop/releases` returns 404, this content-hash approach is the alternative. Per-chapter `last_checked_at` and `last_changed_at` timestamps distinguish "we looked" from "it was different."
- Docker entrypoint is `python`: the Dockerfile uses `ENTRYPOINT ["python"]`, so all arguments passed to `docker run ... hts-local <args>` become arguments to `python`. Script paths (e.g., `scripts/ingest.py`) work directly, but installed CLI tools like `datasette` must be invoked with `-m` (e.g., `-m datasette`). This also means tools that shell out to external binaries (like `datasette publish fly` needing `flyctl`) won't work inside the container.
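The content-hash refresh described above can be sketched as follows. Here `fetch_chapter` is a hypothetical stand-in for the real USITC API call, and the stored hashes would actually come from the `chapters` table:

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the API call; the real script fetches each chapter's JSON
def fetch_chapter(num: int) -> list[dict]:
    return [{"htsno": f"{num:02d}01.00.00", "description": f"chapter {num} demo"}]

def chapter_hash(num: int) -> tuple[int, str]:
    # Canonicalize the payload before hashing so ordering changes don't
    # register as content changes
    payload = json.dumps(fetch_chapter(num), sort_keys=True)
    return num, hashlib.sha256(payload.encode()).hexdigest()

# Hash chapters in parallel, as refresh.py does for all 99
with ThreadPoolExecutor(max_workers=8) as pool:
    current = dict(pool.map(chapter_hash, range(1, 4)))

stored = dict(current)   # pretend nothing changed...
stored[2] = "deadbeef"   # ...except chapter 2

changed = [num for num, digest in current.items() if stored.get(num) != digest]
print(f"Chapters to re-ingest: {changed}")
```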
Build the image:

```bash
docker build -t hts-local .
```

Ingest data (if `data/hts.db` doesn't exist):

```bash
docker run --rm -v "$(pwd)/data:/app/data" hts-local scripts/ingest.py
```

CLI usage (after ingest):

```bash
docker run --rm -v "$(pwd)/data:/app/data" hts-local hts.py search "copper wire"
docker run --rm -v "$(pwd)/data:/app/data" hts-local hts.py code 7408.11
docker run --rm -v "$(pwd)/data:/app/data" hts-local hts.py chapter 74
docker run --rm -v "$(pwd)/data:/app/data" hts-local hts.py info
docker run --rm -v "$(pwd)/data:/app/data" hts-local hts.py info --chapter 74
docker run --rm -v "$(pwd)/data:/app/data" hts-local hts.py --help
```

Refresh data (check for updates and re-ingest if changed):

```bash
docker run --rm -v "$(pwd)/data:/app/data" hts-local scripts/refresh.py
```

MCP server (stdio, for Claude Desktop integration):

```bash
docker run --rm -i -v "$(pwd)/data:/app/data" hts-local mcp_server.py
```

Run the tests:

```bash
docker run --rm hts-local -m pytest tests/ -v
```

The test suite covers CLI commands, MCP server tools, and edge cases using an in-memory SQLite fixture. No real database or API access is needed.
If you have Python 3.12+ installed:

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Now test directly
python hts.py search "titanium"
python hts.py code 0101.21.00

# Or verify the DB directly
python -c "import sqlite3; db = sqlite3.connect('data/hts.db'); print(db.execute('SELECT COUNT(*) FROM hts_entries').fetchone()[0])"
```

- Add a `@app.command()` function in `hts.py`
- Use `typer.Argument()` for positional args, `typer.Option()` for flags
- Follow the pattern: connect to DB → execute query → format output (JSON or table) → close DB
- For `--json` output, use `print()` — never `console.print()` (Rich injects control characters)
- Update the table schema in both `format_entry_for_table` and `format_entry_as_dict` if querying new columns
- Add corresponding tests in `tests/test_cli.py`
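The connect → query → format → close pattern can be sketched like this. The `@app.command()` decorator and typer parameters are omitted so the sketch stays dependency-free, and the demo database and its contents are placeholders:

```python
import json
import os
import sqlite3
import tempfile

# Demo database standing in for data/hts.db (schema trimmed to two columns)
fd, db_path = tempfile.mkstemp(suffix=".db")
os.close(fd)
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE hts_entries (hts_code TEXT, description TEXT)")
con.execute("INSERT INTO hts_entries VALUES ('0101.21.00', 'Purebred breeding horses')")
con.commit()
con.close()

# Body of a command: connect -> query -> format -> close. In hts.py this
# would carry @app.command() and take typer.Argument()/typer.Option() params.
def code(hts_code: str) -> str:
    db = sqlite3.connect(db_path)
    try:
        row = db.execute(
            "SELECT hts_code, description FROM hts_entries WHERE hts_code = ?",
            (hts_code,),
        ).fetchone()
    finally:
        db.close()
    return json.dumps({"hts_code": row[0], "description": row[1]})

# Plain print() for JSON output -- never Rich's console.print()
print(code("0101.21.00"))
```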
- Add a `@mcp.tool()` function in `mcp_server.py`
- The docstring becomes the tool description (shown to Claude)
- Always return a JSON string (`json.dumps()`)
- Follow the DB pattern: open → execute → format → close
- Handle errors gracefully (return a JSON error object, don't raise)
- Add corresponding tests in `tests/test_mcp.py`
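A sketch of the error-handling shape for a tool body (the `get_code` name and columns here are illustrative; in `mcp_server.py` the function would carry `@mcp.tool()` and a docstring shown to Claude):

```python
import json
import sqlite3

# An MCP tool body: always return a JSON string, and fold failures
# into a JSON error object instead of raising
def get_code(hts_code: str, db_path: str = "data/hts.db") -> str:
    try:
        db = sqlite3.connect(db_path)
        try:
            row = db.execute(
                "SELECT hts_code, description FROM hts_entries WHERE hts_code = ?",
                (hts_code,),
            ).fetchone()
        finally:
            db.close()
        if row is None:
            return json.dumps({"error": f"HTS code {hts_code} not found"})
        return json.dumps({"hts_code": row[0], "description": row[1]})
    except sqlite3.Error as exc:
        return json.dumps({"error": str(exc)})

# A broken path comes back as a JSON error object, not an exception
print(get_code("0101.21.00", "no_such_dir/hts.db"))
```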
- Edit `tariff_everywhere.py` to expose connection-managing functions for external callers
- Reuse `hts_core.py` for SQL queries and row-to-dict conversion instead of duplicating query logic
- Keep return values JSON-serializable dictionaries/lists so callers can export them directly
- Raise `FileNotFoundError` when the SQLite database is missing; return `None` or `[]` for not-found lookups
- Add/update tests in `tests/test_python_api.py`
- Update `docs/PYTHON_API.md` and the README link when the public surface changes
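A hypothetical wrapper following these conventions (the `lookup_code` name and two-column query are assumptions for illustration):

```python
import os
import sqlite3

# Public-API shape: FileNotFoundError for a missing database, None for
# an unknown code, a plain JSON-serializable dict otherwise
def lookup_code(hts_code: str, db_path: str = "data/hts.db"):
    if not os.path.exists(db_path):
        raise FileNotFoundError(f"{db_path} not found; run scripts/ingest.py first")
    db = sqlite3.connect(db_path)
    try:
        row = db.execute(
            "SELECT hts_code, description FROM hts_entries WHERE hts_code = ?",
            (hts_code,),
        ).fetchone()
    finally:
        db.close()
    return {"hts_code": row[0], "description": row[1]} if row else None

# A missing database raises rather than returning an error value
try:
    lookup_code("0101.21.00", "definitely_missing.db")
except FileNotFoundError as exc:
    print(exc)
```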
- Edit `create_schema()` in `scripts/ingest.py`
- Re-run ingest to rebuild: `docker run --rm -v "$(pwd)/data:/app/data" hts-local scripts/ingest.py` (will recreate tables)
- Update column references in formatting functions if needed
```bash
# Count total entries
docker run --rm -v "$(pwd)/data:/app/data" hts-local -c "
import sqlite3
db = sqlite3.connect('data/hts.db')
result = db.execute('SELECT COUNT(*) FROM hts_entries').fetchone()[0]
print(f'Total entries: {result}')
"

# Quick smoke test
docker run --rm -v "$(pwd)/data:/app/data" hts-local hts.py code 0101.21.00
```

```
GET https://hts.usitc.gov/reststop/search?keyword=chapter%20XX&limit=5000
```
- No authentication required (public endpoint)
- Returns flat JSON array of entries
- All 99 chapters can be fetched in parallel (~15-20s for full ingest)
- The original plan endpoint (`/exportSections?format=JSON`) is no longer operational
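For illustration, the per-chapter search URL can be assembled with the standard library (assuming `keyword` and `limit` are the only parameters the ingest needs, and that chapter keywords are zero-padded as in the endpoint above):

```python
from urllib.parse import quote, urlencode

BASE = "https://hts.usitc.gov/reststop/search"

def chapter_url(chapter: int, limit: int = 5000) -> str:
    # quote_via=quote encodes the space as %20 (urlencode's default is '+')
    query = urlencode({"keyword": f"chapter {chapter:02d}", "limit": limit}, quote_via=quote)
    return f"{BASE}?{query}"

print(chapter_url(7))
```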
Each `hts_entries` row contains:

- `hts_code` — tariff code (e.g., "7408.11.30")
- `description` — product description
- `indent` — hierarchy level (0 = chapter, 1 = heading, etc.)
- `unit` — measurement unit (e.g., "kg", "liters")
- `general_rate`, `special_rate`, `column2_rate` — duty rates (strings like "5%", "Free")
- `footnotes` — JSON string of footnote objects (may be an empty string)
- `chapter_id` — foreign key to the `chapters` table

Note: the CLI SELECT queries omit `footnotes` — the `format_entry_as_dict` column list has 9 columns. The MCP server queries 6 columns (no `id`, `indent`, `footnotes`, `chapter_id`).
The MCP server runs locally via Docker with stdio transport — it does not expose a network port and is not deployable to a remote service. This approach keeps the tariff database private and avoids cloud infrastructure.
To use the HTS tools in Claude Desktop:
```json
{
  "mcpServers": {
    "hts": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-v", "/absolute/path/to/tariff-everywhere/data:/app/data",
        "hts-local",
        "mcp_server.py"
      ]
    }
  }
}
```

The server uses stdio transport (no port exposed). Claude Desktop spawns the container, communicates over stdin/stdout, and the container exits cleanly when the session ends.
Why local-only? Remote HTTP deployment was explored but not pursued. Stdio transport over Docker is simpler, more secure (data never leaves your machine), and eliminates infrastructure overhead.
Live at: https://tariff-everywhere.fly.dev/
The database is published as a public Datasette instance for browsable web access.
- `metadata.json` — Datasette configuration (titles, facets, label columns, descriptions)
- `scripts/build_fts.py` — builds the FTS5 full-text search index using `sqlite-utils` (critical: must use sqlite-utils, not manual SQL, for Datasette to detect FTS)
- `scripts/chapter_titles.py` — enriches `chapters.description` with real HTS titles ("Live Animals" instead of "Chapter 01")
- `requirements.txt` — includes datasette, datasette-search-all, datasette-render-html, datasette-publish-fly, sqlite-utils

FTS5 detection: Datasette only auto-detects FTS5 tables created by sqlite-utils. Manual SQL creation (with a `content_rowid=` parameter) breaks Datasette's search. Always use:
```python
import sqlite_utils

db = sqlite_utils.Database("data/hts.db")
db["hts_entries"].enable_fts(["description"], fts_version="fts5")
```

Typer compatibility: Typer 0.15.x breaks with click 8.3+ (a signature change in `Parameter.make_metavar()`). Pin `typer~=0.24.0` for click 8.3+ compatibility.
Chapter UX: the `chapters` table now uses `label_column: "description"` (real titles like "Copper and Articles Thereof") instead of just chapter numbers. A `browse_chapters` SQL view shows entry counts per chapter for easier navigation.
HTML rendering: 1,535 entries contain `<i>` tags for scientific names. The `datasette-render-html` plugin renders these correctly; without it, raw `<i>` text appears.
CI (recommended): the `deploy-datasette.yml` workflow handles the full pipeline — ingest, chapter title enrichment, FTS rebuild, and deploy. Trigger it manually via `workflow_dispatch`. Data prep steps run inside Docker (volume-mounted to the runner), but the deploy step runs directly on the runner because `datasette publish fly` requires `flyctl`, which isn't in the Docker image.
Manual deployment:
```bash
# 1. Update chapter titles and rebuild FTS
python3 scripts/chapter_titles.py data/hts.db
python3 -m sqlite_utils enable-fts data/hts.db hts_entries description --fts5 --replace

# 2. Deploy (requires flyctl auth login)
datasette publish fly data/hts.db \
  --app="tariff-everywhere" \
  --metadata metadata.json \
  --install=datasette-search-all \
  --install=datasette-render-html \
  --setting default_page_size 50
```

The publish command handles the rest: image build (~52 MB), two machines provisioned on the Fly.io free tier, zero cost.
- MCP server is local-only — No remote HTTP deployment. The MCP server only runs locally via Docker (stdio transport). This is by design: keeps data private, reduces infrastructure, and simplifies setup. If remote MCP access is needed, run Claude on the same machine as the Docker container.
- Single-threaded CLI — no parallel queries; acceptable for interactive lookups
- No pagination in CLI search — hardcoded limit of 10 results; use the `--limit` flag to increase it
- Revision detection is content-hash based — `scripts/refresh.py` hashes all 99 chapters to detect changes, but cannot distinguish USITC revision numbers (the API provides none)
- `format_entry_as_dict` column mapping — uses a positional `zip` against hardcoded column names; fragile if the SELECT changes. Consider `cursor.description` or named tuples.
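For the last point, a `cursor.description`-based mapping keeps the dict keys in sync with whatever the SELECT returns (a sketch; `rows_to_dicts` is a hypothetical helper name):

```python
import sqlite3

# Build dicts from cursor.description instead of a hardcoded column list,
# so the mapping tracks the SELECT automatically
def rows_to_dicts(cursor: sqlite3.Cursor) -> list[dict]:
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hts_entries (hts_code TEXT, description TEXT)")
db.execute("INSERT INTO hts_entries VALUES ('7408.11.30', 'Copper wire')")
cur = db.execute("SELECT hts_code, description FROM hts_entries")
entries = rows_to_dicts(cur)
```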
Database locked error:

- Likely a stale connection. Ensure all CLI commands close the DB in a `finally` block.
- If a Docker container hangs, run `docker ps` to find the container ID, then `docker kill <id>`.

"hts.db not found" error:

- Run the ingest script first to populate `data/hts.db`.
MCP server not starting:

- Check that the data volume is mounted and readable. Because the image's entrypoint is `python`, override it to run `ls`: `docker run --rm --entrypoint ls -v "$(pwd)/data:/app/data" hts-local -la /app/data/`

Slow searches:

- Add missing indexes if querying new columns; see `create_schema()` in `scripts/ingest.py`.