sid 8e550b9785 Local fork: hardening + ops improvements (timeout knob, demotion, /livez, drain)

This commit captures both the prior accumulated work-in-progress
(framework migration web/→svelte/, postgres storage, conversation
viewer, dashboard auth, OpenAPI spec, integration tests) AND today's
operational improvements layered on top. History wasn't checkpointed
incrementally; happy to split it via interactive rebase if a reviewer
wants smaller commits.

Today's changes (in addition to the older WIP):

1. Configurable upstream response-header timeout
   - ANTHROPIC_RESPONSE_HEADER_TIMEOUT env (default 300s)
   - Replaces hardcoded 300s in provider/anthropic.go that was firing
     on opus + 1M-context + extended thinking non-streaming requests
   - Files: internal/config/config.go, internal/provider/anthropic.go

2. Structured forward-error diagnostic logging
   - When a forward to Anthropic fails, log a single key=value line
     with request_id, model, stream, body_bytes, has_thinking,
     anthropic_beta, query, elapsed, ctx_err — alongside the existing
     human-readable error line for back-compat
   - Files: internal/handler/handlers.go (logForwardFailure)

3. Full SSE protocol passthrough + Flusher fix
   - handler/handlers.go: forward all SSE lines verbatim (event:, id:,
     retry:, : comments, blank-line terminators), not only data:.
     Previous code produced malformed SSE for strict parsers.
   - middleware/logging.go: explicit Flush() method on responseWriter.
     Embedding http.ResponseWriter (interface) does not auto-promote
     Flush(), so every w.(http.Flusher) check in the streaming
     handler was returning ok=false and SSE writes buffered in net/http
     until the body closed.

4. Non-streaming → streaming demotion (feature-flagged)
   - ANTHROPIC_DEMOTE_NONSTREAMING env (default false)
   - When enabled and the routed provider is anthropic, force stream=true
     upstream for clients that asked for stream=false. Receive SSE,
     accumulate via accumulateSSEToMessage (handles text, tool_use with
     partial_json reassembly, thinking, signature, citations_delta,
     usage merge), and synthesize a single non-streaming JSON response.
   - Eliminates the ResponseHeaderTimeout class of failure entirely.
   - Body rewrite uses json.Decoder + UseNumber() to preserve integer
     precision in unknown nested fields (tool inputs from prior turns).
   - Files: internal/config/config.go, internal/handler/handlers.go,
     cmd/proxy/main.go, cmd/proxy/main_test.go

5. Live operational state: /livez gauge + graceful drain
   - New internal/runtime package: atomic in-flight counter + draining flag
   - New middleware/inflight.go: increments runtime gauge, applied to
     /v1/* subrouter so Messages, ChatCompletions, and ProxyPassthrough
     are all counted
   - /v1/* moved to a gorilla/mux subrouter so the InFlight middleware
     applies surgically; /health, /livez, /openapi.* remain on parent
     router (unauthenticated, uncounted)
   - Health handler returns 503 draining when runtime.IsDraining() is
     true, so Traefik stops routing to a slot before drain begins
   - New /livez handler returns {status, in_flight, draining, timestamp}
   - SIGTERM handler in main.go: SetDraining(true), poll for in_flight==0
     with 32-min ceiling and 1s tick (logs every 10s), then srv.Shutdown
   - Auth bypass list extended with /livez
   - Files: internal/runtime/runtime.go (new),
     internal/middleware/inflight.go (new),
     internal/middleware/auth.go,
     internal/handler/handlers.go (Health, Livez, runtime import),
     cmd/proxy/main.go (subrouter, drain loop)

6. OpenAPI spec updates
   - Document Health 503 response and new DrainingResponse schema
   - Add /livez path with LivezResponse schema
   - Files: internal/handler/openapi.go
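
The Flusher problem described in change 3 can be reproduced in isolation. The sketch below is illustrative only: plainWrapper, flushingWrapper, canFlush and demo are hypothetical names, not the code in middleware/logging.go. It assumes the wrapper embeds the http.ResponseWriter interface, as the commit describes.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// plainWrapper embeds the http.ResponseWriter interface only, so just
// Header, Write, and WriteHeader are promoted; Flush is not.
type plainWrapper struct {
	http.ResponseWriter
}

// flushingWrapper adds an explicit Flush that delegates to the wrapped
// writer when that writer supports flushing.
type flushingWrapper struct {
	http.ResponseWriter
}

func (w *flushingWrapper) Flush() {
	if f, ok := w.ResponseWriter.(http.Flusher); ok {
		f.Flush()
	}
}

// canFlush reports whether a handler's w.(http.Flusher) check would succeed.
func canFlush(w http.ResponseWriter) bool {
	_, ok := w.(http.Flusher)
	return ok
}

// demo wraps a recorder (which implements http.Flusher) both ways and
// reports the assertion results plus whether Flush reached the recorder.
func demo() (plainOK, flushingOK, flushed bool) {
	rec := httptest.NewRecorder()
	plainOK = canFlush(&plainWrapper{ResponseWriter: rec})
	fw := &flushingWrapper{ResponseWriter: rec}
	flushingOK = canFlush(fw)
	fw.Flush()
	return plainOK, flushingOK, rec.Flushed
}

func main() {
	plainOK, flushingOK, flushed := demo()
	fmt.Println(plainOK, flushingOK, flushed)
}
```

The underlying recorder supports flushing in both cases; only the wrapper with the explicit method passes the type assertion.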

Verified: go build ./... clean, go test ./... all pass, go vet clean.
Three rounds of codex peer review across changes 1-5; all feedback
addressed (citations_delta, json.Number precision, drain-loop logging
via lastLog timestamp, PathPrefix tightened to "/v1/").
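
The json.Number precision point from change 4 can be illustrated with a minimal sketch. forceStream is a hypothetical helper, not the handler's actual rewrite function, and the tool_input field is made up for the example; only the json.Decoder + UseNumber() technique comes from the commit.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// forceStream decodes a request body with UseNumber so numeric values stay
// as json.Number, sets "stream" to true, and re-encodes the result.
func forceStream(body []byte) ([]byte, error) {
	dec := json.NewDecoder(bytes.NewReader(body))
	dec.UseNumber() // without this, numbers decode to float64 and large integers lose precision
	var payload map[string]any
	if err := dec.Decode(&payload); err != nil {
		return nil, err
	}
	payload["stream"] = true
	return json.Marshal(payload)
}

func main() {
	// 9007199254740993 is 2^53+1, which float64 cannot represent exactly.
	in := []byte(`{"stream":false,"tool_input":{"id":9007199254740993}}`)
	out, err := forceStream(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```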

2026-05-02 15:15:58 -06:00


Claude Code Proxy

A transparent proxy for capturing and visualizing in-flight Claude Code requests and conversations, with optional agent routing to different LLM providers.

What It Does

Claude Code Proxy serves three main purposes:

Claude Code Proxy: Intercepts and monitors requests from Claude Code (claude.ai/code) to the Anthropic API, allowing you to see what Claude Code is doing in real time
Conversation Viewer: Displays and analyzes your Claude API conversations with a beautiful web interface
Agent Routing (Optional): Routes specific Claude Code agents to different LLM providers (e.g., route code-reviewer agent to GPT-4o)

Features

Transparent Proxy: Routes Claude Code requests through the monitor without disruption
Agent Routing (Optional): Map specific Claude Code agents to different LLM models
Request Monitoring: Logging of all API interactions to SQLite (default) or PostgreSQL
Live Dashboard: Real-time visualization of requests and responses
Conversation Analysis: View full conversation threads with tool usage
Easy Setup: One-command startup for both services

Security Defaults

The proxy currently defaults to 0.0.0.0:3001, but startup validation refuses non-loopback binds unless you either set AUTH_ENABLED=true with AUTH_TOKEN or explicitly opt into TRUST_PROXY=true for reverse-proxy deployments.
CORS is configurable and currently defaults to permissive values.
If you expose the proxy directly on a public interface, enable auth and provide a token.
When auth is enabled, the proxy accepts either Authorization: Bearer <token> or X-API-Key: <token>.
Dashboard routes can be protected separately with DASHBOARD_PASSWORD, which enables HTTP basic auth for the web UI and dashboard data endpoints.
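
As a rough illustration of the dashboard protection described above, here is a minimal basic-auth wrapper. It is a sketch under assumptions: basicAuth and statusFor are hypothetical names, the username is ignored, and the realm string is made up. The proxy's real middleware may differ.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// basicAuth wraps next and rejects requests whose basic-auth password does
// not match. Ignoring the username is an assumption of this sketch.
func basicAuth(password string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, supplied, ok := r.BasicAuth()
		if !ok || subtle.ConstantTimeCompare([]byte(supplied), []byte(password)) != 1 {
			w.Header().Set("WWW-Authenticate", `Basic realm="dashboard"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// statusFor sends one request through the wrapper and returns the status
// code. An empty supplied password means no Authorization header is sent.
func statusFor(password, supplied string) int {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if supplied != "" {
		req.SetBasicAuth("dashboard", supplied)
	}
	rec := httptest.NewRecorder()
	basicAuth(password, ok).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor("s3cret", "s3cret"))
	fmt.Println(statusFor("s3cret", "wrong"))
	fmt.Println(statusFor("s3cret", ""))
}
```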

Quick Start

Prerequisites

Option 1: Go 1.20+ and Node.js 18+ (for local development)
Option 2: Docker (for containerized deployment)
Claude Code

Installation

Option 1: Local Development

Clone the repository

git clone https://github.com/seifghazi/claude-code-proxy.git
cd claude-code-proxy

Configure the proxy
```
cp config.yaml.example config.yaml
```

Install and run (first time)

make install  # Install all dependencies
make dev      # Start both services

Subsequent runs (after initial setup)
```
make dev
# or
./run.sh
```

Option 2: Docker

Clone the repository

git clone https://github.com/seifghazi/claude-code-proxy.git
cd claude-code-proxy

Configure the proxy

cp config.yaml.example config.yaml
# Edit config.yaml as needed

Build and run with Docker

# Build the image
docker build -t claude-code-proxy .

# Run locally without publishing ports
docker run claude-code-proxy

# Run with published ports
docker run -p 3001:3001 -p 5174:5174 \
  -e SERVER_HOST=0.0.0.0 \
  -e AUTH_ENABLED=true \
  -e AUTH_TOKEN=change-me \
  claude-code-proxy

Run with persistent data and custom configuration

# Create a data directory for persistent SQLite database
mkdir -p ./data

# Option 1: Run with config file (recommended)
# If you expose the container with `-p`, set server.host to 0.0.0.0
# and enable auth in the mounted config file.
docker run -p 3001:3001 -p 5174:5174 \
  -v ./data:/app/data \
  -v ./config.yaml:/app/config.yaml:ro \
  claude-code-proxy

# Option 2: Run with environment variables
docker run -p 3001:3001 -p 5174:5174 \
  -v ./data:/app/data \
  -e SERVER_HOST=0.0.0.0 \
  -e ANTHROPIC_FORWARD_URL=https://api.anthropic.com \
  -e AUTH_ENABLED=true \
  -e AUTH_TOKEN=change-me \
  -e PORT=3001 \
  claude-code-proxy

Docker Compose (alternative)

# docker-compose.yml
version: '3.8'
services:
  claude-code-proxy:
    build: .
    ports:
      - "3001:3001"
      - "5174:5174"
    volumes:
      - ./data:/app/data
      - ./config.yaml:/app/config.yaml:ro  # Mount config file
    environment:
      - SERVER_HOST=0.0.0.0
      - ANTHROPIC_FORWARD_URL=https://api.anthropic.com
      - AUTH_ENABLED=true
      - AUTH_TOKEN=change-me
      - PORT=3001
      - DB_PATH=/app/data/requests.db

Then run: docker-compose up

Using with Claude Code

To use this proxy with Claude Code, set:

export ANTHROPIC_BASE_URL=http://localhost:3001

Then launch Claude Code using the claude command.

This will route Claude Code's requests through the proxy for monitoring.

Access Points

Web Dashboard: http://localhost:5174
API Proxy: http://localhost:3001
Health Check: http://localhost:3001/health

Advanced Usage

Running Services Separately

If you need to run services independently:

# Run proxy only
make run-proxy

# Run Svelte dashboard only (in another terminal)
make run-svelte

Available Make Commands

make install    # Install all dependencies
make build      # Build both services
make dev        # Run in development mode
make test-proxy # Run Go proxy tests
make clean      # Clean build artifacts
make db-reset   # Reset database
make help       # Show all commands

Running Regression Tests

The proxy test suite lives under build/proxy:

cd build/proxy
go test ./...

Or from the build/ directory:

make test-proxy

Running Postgres Storage Contract Tests

The storage layer has a backend-agnostic contract suite. SQLite runs in the normal Go test path, and PostgreSQL can be exercised by setting TEST_POSTGRES_DSN:

cd build/proxy
TEST_POSTGRES_DSN='postgresql://user:password@localhost:5432/dbname?sslmode=disable' \
  go test ./internal/service -run TestPostgresStorageContract -count=1

Or from build/:

TEST_POSTGRES_DSN='postgresql://user:password@localhost:5432/dbname?sslmode=disable' \
  make test-proxy-postgres-contract

The test resets the requests and settings tables between runs, so point it at a disposable database.

Disposable Postgres Test Database

The repo also includes a dedicated Compose file for contract tests:

cd build
make test-proxy-postgres

That target:

starts ../docker-compose.test.yml
points TEST_POSTGRES_DSN at the disposable database by default
runs TestPostgresStorageContract
tears the database down automatically
removes orphaned test-compose containers for a clean rerun

If you want to manage the database lifecycle yourself:

cd build
make test-proxy-postgres-up
make test-proxy-postgres-contract
make test-proxy-postgres-down

Configuration

Basic Setup

Create a config.yaml file (or copy from config.yaml.example):

server:
  host: 127.0.0.1
  port: 3001

providers:
  anthropic:
    base_url: "https://api.anthropic.com"
    
  openai: # if enabling subagent routing
    api_key: "your-openai-key"  # Or set OPENAI_API_KEY env var

storage:
  db_path: "requests.db"

auth:
  enabled: false
  token: ""

If you set server.host to a non-loopback address such as 0.0.0.0, the proxy will refuse to start unless you also enable auth or explicitly set TRUST_PROXY=true for a reverse-proxy deployment.
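
The startup rule above can be sketched as a small validation function. This is an illustration, not the proxy's actual code: validateBind and isLoopback are hypothetical names, and treating "localhost" as loopback without a DNS lookup is an assumption.

```go
package main

import (
	"fmt"
	"net"
)

// isLoopback reports whether host is a loopback address. "localhost" is
// accepted by name; anything else must parse as a loopback IP.
func isLoopback(host string) bool {
	if host == "localhost" {
		return true
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

// validateBind refuses a non-loopback bind unless auth is enabled with a
// token or the operator has opted into TRUST_PROXY.
func validateBind(host string, authEnabled bool, authToken string, trustProxy bool) error {
	if isLoopback(host) || trustProxy {
		return nil
	}
	if authEnabled && authToken != "" {
		return nil
	}
	return fmt.Errorf("refusing to bind %s without auth or TRUST_PROXY=true", host)
}

func main() {
	fmt.Println(validateBind("127.0.0.1", false, "", false))        // loopback: allowed
	fmt.Println(validateBind("0.0.0.0", false, "", false))          // refused
	fmt.Println(validateBind("0.0.0.0", true, "change-me", false))  // auth enabled: allowed
	fmt.Println(validateBind("0.0.0.0", false, "", true))           // TRUST_PROXY: allowed
}
```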

Auth

To expose the proxy beyond localhost, enable auth and provide a token:

auth:
  enabled: true
  token: "change-me"

Then send either:

curl -H "Authorization: Bearer change-me" http://localhost:3001/v1/models

or:

curl -H "X-API-Key: change-me" http://localhost:3001/v1/models
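
On the server side, accepting either header might look like the sketch below. tokenFromRequest, authorized and check are hypothetical names, and the constant-time comparison is a design choice of this example, not something the README states. The API-key header name is hardcoded here, whereas the proxy lets you change it with AUTH_API_KEY_HEADER.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// tokenFromRequest returns the credential from the Authorization bearer
// header if present, otherwise from X-API-Key.
func tokenFromRequest(r *http.Request) string {
	if auth := r.Header.Get("Authorization"); strings.HasPrefix(auth, "Bearer ") {
		return strings.TrimPrefix(auth, "Bearer ")
	}
	return r.Header.Get("X-API-Key")
}

// authorized compares the supplied token with the configured one in
// constant time. An empty configured token never matches.
func authorized(r *http.Request, want string) bool {
	got := tokenFromRequest(r)
	return want != "" && subtle.ConstantTimeCompare([]byte(got), []byte(want)) == 1
}

// check builds a request carrying a single header and reports whether it
// would pass. An empty header name sends no credentials at all.
func check(header, value, want string) bool {
	r := httptest.NewRequest(http.MethodGet, "/v1/models", nil)
	if header != "" {
		r.Header.Set(header, value)
	}
	return authorized(r, want)
}

func main() {
	fmt.Println(check("Authorization", "Bearer change-me", "change-me"))
	fmt.Println(check("X-API-Key", "change-me", "change-me"))
	fmt.Println(check("X-API-Key", "wrong", "change-me"))
}
```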

Subagent Configuration (Optional)

The proxy supports routing specific Claude Code agents to different LLM providers. This is an optional feature that's disabled by default.

Enabling Subagent Routing

Enable the feature in config.yaml:

subagents:
  enable: true  # Set to true to enable subagent routing
  mappings:
    code-reviewer: "gpt-4o"
    data-analyst: "o3"
    doc-writer: "gpt-3.5-turbo"

Set up your Claude Code agents following Anthropic's official documentation:
- 📖 Claude Code Subagents Documentation

How it works: When Claude Code uses a subagent that matches one of your mappings, the proxy automatically routes the request to the specified model instead of Claude.

Practical Examples

Example 1: Code Review Agent → GPT-4o

# config.yaml
subagents:
  enable: true
  mappings:
    code-reviewer: "gpt-4o"

Use case: Route code review tasks to GPT-4o for faster responses while keeping complex coding tasks on Claude.

Example 2: Reasoning Agent → o3

# config.yaml
subagents:
  enable: true
  mappings:
    deep-reasoning: "o3"

Use case: Send complex reasoning tasks to o3 while using Claude for general coding.

Example 3: Multiple Agents

# config.yaml
subagents:
  enable: true
  mappings:
    streaming-systems-engineer: "o3"
    frontend-developer: "gpt-4o-mini"
    security-auditor: "gpt-4o"

Use case: Different specialists for different tasks, optimizing for speed/cost/quality.

Environment Variables

Override config via environment:

PORT - Server port
SERVER_HOST - Server bind host
TRUST_PROXY - Skip direct-bind auth enforcement when running behind Traefik or another reverse proxy
AUTH_ENABLED - Enable auth for non-health endpoints
AUTH_TOKEN - Shared auth secret
AUTH_API_KEY_HEADER - Header name for API key auth
AUTH_ALLOW_LOCALHOST_BYPASS - Allow localhost requests to bypass auth
DASHBOARD_PASSWORD - Protect the dashboard and dashboard data APIs with HTTP basic auth
OPENAI_API_KEY - OpenAI API key
DB_TYPE - Storage backend (sqlite or postgres)
DATABASE_URL - PostgreSQL connection string when DB_TYPE=postgres
DB_PATH - Database path
PROXY_PUBLIC_URL - Public proxy URL shown in dashboard setup instructions
SUBAGENT_MAPPINGS - Comma-separated mappings (e.g., "code-reviewer:gpt-4o,data-analyst:o3")
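
The SUBAGENT_MAPPINGS format can be parsed along these lines. This is a sketch, not the proxy's own parser: parseSubagentMappings is a hypothetical name, and trimming whitespace and rejecting malformed pairs are assumptions about how such input might reasonably be handled.

```go
package main

import (
	"fmt"
	"strings"
)

// parseSubagentMappings turns "agent:model,agent:model" into a map.
// Empty entries are skipped; pairs without both parts are an error.
func parseSubagentMappings(s string) (map[string]string, error) {
	mappings := make(map[string]string)
	for _, pair := range strings.Split(s, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue
		}
		agent, model, ok := strings.Cut(pair, ":")
		agent, model = strings.TrimSpace(agent), strings.TrimSpace(model)
		if !ok || agent == "" || model == "" {
			return nil, fmt.Errorf("invalid mapping %q, want agent:model", pair)
		}
		mappings[agent] = model
	}
	return mappings, nil
}

func main() {
	m, err := parseSubagentMappings("code-reviewer:gpt-4o,data-analyst:o3")
	if err != nil {
		panic(err)
	}
	fmt.Println(m["code-reviewer"], m["data-analyst"])
}
```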

Docker Environment Variables

All environment variables can be configured when running the Docker container:

| Variable | Default | Description |
|---|---|---|
| `SERVER_HOST` | `0.0.0.0` | Proxy bind host |
| `PORT` | `3001` | Proxy server port |
| `SVELTE_PORT` | `5174` | Dashboard server port |
| `READ_TIMEOUT` | `600` | Server read timeout (seconds) |
| `WRITE_TIMEOUT` | `600` | Server write timeout (seconds) |
| `IDLE_TIMEOUT` | `600` | Server idle timeout (seconds) |
| `ANTHROPIC_FORWARD_URL` | `https://api.anthropic.com` | Target Anthropic API URL |
| `ANTHROPIC_VERSION` | `2023-06-01` | Anthropic API version |
| `ANTHROPIC_MAX_RETRIES` | `3` | Maximum retry attempts |
| `TRUST_PROXY` | `false` | Allow reverse-proxy deployments without direct auth on the Go bind |
| `AUTH_ENABLED` | `false` | Enable auth for non-health endpoints |
| `AUTH_TOKEN` | `""` | Shared auth token |
| `AUTH_API_KEY_HEADER` | `x-api-key` | Header name for API-key style auth |
| `AUTH_ALLOW_LOCALHOST_BYPASS` | `true` | Allow loopback requests to bypass auth |
| `DASHBOARD_PASSWORD` | `""` | HTTP basic auth password for the dashboard |
| `DB_TYPE` | `sqlite` | Storage backend |
| `DATABASE_URL` | `""` | PostgreSQL connection string |
| `DB_PATH` | `/app/data/requests.db` | SQLite database path |
| `PROXY_PUBLIC_URL` | `""` | Public proxy URL shown by the Svelte dashboard |

Example with custom configuration:

docker run -p 3001:3001 -p 5174:5174 \
  -v ./data:/app/data \
  -e SERVER_HOST=0.0.0.0 \
  -e AUTH_ENABLED=true \
  -e AUTH_TOKEN=change-me \
  -e ANTHROPIC_FORWARD_URL=https://api.anthropic.com \
  -e DB_PATH=/app/data/custom.db \
  claude-code-proxy

Project Structure

claude-code-proxy/
├── proxy/                  # Go proxy server
│   ├── cmd/                # Application entry points
│   ├── internal/           # Internal packages
│   └── go.mod              # Go dependencies
├── svelte/                 # SvelteKit dashboard
│   ├── src/                # Svelte application
│   └── package.json        # Node dependencies
├── shared/                 # Shared TypeScript modules used by the dashboard
├── run.sh                  # Start script
├── .env.example            # Environment template
└── README.md               # This file

Features in Detail

Request Monitoring

All API requests logged to the configured database (SQLite by default, PostgreSQL optional)
Searchable request history
Request/response body inspection
Conversation threading

Web Dashboard

Real-time request streaming
Interactive request explorer
Conversation visualization
Performance metrics

License

MIT License - see LICENSE for details.
