Building a Claude Traffic Proxy in One Session

January 2026

I wanted to track how much my Claude API usage was actually costing me. Not the billing page estimate - the real cost. Per request. Per task. Per tool call.

So I built Langley: an intercepting proxy that captures every Claude API request, extracts token usage, calculates costs, and shows it all in real time. In one coding session.

The Problem

Claude's billing shows monthly totals. Helpful, but useless for telling what a single request, task, or tool call cost.

I needed request-level visibility. Something that sits between my code and Claude, captures everything, and gives me analytics.

The Architecture

Langley is a TLS-intercepting proxy. Traffic flows through it transparently:

Your App -> HTTPS -> Langley -> HTTPS -> Claude API
                        |
                        v
                   SQLite DB
                        |
                        v
                   Dashboard

It generates certificates on-the-fly, captures request/response pairs, parses Claude's SSE streams, extracts token counts, and calculates costs using a pricing table.
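The cost step can be sketched as a lookup plus a multiplication. This is a minimal illustration, not Langley's actual code: the Price type, the Cost function, the model names, and the rates are all placeholders made up for the example.

```go
package main

import "fmt"

// Price holds per-million-token rates in USD.
// The numbers used below are placeholders, not real Claude pricing.
type Price struct {
	InputPerMTok  float64
	OutputPerMTok float64
}

// pricing maps a model name to its rates. Names and rates are hypothetical.
var pricing = map[string]Price{
	"example-small": {InputPerMTok: 1.00, OutputPerMTok: 5.00},
	"example-large": {InputPerMTok: 3.00, OutputPerMTok: 15.00},
}

// Cost returns the USD cost of one request, or false if the model
// is missing from the pricing table.
func Cost(model string, inputTokens, outputTokens int) (float64, bool) {
	p, ok := pricing[model]
	if !ok {
		return 0, false
	}
	in := float64(inputTokens) / 1e6 * p.InputPerMTok
	out := float64(outputTokens) / 1e6 * p.OutputPerMTok
	return in + out, true
}

func main() {
	if c, ok := Cost("example-large", 2_000_000, 1_000_000); ok {
		fmt.Printf("$%.2f\n", c) // prints $21.00
	}
}
```

Returning a boolean for unknown models keeps a missing table entry from silently showing up as a zero-cost request. A fuller table would likely also need separate rates for cached tokens.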

The dashboard shows captured requests with their token usage and costs in real time, with filtering and charts.

What Made It Work

1. Security From the Start

Before writing code, we did a security analysis. Matt (our auditor persona) found 10 issues to address.

These weren't afterthoughts - they shaped the design.

2. Phased Implementation

We broke the work into phases:

Phase  Deliverable
0      Basic HTTP proxy that forwards requests
1      TLS interception, SQLite persistence
2      REST API, WebSocket server, basic UI
3      Token extraction, cost calculation, analytics
4      Full dashboard with filtering and charts
5      Polish, documentation, blog

Each phase built on the last. Each had a clear deliverable.

3. Right-Sized Technology

No Kubernetes. No Postgres. No microservices. Just the minimum to solve the problem.

The Tricky Parts

SSE Parsing

Claude's streaming API uses Server-Sent Events. Token counts come in message_start and message_delta events, scattered across the stream. The parser reads input tokens from the first and output tokens from the second:

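// Assumes event is the decoded JSON payload (map[string]any)
// and the flow token fields are ints.
switch event["type"] {
case "message_start":
    // Input tokens are nested under the event's "message" object.
    msg, _ := event["message"].(map[string]any)
    if usage, ok := msg["usage"].(map[string]any); ok {
        if n, ok := usage["input_tokens"].(float64); ok { // JSON numbers decode as float64
            flow.InputTokens = int(n)
        }
    }

case "message_delta":
    // Output tokens sit in a top-level "usage" object; the count is cumulative.
    if usage, ok := event["usage"].(map[string]any); ok {
        if n, ok := usage["output_tokens"].(float64); ok {
            flow.OutputTokens = int(n)
        }
    }
}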
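For context, here is a self-contained sketch of how the whole stream might be scanned, not just the two cases above. It assumes each event carries its JSON on a single data: line, and the names ParseUsage, Usage, and sseEvent are invented for this example rather than taken from Langley.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// Usage holds the token counts gathered from one streamed response.
type Usage struct {
	InputTokens  int
	OutputTokens int
}

// sseEvent mirrors only the fields this sketch reads from a data payload.
type sseEvent struct {
	Type    string `json:"type"`
	Message struct {
		Usage struct {
			InputTokens int `json:"input_tokens"`
		} `json:"usage"`
	} `json:"message"`
	Usage struct {
		OutputTokens int `json:"output_tokens"`
	} `json:"usage"`
}

// ParseUsage scans an SSE body line by line and collects token counts.
// bufio.Scanner caps lines at 64 KiB by default, so a real proxy would
// need a larger buffer for big payloads.
func ParseUsage(body string) Usage {
	var u Usage
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		data, ok := strings.CutPrefix(sc.Text(), "data:")
		if !ok {
			continue // skip "event:" lines, comments, and blank separators
		}
		var ev sseEvent
		if err := json.Unmarshal([]byte(strings.TrimSpace(data)), &ev); err != nil {
			continue // ignore payloads that are not valid JSON
		}
		switch ev.Type {
		case "message_start":
			u.InputTokens = ev.Message.Usage.InputTokens
		case "message_delta":
			// Cumulative, so the last delta wins.
			u.OutputTokens = ev.Usage.OutputTokens
		}
	}
	return u
}

func main() {
	stream := "event: message_start\n" +
		`data: {"type":"message_start","message":{"usage":{"input_tokens":25,"output_tokens":1}}}` + "\n\n" +
		"event: message_delta\n" +
		`data: {"type":"message_delta","usage":{"output_tokens":15}}` + "\n\n"
	fmt.Printf("%+v\n", ParseUsage(stream)) // prints {InputTokens:25 OutputTokens:15}
}
```

Decoding into a typed struct avoids the chain of type assertions that a map[string]any needs, at the cost of ignoring any fields the struct does not name.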

Task Grouping

Requests don't come with "task" labels. We infer them:

  1. Explicit X-Langley-Task header (if you add it)
  2. User ID from the request body's metadata
  3. Same host, where a 5-minute gap in traffic starts a new task

This groups related requests together for per-task analytics.
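The three rules above could be sketched as a small fallback chain. This is one possible reading, not Langley's implementation: the Request and Grouper types, the field names, and the task ID format are hypothetical, and it assumes a gap strictly longer than five minutes starts a new task.

```go
package main

import (
	"fmt"
	"time"
)

// Request carries the fields the heuristics need. Field names are hypothetical.
type Request struct {
	TaskHeader string // value of X-Langley-Task, empty if absent
	UserID     string // user ID from the body's metadata, empty if absent
	Host       string
	At         time.Time
}

// idleGap is the quiet period after which a host's traffic starts a new task.
const idleGap = 5 * time.Minute

// Grouper remembers, per host, when it last saw traffic and which task is open.
type Grouper struct {
	lastSeen map[string]time.Time
	current  map[string]string
	next     int
}

func NewGrouper() *Grouper {
	return &Grouper{
		lastSeen: make(map[string]time.Time),
		current:  make(map[string]string),
	}
}

// TaskFor applies the rules in order: explicit header, then user ID,
// then the same-host time-gap heuristic.
func (g *Grouper) TaskFor(r Request) string {
	if r.TaskHeader != "" {
		return "header:" + r.TaskHeader
	}
	if r.UserID != "" {
		return "user:" + r.UserID
	}
	last, seen := g.lastSeen[r.Host]
	if !seen || r.At.Sub(last) > idleGap {
		g.next++
		g.current[r.Host] = fmt.Sprintf("%s#%d", r.Host, g.next)
	}
	g.lastSeen[r.Host] = r.At
	return g.current[r.Host]
}

func main() {
	g := NewGrouper()
	start := time.Date(2026, 1, 1, 12, 0, 0, 0, time.UTC)
	for _, offset := range []time.Duration{0, time.Minute, 10 * time.Minute} {
		r := Request{Host: "api.anthropic.com", At: start.Add(offset)}
		fmt.Println(g.TaskFor(r))
	}
	// prints:
	// api.anthropic.com#1
	// api.anthropic.com#1
	// api.anthropic.com#2
}
```

In this sketch, requests labeled by header or user ID do not reset the host's idle timer; whether they should is a design choice the heuristic leaves open.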

Anomaly Detection

The system flags:

These help catch runaway loops and inefficient prompts.

The Result

Langley is about 2,000 lines of Go and 600 lines of React. It does all of this without requiring any changes to your Claude client code. Just set HTTPS_PROXY and you're capturing.

What I Learned

Plan before code. We spent time on a security analysis and phased plan before writing implementation code. The plan survived contact with reality - the phases worked as scoped.

Simple architecture wins. SQLite handles everything. No external dependencies. Deploys as a single binary (once built with embedded frontend).

Real-time matters. The WebSocket updates make debugging feel immediate. Polling would have worked but felt sluggish.

Try It

The code is at github.com/HakAl/langley.

# Build
go build -o langley ./cmd/langley

# Trust the CA (see langley -show-ca)
# Run
./langley

# Set proxy
export HTTPS_PROXY=http://localhost:9090

# Open dashboard
# http://localhost:9091

Now you can see exactly what Claude is doing with your tokens.