If your Claude Code API bill feels high, the cause is rarely "too many turns."
It's a handful of specific, measurable patterns. Each one is visible in the
session transcripts Claude Code already writes to
~/.claude/projects/ — you just have to add up the usage the API
reports per turn. Here are the big three.
Re-reading files is the single most common waste pattern. The agent reads a large reference file, the conversation moves on, the file falls out of working context, and the agent reads the entire file again later. A 2,000-line file read five times means four needless full-file payloads.
Two fixes. For files the agent visits repeatedly, put a short summary plus key
line numbers in your CLAUDE.md so it stops re-discovering them. For
one-off lookups in big files, prefer a targeted grep -n followed by
reading just the matching range, instead of reading the whole thing.
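To make the "grep first, then read only the range" idea concrete, here is a minimal Python sketch. It is an illustration, not how Claude Code itself reads files: the function names `matching_ranges` and `read_ranges` and the `context` parameter are hypothetical, chosen for this example.

```python
import re
from pathlib import Path


def matching_ranges(lines, pattern, context=2):
    """Return merged (start, end) line ranges, 1-based and inclusive,
    covering every line that matches `pattern` plus `context` lines
    on each side."""
    regex = re.compile(pattern)
    ranges = []
    for number, line in enumerate(lines, start=1):
        if not regex.search(line):
            continue
        start = max(1, number - context)
        end = min(len(lines), number + context)
        if ranges and start <= ranges[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
        else:
            ranges.append((start, end))
    return ranges


def read_ranges(path, pattern, context=2):
    """Read a file once and return only the slices around matches,
    as (start, end, lines) tuples."""
    lines = Path(path).read_text(encoding="utf-8", errors="replace").splitlines()
    return [
        (start, end, lines[start - 1:end])
        for start, end in matching_ranges(lines, pattern, context)
    ]
```

The payoff is in the sizes: a few short ranges around the matches instead of the full 2,000 lines.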
Tool results are tokens too. A cat of a 5,000-line log, an
un-truncated npm test run, a full git diff of a
generated lockfile — each lands in context at full size. Worse, one giant result
can evict your prompt cache (see below), so you pay twice.
# instead of dumping everything:
cat huge.log # thousands of tokens
# scope it:
rg "ERROR|WARN" huge.log | head -50
npm test --silent 2>&1 | tail -30
Anthropic's prompt cache makes the repeated prefix of a conversation cheap, but only while the cache entry is alive (by default it expires about 5 minutes after its last use) and only if the prefix is stable. Long idle gaps between turns let the cache expire. Huge tool results churn the context and evict the cached prefix. A healthy session reads 70%+ of its input tokens from cache; if you're well below that, batch your interactions and trim the output that's causing churn.
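One way to put a number on "reads 70%+ of its input tokens from cache" is sketched below in Python. It assumes the denominator is the sum of the three input-side fields in the per-turn usage object (`input_tokens`, `cache_read_input_tokens`, `cache_creation_input_tokens`); that definition and the function names are this example's assumptions, not an official metric.

```python
def cache_hit_rate(usage):
    """Fraction of one turn's input tokens that were read from cache.

    Assumes total input = uncached input + cache reads + cache writes.
    """
    fresh = usage.get("input_tokens") or 0
    read = usage.get("cache_read_input_tokens") or 0
    created = usage.get("cache_creation_input_tokens") or 0
    total = fresh + read + created
    return read / total if total else 0.0


def session_cache_hit_rate(usages):
    """Same ratio over a whole session, weighting each turn by its size."""
    read = sum(u.get("cache_read_input_tokens") or 0 for u in usages)
    total = read + sum(
        (u.get("input_tokens") or 0) + (u.get("cache_creation_input_tokens") or 0)
        for u in usages
    )
    return read / total if total else 0.0
```

Summing tokens across the session before dividing keeps a few tiny turns from skewing the result the way an average of per-turn ratios would.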
Every assistant turn in the transcript JSONL carries a usage
object with input_tokens, output_tokens,
cache_read_input_tokens, and cache_creation_input_tokens.
Sum them across sessions, attribute tool-result size by tool, and count how many
times each file was read. The waste becomes obvious fast — in real sessions it's
common to find 15–30% of input tokens going to re-reads and oversized results
alone.
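The audit described above can be sketched in a few dozen lines of Python. Treat the transcript layout as an assumption: this sketch expects each JSONL line to be an object with a `message` field holding `usage` and a `content` list, with `tool_use` and `tool_result` blocks linked by id, and file reads issued through a tool named `Read` with a `file_path` input. Check a few lines of your own transcripts before trusting the numbers. Result size is measured in characters, not tokens, and usage is counted once per message `id` in case one response is spread over several lines.

```python
import json
from collections import Counter
from pathlib import Path

USAGE_FIELDS = (
    "input_tokens",
    "output_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
)


def audit_lines(lines):
    """Return (usage totals, reads per file, tool-result chars per tool)."""
    totals, reads, result_chars = Counter(), Counter(), Counter()
    seen_ids = set()   # message ids whose usage was already counted
    tool_names = {}    # tool_use id -> tool name
    for raw in lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip blank or truncated lines
        message = entry.get("message") if isinstance(entry, dict) else None
        if not isinstance(message, dict):
            continue
        usage = message.get("usage")
        msg_id = message.get("id")
        if isinstance(usage, dict) and msg_id not in seen_ids:
            if msg_id is not None:
                seen_ids.add(msg_id)
            for field in USAGE_FIELDS:
                totals[field] += usage.get(field) or 0
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if not isinstance(block, dict):
                continue
            if block.get("type") == "tool_use":
                tool_names[block.get("id")] = block.get("name")
                tool_input = block.get("input")
                if block.get("name") == "Read" and isinstance(tool_input, dict):
                    path = tool_input.get("file_path")
                    if path:
                        reads[path] += 1
            elif block.get("type") == "tool_result":
                tool = tool_names.get(block.get("tool_use_id"), "unknown")
                # Character count of the serialized result, a rough size proxy.
                result_chars[tool] += len(json.dumps(block.get("content")))
    return totals, reads, result_chars


def audit_directory(root):
    """Run audit_lines over every .jsonl file under root and merge the counts."""
    totals, reads, result_chars = Counter(), Counter(), Counter()
    root = Path(root).expanduser()
    if root.is_dir():
        for path in sorted(root.rglob("*.jsonl")):
            with path.open(encoding="utf-8", errors="replace") as handle:
                t, r, c = audit_lines(handle)
            totals.update(t)
            reads.update(r)
            result_chars.update(c)
    return totals, reads, result_chars
```

Calling `audit_directory("~/.claude/projects")` and then `reads.most_common(10)` lists the files read most often; any path with a count above one is a re-read candidate for a CLAUDE.md summary.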
The token-audit tool does this for you: it scans your transcripts,
shows exactly where tokens went (per tool, per re-read file, cache hit rate),
and emits concrete config fixes.
See CC Powerpack Pro, or start with the free hooks.