TypeScript parser for Claude Code's local session data.
Get clean, typed analytics from your Claude Code sessions — tokens used, git branches, tools called, full transcripts. Reads the JSONL files Claude Code stores under ~/.claude/projects/.
npm install claude-code-data
import { readSession, readSessionIds } from 'claude-code-data';
const dir = '/Users/me/.claude/projects/-Users-me-my-project';
for (const id of await readSessionIds(dir)) {
const s = await readSession(dir, id);
const tokens = s.tokens.input + s.tokens.output;
console.log(`[${s.gitBranch}] ${s.firstPrompt} — ${tokens.toLocaleString()} tokens`);
}
// [feat/auth-refactor] Add JWT refresh token rotation — 87,432 tokens
// [fix/payment-bug] Stripe webhook fails on retry — 23,104 tokens
// [main] How do I set up the test suite — 12,301 tokens
readSession streams the JSONL once and returns aggregate fields only (tokens, branch, first prompt, …). Memory stays constant regardless of session size.
import { readSessionTurns } from 'claude-code-data';
for await (const turn of readSessionTurns(dir, sessionId)) {
render(turn);
}
readSessionTurns yields MessageEntry values from the main file as an async iterable. Memory stays bounded — useful for rendering long transcripts or mining content without holding everything at once.
import { readSubagentTurns } from 'claude-code-data';
const byAgent = new Map<string, MessageEntry[]>();
for await (const { agentId, turn } of readSubagentTurns(dir, sessionId)) {
const turns = byAgent.get(agentId) ?? [];
turns.push(turn);
byAgent.set(agentId, turns);
}
readSubagentTurns walks <sessionId>/subagents/ (including nested layouts) and yields each subagent turn tagged with its agentId. The caller groups, filters, or processes incrementally.
The walking strategy is up to you. Sequential, Promise.all, p-limit — whatever fits your workload:
const ids = await readSessionIds(dir);
const sessions = await Promise.all(ids.map((id) => readSession(dir, id)));
Claude Code stores conversations as JSONL files. Reading them correctly has many edge cases — different event types, subagent files in separate folders, transcripts replayed after /resume that double-count tokens.
This library handles all of that, so you can build analytics dashboards, session browsers, or scripts without reimplementing the parser.
~/.claude/projects/
└── <project-slug>/ ← a project = a working directory
└── <session-id>.jsonl ← a session = one conversation
└── <session-id>/subagents/agent-*.jsonl ← subagents (CC ≥ 2.1.2)
readSessionIds(projectDir) // string[] — session UUIDs
readSession(projectDir, sessionId) // ParsedSession — aggregate metadata
readSessionTurns(projectDir, sessionId) // AsyncGenerator<MessageEntry> — streamed main turns
readSubagentTurns(projectDir, sessionId) // AsyncGenerator<{ agentId, turn }> — streamed subagent turns
Four primitives, each does one thing. The library never decides for you whether to parse all sessions in parallel, sequentially, or with a concurrency limit — that orchestration belongs to the caller. See ParsedSession and MessageEntry for all available fields.