
01 — GitHub Recon: Claude Code Transcript Indexing/Search Prior Art

Date: 2026-05-11
Researcher: Claude Opus 4.7 (Mac)
Scope: Existing OSS tools that index, search, sync, or analyze Claude Code ~/.claude/projects/*.jsonl transcripts. Adjacent space: ChatGPT/Anthropic SDK conversation exports.

Problem we are buying or building for:

Index Claude Code transcripts from 5 machines (Mac, PC, iMac, Cheesegrater, Clarvis) into a single queryable store.
Search by content, variant, date, tool, cwd.
Run automated drift detection and organization passes (e.g. detect persona/role drift across sessions, surface recurring failure patterns, group sessions by topic/client).

1. Top Candidates — Adopt / Fork / Steal-From Verdicts

A. `marcelocantos/mnemo` — strongest fit for the search half

URL: https://github.com/marcelocantos/mnemo
Stars: 0 (new, but Apache-2.0, very active — last commit 2026-05-11)
Stack: Go binary, SQLite FTS5, embedded web dashboard on port 19419, MCP HTTP server, filesystem watcher
One-line: persistent MCP server that watches ~/.claude/projects/, indexes transcripts + git commits + GitHub PRs + project docs + skills + CLAUDE.md, exposes 30+ MCP tools for searching past sessions, plus a built-in dashboard with token analytics and live session monitoring.
Verdict: Adopt or steal the schema. Closer to a feature-complete answer than anything else surfaced. Already does session-chain detection across /clear boundaries, context-compaction summaries, decision extraction, image OCR, and image embeddings. The “0 stars” is misleading: the tool is feature-rich and the README reads like a finished product. The gotcha: it is single-machine. You would still need fleet-aware sync (rsync transcripts to one host and point mnemo at that root, or run one mnemo per machine and query each). It has no drift detection out of the box, but the SQL surface is exposed, so a drift pass can be built on top of it.
Decision: Stand it up on Cheesegrater pointed at an rsync’d union of all five machines’ ~/.claude/projects/. Use it as the read-side. Build drift detection on top of its SQL surface.
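
As a rough illustration of the read-side idea (one FTS5 index over a merged transcript root), here is a minimal Python sketch. It does not reproduce mnemo's schema, which was not inspected at the SQL level. The `messages` table, the `<root>/<machine>/<project>/*.jsonl` layout, and the JSONL field names (`sessionId`, `cwd`, `timestamp`, `message.content`) are all assumptions, and it needs a Python build whose SQLite includes FTS5.

```python
import json
import sqlite3
from pathlib import Path


def extract_text(record: dict) -> str:
    """Flatten message content to plain text. The record shape is an assumption."""
    message = record.get("message")
    if not isinstance(message, dict):
        return ""
    content = message.get("content", "")
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        # Assumed: a list of content blocks, some of which carry a "text" field.
        parts = [
            block["text"]
            for block in content
            if isinstance(block, dict) and isinstance(block.get("text"), str)
        ]
        return "\n".join(parts)
    return ""


def build_index(root: Path, db: sqlite3.Connection) -> int:
    """Index every *.jsonl under root/<machine>/... and return the row count."""
    db.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages USING fts5("
        "body, machine UNINDEXED, session_id UNINDEXED, cwd UNINDEXED, ts UNINDEXED)"
    )
    rows = 0
    for path in sorted(root.rglob("*.jsonl")):
        machine = path.relative_to(root).parts[0]  # first directory names the host
        for line in path.read_text(encoding="utf-8").splitlines():
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate truncated lines from an in-flight rsync
            if not isinstance(record, dict):
                continue
            body = extract_text(record)
            if not body.strip():
                continue
            db.execute(
                "INSERT INTO messages (body, machine, session_id, cwd, ts) "
                "VALUES (?, ?, ?, ?, ?)",
                (body, machine, record.get("sessionId"),
                 record.get("cwd"), record.get("timestamp")),
            )
            rows += 1
    db.commit()
    return rows


def search(db: sqlite3.Connection, query: str) -> list[tuple]:
    """Full-text search, best match first; returns (machine, session_id) pairs."""
    return db.execute(
        "SELECT machine, session_id FROM messages WHERE messages MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
```

Calling `build_index(Path("/srv/transcripts"), sqlite3.connect("index.db"))` (a hypothetical path) and then `search(db, "postgres")` would list the machines and sessions that mention the term. The sketch re-inserts everything on each run, so a real pass would need incremental ingestion.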

B. `wesm/agentsview` — strongest fit for the analytics half

URL: https://github.com/wesm/agentsview
Stars: 928, forks 123, MIT, last commit 2026-05-11. Author is Wes McKinney (pandas creator) — high signal.
Stack: Go binary, SQLite, optional Postgres push for team dashboards, web UI on 127.0.0.1:8080, SSE live updates, Docker image
One-line: local-first session intelligence and cost analytics for 24 different coding agents (Claude Code, Codex, Copilot CLI, Cursor, Gemini CLI, Cline, Aider, OpenCode, Warp, etc.), with full-text search, activity heatmaps, archetype classification, cache-aware cost calc, and HTML/Gist export.
Verdict: Adopt for analytics and the cross-agent unified schema. Has a pg push command for team dashboards — exactly the multi-machine pattern we need. Schema is versioned (v1) and JSON-output stable, so downstream tools (our drift pass) can consume it. Active commercial-grade development. If we run agentsview pg push from each of the 5 machines into one Postgres on Cheesegrater, we get a unified read model with zero code on our side.
Decision: Run as the canonical analytics/usage layer on Cheesegrater. The pg push flow is the multi-machine answer.

C. `jhlee0409/claude-code-history-viewer` — best human-facing viewer

URL: https://github.com/jhlee0409/claude-code-history-viewer
Stars: 1,236, forks 124, MIT, last commit 2026-05-11
Stack: Tauri (Rust + React) desktop app + optional cchv-server headless mode (browser, Docker, systemd), supports 9 assistants
One-line: polished desktop/web app for browsing Claude Code + 8 other AI agent conversation histories.
Verdict: Use as-is for human browsing, do not extend. It is a viewer, not a query/sync platform. The headless cchv-server mode is useful — you could run it once on Cheesegrater and access via Tailscale from any machine. No drift detection, no programmatic API beyond what the web UI exposes. Best supplement to mnemo+agentsview for human review.
Decision: Optional polish layer. Install cchv-server on Cheesegrater for visual review of any single session. Not part of the indexing pipeline.

D. `Alfredvc/cct` — best SQL-first model

URL: https://github.com/Alfredvc/cct
Stars: 3, forks 1, MIT/Apache, last commit 2026-05-08
Stack: Rust binary, DuckDB, embedded React viewer at localhost:8766, ships with agent skills for Claude
One-line: ingests transcripts to DuckDB and ships skills that teach Claude how to query the DB in plain English (e.g. “what did I spend on Opus last week” → SQL → answer). Includes optimization playbook skill.
Verdict: Steal the skill pattern. The skill-as-investigation-playbook idea is the cleanest path to “drift detection” — write a drift-detection skill that runs SQL against either mnemo or agentsview’s DB. The crate-published parser (claude-code-transcripts on crates.io) is a strongly-typed Rust parser with round-trip validation for catching JSONL schema drift; that is worth borrowing even if we use a different DB.
Decision: Don’t run cct itself; copy the skill pattern and the typed-parser approach.
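
To make the typed-parser idea concrete, here is a minimal Python sketch of round-trip validation: parse a line into a typed record, serialize it back, and report any top-level key that did not survive. This is not the crate's API (the crate has not been inspected yet), and the field list (`type`, `uuid`, `parentUuid`, `sessionId`, `timestamp`, `cwd`, `message`) is an assumed subset, not an official schema.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Any, Optional


@dataclass(frozen=True)
class TranscriptRecord:
    """Top-level fields this sketch knows about (assumed, not an official schema)."""

    type: str
    uuid: Optional[str] = None
    parentUuid: Optional[str] = None
    sessionId: Optional[str] = None
    timestamp: Optional[str] = None
    cwd: Optional[str] = None
    message: Optional[Any] = None

    @classmethod
    def from_dict(cls, raw: dict) -> "TranscriptRecord":
        # Keep only the keys the typed model declares; unknown keys are dropped here.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})

    def to_dict(self) -> dict:
        # Omit unset fields so explicit nulls and absent keys compare equal.
        return {k: v for k, v in asdict(self).items() if v is not None}


def schema_drift(line: str) -> set[str]:
    """Return top-level keys that do not survive a parse/serialize round trip."""
    raw = json.loads(line)
    round_tripped = TranscriptRecord.from_dict(raw).to_dict()
    return {
        key
        for key, value in raw.items()
        if value is not None and round_tripped.get(key) != value
    }
```

An empty set means the line fits the model; a non-empty set names fields the model does not know about. A record with no `type` raises `TypeError`, which is itself a drift signal. The check covers top-level keys only and does not validate the nested `message` body.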

E. `lee-fuhr/claude-session-index` — closest to “drift detection on top of FTS”

URL: https://github.com/lee-fuhr/claude-session-index
Stars: 20, forks 1, MIT, last commit 2026-04-30
Stack: Python, SQLite FTS5, Claude Code skill, Anthropic Haiku for in-session synthesis (no extra API cost)
One-line: index + skill that lets you ask Claude “what have I tried for X, what worked, what failed” and get a synthesized answer with claude --resume jump links.
Verdict: Closest semantic match to the “automated drift detection” use case. The cross-session synthesis via an in-session Haiku subagent is exactly the pattern for drift detection (compare persona behavior across N sessions, flag deviations). Caveats: single-machine, and it brings Python dependencies. License is clean.
Decision: Strong steal candidate for the drift-detection module specifically. Even if we use mnemo/agentsview for storage, lee-fuhr’s synthesis recipe is the pattern.

2. Full Inventory

2.1 Indexers / Search / Memory (the core ask)

Repo	Stars	License	Last Commit	Maturity	Fit	Notes
marcelocantos/mnemo	0	Apache-2.0	2026-05-11	Production-shaped, new release	Full fit	MCP + dashboard + FTS5 + decisions extraction + image OCR. See top candidates.
wesm/agentsview	928	MIT	2026-05-11	Production	Full fit (analytics)	24-agent support, PG sync, web UI. See top candidates.
lee-fuhr/claude-session-index	20	MIT	2026-04-30	Production	Partial — synthesis	Python, SQLite FTS5, cross-session synthesis via Haiku.
lokkju/claude-session-index	0	Other	2026-03-30	Production-shaped	Partial — indexer + MCP	Python, SQLite FTS5, MCP stdio, plugin with auto-index hooks, subagent parent linking. Diff schema from lee-fuhr.
Jeremyx000/claude-session-index	1	MIT	2026-05-11	Demo/fork	No fit	Probably a fork of lee-fuhr’s.
Alfredvc/cct	3	MIT/Apache	2026-05-08	Production	Partial — DuckDB + skills	See top candidates.
spences10/ccrecall	13	MIT	2026-05-11	Production	Partial — clean schema	TS/Vite, SQLite, incremental sync, team/swarm tables, exposed raw SQL.
ahmedelgabri/ccpeek	29	MIT	2026-04-28	Production	Partial — broad index	Go + web UI, indexes conversations + plans + todos + shell snapshots + file history + paste cache + memories + commands. Wider scope than just transcripts.
kateleext/deja	12	MIT	2026-04-09	Production	Partial — episodic memory	Skill-based, todos-become-episodes structure, ranks by signal strength + recency. Different mental model.
Kaidorespy/memory	0	MIT	2026-04-23	Demo	No fit	One-off “extract memories” tool.
yudppp/claude-code-history-mcp	10	—	2026-03-01	Demo	Partial	MCP server for searching CC history. Less feature-rich than mnemo.
mblode/claude-code-search	5	MIT	2026-02-14	Demo	No fit	Small CLI search. Superseded by lee-fuhr/lokkju.
randlee/claude-history	3	MIT	2026-05-11	In-progress	No fit	CLI only.
S2thend/claude-code-history	2	—	2026-04-04	Demo	No fit	CLI + TS lib for browse/search/export.
sthadka/claude-code-history	0	—	2026-02-16	Demo	Partial — backup pattern	Python backup tool with structured exports and manifest-based incremental updates. The manifest pattern is worth borrowing for multi-machine sync.
Ayjc/claude-code-history	0	—	2026-01-28	Demo	No fit	Fuzzy autocomplete only.
john-parsneau/transcript-saver-mcp	0	Apache-2.0	2026-01-05	Demo	No fit	Saves running transcripts to markdown.
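
The manifest pattern noted for sthadka/claude-code-history in the table above can be sketched as follows. That repo's actual manifest format has not been inspected, so this is only one plausible shape: a JSON file mapping each relative path to a SHA-256 digest, with `changed_files` (a hypothetical name) returning whatever is new or modified since the previous run.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks to keep memory flat."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(root: Path, manifest_path: Path) -> list[Path]:
    """Return *.jsonl files under root that are new or modified since the
    last run, then rewrite the manifest to reflect the current state."""
    old = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    new: dict[str, str] = {}
    changed: list[Path] = []
    for path in sorted(root.rglob("*.jsonl")):
        key = path.relative_to(root).as_posix()
        new[key] = file_digest(path)
        if old.get(key) != new[key]:
            changed.append(path)
    manifest_path.write_text(json.dumps(new, indent=2, sort_keys=True))
    return changed
```

Each machine could run this before a sync and ship only the returned files. Hashing whole files is the simple choice; since transcripts are mostly appended to, comparing size and mtime would be cheaper at the cost of some robustness.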

2.2 Viewers / Renderers (read-only UI)

Repo	Stars	License	Last Commit	Maturity	Fit	Notes
simonw/claude-code-transcripts	1,507	Apache-2.0	2026-05-11	Production	No fit (renderer only)	Simon Willison’s HTML transcript publisher. Useful for sharing specific sessions, not for indexing. Web-session API features currently broken (issue #77).
daaain/claude-code-log	1,014	MIT	2026-05-11	Production	Partial — TUI + filters	Python CLI with TUI for browsing, HTML/MD generators, date filtering, token tracking. Closest “polished CLI viewer.” Worth using as a daily-driver inspection tool.
jhlee0409/claude-code-history-viewer	1,236	MIT	2026-05-11	Production	Partial — viewer	See top candidates.
kylesnowschwartz/tail-claude	136	MIT	2026-05-06	Production	No fit	Bubble Tea Go TUI, live tailing of a single session.
vtemian/claude-notes	77	MIT	2026-04-24	Production	No fit	Terminal + HTML renderer.
philipp-spiess/claude-code-viewer	20	—	2026-05-10	In-progress	No fit	Web upload viewer.
yashagldit/Claude-Code-History-VSCode	29	Other	2026-05-11	Production	No fit	VSCode extension.
yanicklandry/claude-code-history-viewer	7	—	2026-04-02	Demo	No fit	Smaller alt viewer.
huangyunbin/claude-code-history-viewer	10	—	2026-04-22	Demo	No fit	Webpage viewer.
ClickHouse/alexeyprompts	11	Other	2026-04-19	Demo	Partial — CH backend	Official ClickHouse-team viewer using ClickHouse columnar store. Interesting for scale; overkill for 5 machines.
frederikb96/orbit	0	MIT	2026-03-02	Demo	No fit	Real-time streaming viewer.
caioaao/claude-code-session-visualizer	1	—	2026-02-14	Demo	No fit	Quick-and-dirty web UI.
xlvlxn/claude-session-viewer	0	—	2026-04-19	Demo	No fit	Drop-file GitHub Pages viewer.
tlskins/claude-log-viewer	0	—	2026-01-13	Demo	No fit	View + share.
offpolicy/cc-transcripts	0	—	2026-04-29	Demo	No fit	Conversation-format viewer.
limbooo/ClaudeCodeHistoryViewer	1	MIT	2025-12-08	Demo	No fit	Single-file web app.
ianhandy/claude-session-dashboard	0	—	2026-05-06	Demo	No fit	Token burn/cost dashboard.
ianhandy/multi-agent-session-visualizer	0	—	2026-05-06	Demo	Partial — multi-agent flow	D3.js flow diagrams, multi-agent support. Interesting visualization layer for fleet sessions.

2.3 Cost / Usage Analytics (ccusage family)

Repo	Stars	License	Last Commit	Maturity	Fit	Notes
ryoppippi/ccusage	14,042	Other	2026-05-11	Production	Partial — cost only	The canonical token/cost analyzer. JSONL-based. Use as the cost baseline; do not extend.
cobra91/better-ccusage	69	Other	2026-05-10	Production	No fit	ccusage fork with multi-provider support.
hydai/ccstat	8	MIT	2026-02-21	In-progress	No fit	Rust rewrite of ccusage.
SDpower/ccusage_go	14	MIT	2026-05-09	In-progress	No fit	Go rewrite of ccusage.
m6k/ccusage-py	3	—	2026-04-22	Demo	No fit	Python port.
elct9620/ccmon	4	Apache-2.0	2026-03-19	Demo	No fit	ccusage-inspired monitor.
~20 menubar/statusbar/tmux/neovim/vscode wrappers around ccusage	3–42	mixed	varies	mixed	No fit	Surface-layer tooling, not indexers. (e.g. `cctray`, `ccowl`, `AgentLimits`, `vibepulse`, `ccusage.nvim`, etc.)

2.4 Specialty / Adjacent Tools

Repo	Stars	License	Last Commit	Maturity	Fit	Notes
ymonster/cc_jsonl_fix	0	MIT	2026-05-01	Demo	Partial — repair	Repairs broken `parentUuid` chains. Worth keeping as a recovery tool.
Lightcone-ZhangYifa/claude-replay-plugin	0	MIT	2026-05-07	Demo	Partial — byte-perfect replay	Recovers lost git history from JSONLs, supports project resurrection and AI-agent behavior analysis. Interesting for the “drift” angle — analyzing what an agent actually did vs what it should have done.
achiii800/claude-snap	1	MIT	2026-05-11	In-progress	No fit	Portable snapshot codec for moving sessions between machines — could be useful as the wire format for multi-machine sync.
krzemienski/session-insight-miner	0	MIT	2026-05-02	Demo	No fit	Per-session cost/cache reports.
Joopsnijder/claude-code-transcripts	0	—	2026-04-22	Demo	No fit	Daily AI summaries.
knivram/cc-session-visualizer	1	—	2026-03-17	Demo	No fit	Bun CLI → HTML sequence diagrams.
mtschoen/claude-walker	0	—	2026-05-11	In-progress	No fit	Multi-language pace-walker w/ shared conformance corpus. Maybe useful for parser cross-checks.
nagstler/pr-narrator	2	Apache-2.0	2026-05-02	Demo	No fit	Transcript → PR description.
yanndebray/claude-code-scrubber	0	MIT	2026-03-04	Demo	No fit	PII scrub.
k33bs/ticktock	2	MIT	2026-05-09	Demo	No fit	Inline timestamps.
palimondo/xs	0	—	2026-03-05	Demo	No fit	Transcript exploration tools.
shinyaohtani/claude-trace	0	—	2026-04-16	Demo	No fit	JSONL → MD.
asdf8601/cc-gist-export	0	MIT	2026-04-18	Demo	No fit	Export to gist.
jonathanmejia4/claude-session-replay	0	MIT	2026-04-15	Demo	No fit	JSONL → transcript.
derekbreden/jsonl2md	0	—	2026-05-01	Demo	No fit	JSONL → MD.
davesque/claude-md-transcripts	0	MIT	2026-05-08	Demo	No fit	JSONL → MD + qmd index.
tayiorbeii/claude-code-transcript-cleanup-skill	1	—	2025-12-01	Demo	No fit	Cleanup skill for `/export` output.
prbe-ai/prbe-agent-tap	0	MIT	2026-04-28	Demo	No fit (SaaS)	Daemon that ships transcripts to a hosted SaaS (api.prbe.ai). Worth flagging — competing pattern.
jordangarrison/panko	0	MIT	2026-04-09	Demo	No fit	View + share.
elischwerzler/scribe	0	—	2026-04-24	Demo	No fit	Transcripts → searchable book.
ysamlan/agent-log-gif	5	Apache-2.0	2026-04-28	Demo	No fit	Transcripts → animated GIFs.
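
The `parentUuid` chain breakage that ymonster/cc_jsonl_fix repairs can at least be detected with a few lines. The sketch below covers detection only and is not that tool's algorithm; it assumes each record carries a `uuid` and a `parentUuid` that is null for the first record in a chain.

```python
import json
from typing import Iterable


def broken_links(lines: Iterable[str]) -> list[tuple[str, str]]:
    """Return (uuid, missing_parent_uuid) pairs for records whose parent
    does not appear anywhere in the same transcript."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unparseable lines are a separate problem; skip them here
        if isinstance(record, dict) and record.get("uuid"):
            records.append(record)

    known = {record["uuid"] for record in records}
    return [
        (record["uuid"], record["parentUuid"])
        for record in records
        if record.get("parentUuid") and record["parentUuid"] not in known
    ]
```

Running this over each transcript before ingestion would flag sessions that need repair. A parent that lives in a different file would also be reported as missing, so a fleet-wide pass may want to pool `known` across files.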

2.5 Memory / Long-Term-Memory MCPs (overlapping space)

Repo	Stars	License	Last Commit	Maturity	Fit	Notes
srijanshukla18/claude-memory-viz	99	—	2026-04-22	Production	No fit	Visualizer for the Claude memory MCP graph, not transcript-aware.
WhenMoon-afk/claude-memory-mcp	67	MIT	2026-05-04	Production	No fit	Generic MCP memory server, not transcript-indexed.
serkansmg/smg-claude-memory-mcp	18	—	2026-05-10	Production	Partial — vector	Per-project vector memory with rules enforcement and git-based team sharing. The git-sync pattern is relevant for fleet.
~25 other `claude-memory-mcp` named repos	0–5	mixed	varies	mostly demo	No fit	Generic note/memory stores. None reach into transcript JSONL.

3. Gap Analysis — What No Existing Tool Covers

Even if we adopt mnemo + agentsview + cct’s skill pattern + lee-fuhr’s synthesis recipe, the following are still on us to build:

Multi-machine sync. Every indexer surveyed runs on a single host; only agentsview’s pg push consolidates data across machines. We need to either:
- rsync ~/.claude/projects/ from all 5 machines into a canonical location on Cheesegrater on a schedule, then point one mnemo+agentsview at it, OR
- run mnemo/agentsview on each host and use agentsview’s pg push to consolidate into a single Postgres on Cheesegrater.
Of the two, the agentsview pg push path is the cleanest because it is already implemented and supported.
Fleet-aware variant identity. None of the indexers know about Wes’s PersonaName/canonical_name layer (Pepper, Stark, Lens, Bilby, etc.). They key on cwd and session ID. We need a join table from (machine, cwd, session_id) → canonical_variant_name so drift detection can ask “is Stark drifting” rather than “is session 7b22… drifting.” Cheapest fix: maintain that map in D1 / vault and join at query time.
Drift detection logic. Closest prior art: lee-fuhr’s “what have I tried for X” synthesis pass. No tool has explicit persona-drift, role-violation, or canon-divergence detection. This is genuinely novel and is our actual contribution. Pattern to steal: lee-fuhr’s in-session Haiku synthesizer over FTS5 hits.
Organization passes. No tool re-tags or restructures sessions automatically after the fact (e.g. “all sessions touching Covenant Wildlife in the last 30 days → tag as client:covenant, surface uncommitted decisions”). We build this on top of whatever DB we land on.
Cross-session decision diffing. mnemo extracts decisions, but nothing diffs decisions across sessions to flag contradictions (“session A: ‘we keep hines-mcp’, session B: ‘hines-mcp is dead’”). Build on top.
Hook-aware ingestion. Most tools assume passive JSONL parsing. Wes already has SessionEnd / PreCompact hooks writing handoffs and deep-transcript-scan artifacts. Ingestion should pick those up as authoritative summary inputs, not just the raw JSONL. None of the surveyed tools do this.
Live tailing with fleet-wide subscribe. mnemo has filesystem watching on one host; nothing watches across hosts. Cheapest answer: run mnemo on each host, broadcast new-session events through the existing claude-peers broker.
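
The variant-identity join from the list above can be sketched with local SQLite standing in for D1. Only the `(machine, cwd, session_id)` key and `canonical_variant_name` come from these notes; the table names, the `started_at` column, and the helper names are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sessions (
    machine    TEXT NOT NULL,
    cwd        TEXT NOT NULL,
    session_id TEXT NOT NULL,
    started_at TEXT NOT NULL,
    PRIMARY KEY (machine, cwd, session_id)
);
CREATE TABLE variant_map (
    machine    TEXT NOT NULL,
    cwd        TEXT NOT NULL,
    session_id TEXT NOT NULL,
    canonical_variant_name TEXT NOT NULL,
    PRIMARY KEY (machine, cwd, session_id)
);
"""


def sessions_for_variant(db: sqlite3.Connection, variant: str) -> list[tuple]:
    """All sessions attributed to one canonical variant, oldest first."""
    return db.execute(
        "SELECT s.machine, s.session_id, s.started_at "
        "FROM sessions AS s "
        "JOIN variant_map AS v USING (machine, cwd, session_id) "
        "WHERE v.canonical_variant_name = ? "
        "ORDER BY s.started_at",
        (variant,),
    ).fetchall()


def unmapped_sessions(db: sqlite3.Connection) -> list[tuple]:
    """Sessions with no variant yet; they need a map entry before drift checks."""
    return db.execute(
        "SELECT s.machine, s.session_id "
        "FROM sessions AS s "
        "LEFT JOIN variant_map AS v USING (machine, cwd, session_id) "
        "WHERE v.canonical_variant_name IS NULL "
        "ORDER BY s.started_at"
    ).fetchall()
```

After `db.executescript(SCHEMA)` and loading both tables, `sessions_for_variant(db, "Stark")` answers "which sessions were Stark" without the indexer knowing anything about personas, and `unmapped_sessions(db)` shows where the map has gaps.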

4. Recommended Architecture (preview, not the deliverable)

Hooking the above together:

[5 machines]
  ~/.claude/projects/*.jsonl
        │
        ├── (each machine) agentsview daemon → SQLite local
        │       └── nightly: `agentsview pg push` → Postgres on Cheesegrater
        │
        ├── (Cheesegrater) one mnemo instance pointed at an rsync'd
        │       union of all machines' projects/ for deep search + decisions
        │
        └── (any host) cchv-server for human browsing
              ↓
[Cheesegrater Postgres + mnemo SQLite]
        ↓
[Drift-detection skill] (cct-pattern + lee-fuhr-pattern)
   reads from PG + mnemo, joins to variant-identity map in D1/vault,
   runs comparative passes, emits findings to vault.

Pick this apart in the next research pass.
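
One way the "comparative passes" step in the diagram could start is with a purely statistical baseline, before any LLM synthesis is involved. The sketch below is a toy heuristic, not taken from any surveyed tool: for one variant, it compares each session's tool-usage mix against the pooled mix across that variant's sessions and flags outliers by total variation distance. The function names, the input shape (session ID to list of tool names), and the 0.5 threshold are all assumptions.

```python
from collections import Counter


def tool_distribution(tool_calls: list[str]) -> dict[str, float]:
    """Relative frequency of each tool name in a list of calls."""
    counts = Counter(tool_calls)
    total = sum(counts.values())
    return {tool: n / total for tool, n in counts.items()} if total else {}


def total_variation(p: dict[str, float], q: dict[str, float]) -> float:
    """Total variation distance between two discrete distributions (0 to 1)."""
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def flag_drift(sessions: dict[str, list[str]], threshold: float = 0.5) -> list[str]:
    """Return IDs of sessions whose tool mix is far from the pooled baseline.

    `sessions` maps session_id -> tool names used, all for a single variant.
    """
    pooled = [tool for calls in sessions.values() for tool in calls]
    baseline = tool_distribution(pooled)
    return sorted(
        session_id
        for session_id, calls in sessions.items()
        if total_variation(tool_distribution(calls), baseline) > threshold
    )
```

Tool mix is a crude proxy for persona or role drift and will miss drift that shows up only in wording, so this would serve as a cheap pre-filter that decides which sessions get the more expensive synthesis pass.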

5. Search Trail (so the next pass doesn’t redo work)

GitHub CLI searches executed:

gh search repos "claude-code transcript" → 30 results, primary hit set
gh search repos "claude-code session jsonl" → 16 results, included claude-snap, ymonster/cc_jsonl_fix, claude-replay-plugin
gh search repos "ccusage" → 30 results, mostly menubar wrappers
gh search repos "claude-code history" → 30 results, all the cchv variants
gh search repos "claude-code search MCP" → 0 results
gh search repos "claude session index" → 3 results (lee-fuhr, Jeremyx, lokkju)
gh search repos "ccmonitor OR ccmanager OR cchistory OR cc-history" → 0 results
gh search repos "claude memory MCP" → 30 results (mostly generic memory MCPs, not transcript-aware)
gh search repos "claude-code multi-machine sync" → 0 results
gh search repos "claude-code fleet" → 12 results (orchestration tools, not indexers)
gh search repos "openai-conversation-export OR chatgpt-history-export" → 0 results
gh search repos "claude-code analytics drift" → 0 results

READMEs read in full:

marcelocantos/mnemo
wesm/agentsview
jhlee0409/claude-code-history-viewer
daaain/claude-code-log
Alfredvc/cct
lee-fuhr/claude-session-index
lokkju/claude-session-index
spences10/ccrecall
kateleext/deja
simonw/claude-code-transcripts
kylesnowschwartz/tail-claude
ahmedelgabri/ccpeek

NOT yet investigated but flagged for follow-up:

ClickHouse/alexeyprompts (if scale becomes an issue)
achiii800/claude-snap (if we need a portable session wire format)
Lightcone-ZhangYifa/claude-replay-plugin (drift/behavior-analysis angle)
sthadka/claude-code-history (manifest-based incremental sync pattern)
The claude-code-transcripts Rust crate on crates.io (typed parser w/ round-trip validation)

PyPI / npm / crates.io were NOT searched directly, on the assumption that every relevant package surfaces through its GitHub repo. Skip unless something specific looks missing.

6. One-Line Takeaway

Do not write a new indexer. The combination of agentsview (cross-machine via pg push) + mnemo (deep MCP search on a unified rsync mirror) + a skill on top of those two for drift detection (cct + lee-fuhr pattern) covers ~80% of the problem with already-shipped code. The only genuinely new code is the variant-identity join and the drift-pass logic.