

01 — GitHub Recon: Claude Code Transcript Indexing/Search Prior Art


Date: 2026-05-11
Researcher: Claude Opus 4.7 (Mac)
Scope: Existing OSS tools that index, search, sync, or analyze Claude Code ~/.claude/projects/*.jsonl transcripts. Adjacent space: ChatGPT/Anthropic SDK conversation exports.

Problem we are buying or building for:

  1. Index Claude Code transcripts from 5 machines (Mac, PC, iMac, Cheesegrater, Clarvis) into a single queryable store.
  2. Search by content, variant, date, tool, cwd.
  3. Run automated drift detection and organization passes (e.g. detect persona/role drift across sessions, surface recurring failure patterns, group sessions by topic/client).

1. Top Candidates — Adopt / Fork / Steal-From Verdicts


A. marcelocantos/mnemo: strongest fit for the search half

  • URL: https://github.com/marcelocantos/mnemo
  • Stars: 0 (new, but Apache-2.0, very active — last commit 2026-05-11)
  • Stack: Go binary, SQLite FTS5, embedded web dashboard on port 19419, MCP HTTP server, filesystem watcher
  • One-line: persistent MCP server that watches ~/.claude/projects/, indexes transcripts + git commits + GitHub PRs + project docs + skills + CLAUDE.md, exposes 30+ MCP tools for searching past sessions, plus a built-in dashboard with token analytics and live session monitoring.
  • Verdict: Adopt or steal the schema. Closer to a feature-complete answer than anything else surfaced. Already does session-chain detection across /clear boundaries, context-compaction summaries, decision extraction, image OCR, and image embeddings. The “0 stars” is misleading — it is feature-rich and the README reads like a finished product. The gotcha: it is single-machine. You would still need fleet-aware sync (rsync transcripts to one host and point mnemo at that root, or run one mnemo per machine and query each). No support for “drift detection” out of the box but the SQL surface is exposed.
  • Decision: Stand it up on Cheesegrater pointed at an rsync’d union of all five machines’ ~/.claude/projects/. Use it as the read-side. Build drift detection on top of its SQL surface.
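
A minimal sketch of the "rsync'd union" layout, assuming one subdirectory per source host under a single mirror root. The host aliases and the mirror path are hypothetical, and whether mnemo accepts a root laid out this way has not been verified here. The code only builds the rsync command lines; it does not run them.

```python
from pathlib import PurePosixPath

# Hypothetical SSH host aliases for the five machines named in this document.
HOSTS = ["mac", "pc", "imac", "cheesegrater", "clarvis"]

# Hypothetical mirror root on Cheesegrater. One subdirectory per source host,
# so identically named project folders from different machines never collide.
MIRROR_ROOT = PurePosixPath("/srv/claude-mirror")


def rsync_command(host: str, mirror_root: PurePosixPath = MIRROR_ROOT) -> list[str]:
    """Build (not run) an rsync pull of one host's transcript directory."""
    dest = mirror_root / host / "projects"
    return [
        "rsync",
        "--archive",   # preserve mtimes so incremental indexers see stable files
        "--compress",
        "--partial",   # keep partially transferred files so large JSONLs can resume
        f"{host}:.claude/projects/",  # remote path relative to the login home dir
        f"{dest}/",
    ]


def plan(hosts: list[str] = HOSTS) -> list[list[str]]:
    """One rsync command per host, in the order given."""
    return [rsync_command(h) for h in hosts]


if __name__ == "__main__":
    for cmd in plan():
        print(" ".join(cmd))
```

The sketch deliberately omits `--delete`, so a transcript removed on a source machine stays in the mirror. Keeping the host name in the path also preserves the `machine` dimension that the variant-identity join needs later.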

B. wesm/agentsview: strongest fit for the analytics half

  • URL: https://github.com/wesm/agentsview
  • Stars: 928, forks 123, MIT, last commit 2026-05-11. Author is Wes McKinney (pandas creator) — high signal.
  • Stack: Go binary, SQLite, optional Postgres push for team dashboards, web UI on 127.0.0.1:8080, SSE live updates, Docker image
  • One-line: local-first session intelligence and cost analytics for 24 different coding agents (Claude Code, Codex, Copilot CLI, Cursor, Gemini CLI, Cline, Aider, OpenCode, Warp, etc.), with full-text search, activity heatmaps, archetype classification, cache-aware cost calc, and HTML/Gist export.
  • Verdict: Adopt for analytics and the cross-agent unified schema. Has a pg push command for team dashboards — exactly the multi-machine pattern we need. Schema is versioned (v1) and JSON-output stable, so downstream tools (our drift pass) can consume it. Active commercial-grade development. If we run agentsview pg push from each of the 5 machines into one Postgres on Cheesegrater, we get a unified read model with zero code on our side.
  • Decision: Run as the canonical analytics/usage layer on Cheesegrater. The pg push flow is the multi-machine answer.

C. jhlee0409/claude-code-history-viewer: best human-facing viewer

  • URL: https://github.com/jhlee0409/claude-code-history-viewer
  • Stars: 1,236, forks 124, MIT, last commit 2026-05-11
  • Stack: Tauri (Rust + React) desktop app + optional cchv-server headless mode (browser, Docker, systemd), supports 9 assistants
  • One-line: polished desktop/web app for browsing Claude Code + 8 other AI agent conversation histories.
  • Verdict: Use as-is for human browsing, do not extend. It is a viewer, not a query/sync platform. The headless cchv-server mode is useful — you could run it once on Cheesegrater and access via Tailscale from any machine. No drift detection, no programmatic API beyond what the web UI exposes. Best supplement to mnemo+agentsview for human review.
  • Decision: Optional polish layer. Install cchv-server on Cheesegrater for visual review of any single session. Not part of the indexing pipeline.

D. Alfredvc/cct

  • URL: https://github.com/Alfredvc/cct
  • Stars: 3, forks 1, MIT/Apache, last commit 2026-05-08
  • Stack: Rust binary, DuckDB, embedded React viewer at localhost:8766, ships with agent skills for Claude
  • One-line: ingests transcripts to DuckDB and ships skills that teach Claude how to query the DB in plain English (e.g. “what did I spend on Opus last week” → SQL → answer). Includes optimization playbook skill.
  • Verdict: Steal the skill pattern. The skill-as-investigation-playbook idea is the cleanest path to “drift detection” — write a drift-detection skill that runs SQL against either mnemo or agentsview’s DB. The crate-published parser (claude-code-transcripts on crates.io) is a strongly-typed Rust parser with round-trip validation for catching JSONL schema drift; that is worth borrowing even if we use a different DB.
  • Decision: Don’t run cct itself; copy the skill pattern and the typed-parser approach.
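
To make the "typed parser with round-trip validation" idea concrete, here is a minimal Python sketch of the technique (not the crate's actual implementation). It parses a transcript line into a typed record, serializes it back, and reports every top-level key that did not survive. The field list is an assumed subset of the Claude Code JSONL schema; anything outside it, or any value of an unexpected type, is flagged as schema drift.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

# Assumed subset of top-level keys; the real schema is larger and changes
# between Claude Code releases, which is exactly what the check is for.
STRING_FIELDS = ("type", "uuid", "parentUuid", "sessionId", "timestamp", "cwd")


@dataclass
class Entry:
    type: Optional[str] = None
    uuid: Optional[str] = None
    parentUuid: Optional[str] = None
    sessionId: Optional[str] = None
    timestamp: Optional[str] = None
    cwd: Optional[str] = None
    message: Optional[dict] = None


def parse(raw: dict) -> Entry:
    """Map a decoded JSON object onto the typed record, dropping unknown keys."""
    kwargs = {}
    for name in STRING_FIELDS:
        if raw.get(name) is not None:
            kwargs[name] = str(raw[name])  # a non-string value will not round-trip
    if isinstance(raw.get("message"), dict):
        kwargs["message"] = raw["message"]
    return Entry(**kwargs)


def serialize(entry: Entry) -> dict:
    """Turn the typed record back into a plain dict, omitting unset fields."""
    return {k: v for k, v in asdict(entry).items() if v is not None}


def drift_keys(line: str) -> list[str]:
    """Return the top-level keys of one JSONL line that fail the round trip."""
    raw = json.loads(line)
    back = serialize(parse(raw))
    # A missing key and an explicit null are treated as equivalent.
    return sorted(k for k, v in raw.items() if back.get(k) != v)
```

Running `drift_keys` over every line of a transcript and counting the flagged keys gives a cheap early warning that a new Claude Code release changed the format before the indexer silently drops data.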

E. lee-fuhr/claude-session-index: closest to “drift detection on top of FTS”

  • URL: https://github.com/lee-fuhr/claude-session-index
  • Stars: 20, forks 1, MIT, last commit 2026-04-30
  • Stack: Python, SQLite FTS5, Claude Code skill, Anthropic Haiku for in-session synthesis (no extra API cost)
  • One-line: index + skill that lets you ask Claude “what have I tried for X, what worked, what failed” and get a synthesized answer with claude --resume jump links.
  • Verdict: Closest semantic match to the “automated drift detection” use case. The cross-session synthesis via an in-session Haiku subagent is exactly the pattern for drift detection (compare persona behavior across N sessions, flag deviations). Single-machine. Python deps. License is clean.
  • Decision: Strong steal candidate for the drift-detection module specifically. Even if we use mnemo/agentsview for storage, lee-fuhr’s synthesis recipe is the pattern.
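
The "FTS5 hits, then synthesis" recipe can be sketched as follows. This is not lee-fuhr's code: the table layout, the function names, and the prompt wording are all hypothetical, and the sketch assumes the local SQLite build has FTS5 enabled. The model call itself is left out; the sketch stops at the prompt that would be handed to the in-session subagent.

```python
import sqlite3


def build_index(rows):
    """rows: iterable of (session_id, role, content) tuples."""
    db = sqlite3.connect(":memory:")
    # Only `content` is searchable; the other columns are carried along.
    db.execute(
        "CREATE VIRTUAL TABLE messages "
        "USING fts5(session_id UNINDEXED, role UNINDEXED, content)"
    )
    db.executemany("INSERT INTO messages VALUES (?, ?, ?)", rows)
    return db


def search(db, query, limit=5):
    """Return (session_id, snippet) pairs, best match first."""
    sql = (
        "SELECT session_id, snippet(messages, 2, '[', ']', '...', 12) "
        "FROM messages WHERE messages MATCH ? ORDER BY rank LIMIT ?"
    )
    return db.execute(sql, (query, limit)).fetchall()


def synthesis_prompt(topic, hits):
    """Assemble the text a synthesis subagent would receive (hypothetical wording)."""
    lines = [f"Summarize what was tried for '{topic}', what worked, what failed."]
    for session_id, snippet in hits:
        lines.append(f"- {snippet} (resume: claude --resume {session_id})")
    return "\n".join(lines)


if __name__ == "__main__":
    db = build_index([
        ("s1", "assistant", "Tried rsync over tailscale; it failed on permissions"),
        ("s2", "user", "deploy the worker"),
        ("s3", "assistant", "rsync worked after fixing ssh keys"),
    ])
    print(synthesis_prompt("rsync", search(db, "rsync")))
```

For drift detection the same shape applies with a different question: retrieve hits per variant, then ask the subagent to compare them against the variant's expected behavior. User-supplied queries containing punctuation would need FTS5 quoting before being passed to `MATCH`.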

2.1 Indexers / Search / Memory (the core ask)

| Repo | Stars | License | Last Commit | Maturity | Fit | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| marcelocantos/mnemo | 0 | Apache-2.0 | 2026-05-11 | Production-shaped, new release | Full fit | MCP + dashboard + FTS5 + decisions extraction + image OCR. See top candidates. |
| wesm/agentsview | 928 | MIT | 2026-05-11 | Production | Full fit (analytics) | 24-agent support, PG sync, web UI. See top candidates. |
| lee-fuhr/claude-session-index | 20 | MIT | 2026-04-30 | Production | Partial (synthesis) | Python, SQLite FTS5, cross-session synthesis via Haiku. |
| lokkju/claude-session-index | 0 | Other | 2026-03-30 | Production-shaped | Partial (indexer + MCP) | Python, SQLite FTS5, MCP stdio, plugin with auto-index hooks, subagent parent linking. Different schema from lee-fuhr’s. |
| Jeremyx000/claude-session-index | 1 | MIT | 2026-05-11 | Demo/fork | No fit | Probably a fork of lee-fuhr’s. |
| Alfredvc/cct | 3 | MIT/Apache | 2026-05-08 | Production | Partial (DuckDB + skills) | See top candidates. |
| spences10/ccrecall | 13 | MIT | 2026-05-11 | Production | Partial (clean schema) | TS/Vite, SQLite, incremental sync, team/swarm tables, exposed raw SQL. |
| ahmedelgabri/ccpeek | 29 | MIT | 2026-04-28 | Production | Partial (broad index) | Go + web UI, indexes conversations + plans + todos + shell snapshots + file history + paste cache + memories + commands. Wider scope than just transcripts. |
| kateleext/deja | 12 | MIT | 2026-04-09 | Production | Partial (episodic memory) | Skill-based, todos-become-episodes structure, ranks by signal strength + recency. Different mental model. |
| Kaidorespy/memory | 0 | MIT | 2026-04-23 | Demo | No fit | One-off “extract memories” tool. |
| yudppp/claude-code-history-mcp | 10 | | 2026-03-01 | Demo | Partial | MCP server for searching CC history. Less feature-rich than mnemo. |
| mblode/claude-code-search | 5 | MIT | 2026-02-14 | Demo | No fit | Small CLI search. Superseded by lee-fuhr/lokkju. |
| randlee/claude-history | 3 | MIT | 2026-05-11 | In-progress | No fit | CLI only. |
| S2thend/claude-code-history | 2 | | 2026-04-04 | Demo | No fit | CLI + TS lib for browse/search/export. |
| sthadka/claude-code-history | 0 | | 2026-02-16 | Demo | Partial (backup pattern) | Python backup tool with structured exports and manifest-based incremental updates. The manifest pattern is worth borrowing for multi-machine sync. |
| Ayjc/claude-code-history | 0 | | 2026-01-28 | Demo | No fit | Fuzzy autocomplete only. |
| john-parsneau/transcript-saver-mcp | 0 | Apache-2.0 | 2026-01-05 | Demo | No fit | Saves running transcripts to markdown. |

| Repo | Stars | License | Last Commit | Maturity | Fit | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| simonw/claude-code-transcripts | 1,507 | Apache-2.0 | 2026-05-11 | Production | No fit (renderer only) | Simon Willison’s HTML transcript publisher. Useful for sharing specific sessions, not for indexing. Web-session API features currently broken (issue #77). |
| daaain/claude-code-log | 1,014 | MIT | 2026-05-11 | Production | Partial (TUI + filters) | Python CLI with TUI for browsing, HTML/MD generators, date filtering, token tracking. Closest “polished CLI viewer.” Worth using as a daily-driver inspection tool. |
| jhlee0409/claude-code-history-viewer | 1,236 | MIT | 2026-05-11 | Production | Partial (viewer) | See top candidates. |
| kylesnowschwartz/tail-claude | 136 | MIT | 2026-05-06 | Production | No fit | Bubble Tea Go TUI, live tailing of a single session. |
| vtemian/claude-notes | 77 | MIT | 2026-04-24 | Production | No fit | Terminal + HTML renderer. |
| philipp-spiess/claude-code-viewer | 20 | | 2026-05-10 | In-progress | No fit | Web upload viewer. |
| yashagldit/Claude-Code-History-VSCode | 29 | Other | 2026-05-11 | Production | No fit | VSCode extension. |
| yanicklandry/claude-code-history-viewer | 7 | | 2026-04-02 | Demo | No fit | Smaller alt viewer. |
| huangyunbin/claude-code-history-viewer | 10 | | 2026-04-22 | Demo | No fit | Webpage viewer. |
| ClickHouse/alexeyprompts | 11 | Other | 2026-04-19 | Demo | Partial (CH backend) | Official ClickHouse-team viewer using ClickHouse columnar store. Interesting for scale; overkill for 5 machines. |
| frederikb96/orbit | 0 | MIT | 2026-03-02 | Demo | No fit | Real-time streaming viewer. |
| caioaao/claude-code-session-visualizer | 1 | | 2026-02-14 | Demo | No fit | Quick-and-dirty web UI. |
| xlvlxn/claude-session-viewer | 0 | | 2026-04-19 | Demo | No fit | Drop-file GitHub Pages viewer. |
| tlskins/claude-log-viewer | 0 | | 2026-01-13 | Demo | No fit | View + share. |
| offpolicy/cc-transcripts | 0 | | 2026-04-29 | Demo | No fit | Conversation-format viewer. |
| limbooo/ClaudeCodeHistoryViewer | 1 | MIT | 2025-12-08 | Demo | No fit | Single-file web app. |
| ianhandy/claude-session-dashboard | 0 | | 2026-05-06 | Demo | No fit | Token burn/cost dashboard. |
| ianhandy/multi-agent-session-visualizer | 0 | | 2026-05-06 | Demo | Partial (multi-agent flow) | D3.js flow diagrams, multi-agent support. Interesting visualization layer for fleet sessions. |

2.3 Cost / Usage Analytics (ccusage family)

| Repo | Stars | License | Last Commit | Maturity | Fit | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| ryoppippi/ccusage | 14,042 | Other | 2026-05-11 | Production | Partial (cost only) | The canonical token/cost analyzer. JSONL-based. Use as the cost baseline; do not extend. |
| cobra91/better-ccusage | 69 | Other | 2026-05-10 | Production | No fit | ccusage fork with multi-provider support. |
| hydai/ccstat | 8 | MIT | 2026-02-21 | In-progress | No fit | Rust rewrite of ccusage. |
| SDpower/ccusage_go | 14 | MIT | 2026-05-09 | In-progress | No fit | Go rewrite of ccusage. |
| m6k/ccusage-py | 3 | | 2026-04-22 | Demo | No fit | Python port. |
| elct9620/ccmon | 4 | Apache-2.0 | 2026-03-19 | Demo | No fit | ccusage-inspired monitor. |
| ~20 menubar/statusbar/tmux/neovim/vscode wrappers around ccusage | 3–42 | mixed | varies | mixed | No fit | Surface-layer tooling, not indexers (e.g. cctray, ccowl, AgentLimits, vibepulse, ccusage.nvim). |

| Repo | Stars | License | Last Commit | Maturity | Fit | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| ymonster/cc_jsonl_fix | 0 | MIT | 2026-05-01 | Demo | Partial (repair) | Repairs broken parentUuid chains. Worth keeping as a recovery tool. |
| Lightcone-ZhangYifa/claude-replay-plugin | 0 | MIT | 2026-05-07 | Demo | Partial (byte-perfect replay) | Recovers lost git history from JSONLs, supports project resurrection and AI-agent behavior analysis. Interesting for the “drift” angle: analyzing what an agent actually did vs what it should have done. |
| achiii800/claude-snap | 1 | MIT | 2026-05-11 | In-progress | No fit | Portable snapshot codec for moving sessions between machines; could be useful as the wire format for multi-machine sync. |
| krzemienski/session-insight-miner | 0 | MIT | 2026-05-02 | Demo | No fit | Per-session cost/cache reports. |
| Joopsnijder/claude-code-transcripts | 0 | | 2026-04-22 | Demo | No fit | Daily AI summaries. |
| knivram/cc-session-visualizer | 1 | | 2026-03-17 | Demo | No fit | Bun CLI → HTML sequence diagrams. |
| mtschoen/claude-walker | 0 | | 2026-05-11 | In-progress | No fit | Multi-language pace-walker w/ shared conformance corpus. Maybe useful for parser cross-checks. |
| nagstler/pr-narrator | 2 | Apache-2.0 | 2026-05-02 | Demo | No fit | Transcript → PR description. |
| yanndebray/claude-code-scrubber | 0 | MIT | 2026-03-04 | Demo | No fit | PII scrub. |
| k33bs/ticktock | 2 | MIT | 2026-05-09 | Demo | No fit | Inline timestamps. |
| palimondo/xs | 0 | | 2026-03-05 | Demo | No fit | Transcript exploration tools. |
| shinyaohtani/claude-trace | 0 | | 2026-04-16 | Demo | No fit | JSONL → MD. |
| asdf8601/cc-gist-export | 0 | MIT | 2026-04-18 | Demo | No fit | Export to gist. |
| jonathanmejia4/claude-session-replay | 0 | MIT | 2026-04-15 | Demo | No fit | JSONL → transcript. |
| derekbreden/jsonl2md | 0 | | 2026-05-01 | Demo | No fit | JSONL → MD. |
| davesque/claude-md-transcripts | 0 | MIT | 2026-05-08 | Demo | No fit | JSONL → MD + qmd index. |
| tayiorbeii/claude-code-transcript-cleanup-skill | 1 | | 2025-12-01 | Demo | No fit | Cleanup skill for /export output. |
| prbe-ai/prbe-agent-tap | 0 | MIT | 2026-04-28 | Demo | No fit (SaaS) | Daemon that ships transcripts to a hosted SaaS (api.prbe.ai). Worth flagging as a competing pattern. |
| jordangarrison/panko | 0 | MIT | 2026-04-09 | Demo | No fit | View + share. |
| elischwerzler/scribe | 0 | | 2026-04-24 | Demo | No fit | Transcripts → searchable book. |
| ysamlan/agent-log-gif | 5 | Apache-2.0 | 2026-04-28 | Demo | No fit | Transcripts → animated GIFs. |

2.5 Memory / Long-Term-Memory MCPs (overlapping space)

| Repo | Stars | License | Last Commit | Maturity | Fit | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| srijanshukla18/claude-memory-viz | 99 | | 2026-04-22 | Production | No fit | Visualizer for the Claude memory MCP graph, not transcript-aware. |
| WhenMoon-afk/claude-memory-mcp | 67 | MIT | 2026-05-04 | Production | No fit | Generic MCP memory server, not transcript-indexed. |
| serkansmg/smg-claude-memory-mcp | 18 | | 2026-05-10 | Production | Partial (vector) | Per-project vector memory with rules enforcement and git-based team sharing. The git-sync pattern is relevant for fleet. |
| ~25 other claude-memory-mcp named repos | 0–5 | mixed | varies | mostly demo | No fit | Generic note/memory stores. None reach into transcript JSONL. |

3. Gap Analysis — What No Existing Tool Covers


Even if we adopt mnemo + agentsview + cct’s skill pattern + lee-fuhr’s synthesis recipe, the following are still on us to build:

  1. Multi-machine sync. Every tool surveyed is single-host. We need to either:

    • rsync ~/.claude/projects/ from all 5 machines into a canonical location on Cheesegrater on a schedule, then point one mnemo+agentsview at it, OR
    • run mnemo/agentsview on each host and use agentsview’s pg push to consolidate into a single Postgres on Cheesegrater.

     Of the two, the agentsview pg push path is the cleanest because it is already implemented and supported.
  2. Fleet-aware variant identity. None of the indexers know about Wes’s PersonaName/canonical_name layer (Pepper, Stark, Lens, Bilby, etc.). They key on cwd and session ID. We need a join table from (machine, cwd, session_id) → canonical_variant_name so drift detection can ask “is Stark drifting” rather than “is session 7b22… drifting.” Cheapest fix: maintain that map in D1 / vault and join at query time.

  3. Drift detection logic. Closest prior art: lee-fuhr’s “what have I tried for X” synthesis pass. No tool has explicit persona-drift, role-violation, or canon-divergence detection. This is genuinely novel and is our actual contribution. Pattern to steal: lee-fuhr’s in-session Haiku synthesizer over FTS5 hits.

  4. Organization passes. No tool re-tags or restructures sessions automatically after the fact (e.g. “all sessions touching Covenant Wildlife in the last 30 days → tag as client:covenant, surface uncommitted decisions”). We build this on top of whatever DB we land on.

  5. Cross-session decision diffing. mnemo extracts decisions, but nothing diffs decisions across sessions to flag contradictions (“session A: ‘we keep hines-mcp’, session B: ‘hines-mcp is dead’”). Build on top.

  6. Hook-aware ingestion. Most tools assume passive JSONL parsing. Wes already has SessionEnd / PreCompact hooks writing handoffs and deep-transcript-scan artifacts. Ingestion should pick those up as authoritative summary inputs, not just the raw JSONL. None of the surveyed tools do this.

  7. Live tailing with fleet-wide subscribe. mnemo has filesystem watching on one host; nothing watches across hosts. Cheapest answer: run mnemo on each host, broadcast new-session events through the existing claude-peers broker.
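
The variant-identity join from item 2 can be sketched in a few lines of SQL, shown here through Python's sqlite3 module. The table names and the lookup rule are assumptions, not an existing schema: a map row with a NULL session_id acts as the default for every session in that (machine, cwd), and a row with a concrete session_id overrides it. The sample session IDs and paths are made up.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sessions (machine TEXT, cwd TEXT, session_id TEXT);
-- session_id NULL means: default variant for every session in (machine, cwd).
CREATE TABLE variant_map (
    machine TEXT, cwd TEXT, session_id TEXT, canonical_variant_name TEXT
);
"""

RESOLVE = """
SELECT s.machine, s.session_id,
       COALESCE(ex.canonical_variant_name,
                dflt.canonical_variant_name,
                'unmapped') AS variant
FROM sessions AS s
LEFT JOIN variant_map AS ex
       ON ex.machine = s.machine AND ex.cwd = s.cwd
      AND ex.session_id = s.session_id
LEFT JOIN variant_map AS dflt
       ON dflt.machine = s.machine AND dflt.cwd = s.cwd
      AND dflt.session_id IS NULL
ORDER BY s.machine, s.session_id
"""


def resolve_variants(sessions, variant_rows):
    """Return (machine, session_id, variant) for every session."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.executemany("INSERT INTO sessions VALUES (?, ?, ?)", sessions)
    db.executemany("INSERT INTO variant_map VALUES (?, ?, ?, ?)", variant_rows)
    return db.execute(RESOLVE).fetchall()


if __name__ == "__main__":
    rows = resolve_variants(
        sessions=[
            ("mac", "/work/a", "7b22"),
            ("mac", "/work/a", "9c01"),
            ("pc", "/work/b", "aa10"),
        ],
        variant_rows=[
            ("mac", "/work/a", None, "Stark"),      # default for this cwd
            ("mac", "/work/a", "9c01", "Pepper"),   # per-session override
        ],
    )
    for row in rows:
        print(row)
```

Sessions that resolve to `unmapped` are worth surfacing on their own, since they are the ones drift detection cannot attribute to any variant. The sketch assumes at most one default row per (machine, cwd); duplicates would multiply result rows.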


4. Recommended Architecture (preview, not the deliverable)

Hooking the above together:

[5 machines]
  ~/.claude/projects/*.jsonl
    ├── (each machine) agentsview daemon → SQLite local
    │     └── nightly: `agentsview pg push` → Postgres on Cheesegrater
    ├── (Cheesegrater) one mnemo instance pointed at an rsync'd
    │     union of all machines' projects/ for deep search + decisions
    └── (any host) cchv-server for human browsing
        ↓
[Cheesegrater Postgres + mnemo SQLite]
        ↓
[Drift-detection skill] (cct-pattern + lee-fuhr-pattern)
  reads from PG + mnemo, joins to variant-identity map in D1/vault,
  runs comparative passes, emits findings to vault.

Pick this apart in the next research pass.


5. Search Trail (so the next pass doesn’t redo work)


GitHub CLI searches executed:

  • gh search repos "claude-code transcript" → 30 results, primary hit set
  • gh search repos "claude-code session jsonl" → 16 results, included claude-snap, ymonster/cc_jsonl_fix, claude-replay-plugin
  • gh search repos "ccusage" → 30 results, mostly menubar wrappers
  • gh search repos "claude-code history" → 30 results, all the cchv variants
  • gh search repos "claude-code search MCP" → 0 results
  • gh search repos "claude session index" → 3 results (lee-fuhr, Jeremyx, lokkju)
  • gh search repos "ccmonitor OR ccmanager OR cchistory OR cc-history" → 0 results
  • gh search repos "claude memory MCP" → 30 results (mostly generic memory MCPs, not transcript-aware)
  • gh search repos "claude-code multi-machine sync" → 0 results
  • gh search repos "claude-code fleet" → 12 results (orchestration tools, not indexers)
  • gh search repos "openai-conversation-export OR chatgpt-history-export" → 0 results
  • gh search repos "claude-code analytics drift" → 0 results

READMEs read in full:

  • marcelocantos/mnemo
  • wesm/agentsview
  • jhlee0409/claude-code-history-viewer
  • daaain/claude-code-log
  • Alfredvc/cct
  • lee-fuhr/claude-session-index
  • lokkju/claude-session-index
  • spences10/ccrecall
  • kateleext/deja
  • simonw/claude-code-transcripts
  • kylesnowschwartz/tail-claude
  • ahmedelgabri/ccpeek

NOT yet investigated but flagged for follow-up:

  • ClickHouse/alexeyprompts (if scale becomes an issue)
  • achiii800/claude-snap (if we need a portable session wire format)
  • Lightcone-ZhangYifa/claude-replay-plugin (drift/behavior-analysis angle)
  • sthadka/claude-code-history (manifest-based incremental sync pattern)
  • The claude-code-transcripts Rust crate on crates.io (typed parser w/ round-trip validation)

PyPI / npm / crates.io NOT searched directly — every relevant package surfaces through its GitHub repo. Skip unless something specific looks missing.


Do not write a new indexer. The combination of agentsview (cross-machine via pg push) + mnemo (deep MCP search on a unified rsync mirror) + a skill on top of those two for drift detection (cct + lee-fuhr pattern) covers ~80% of the problem with already-shipped code. The only genuinely new code is the variant-identity join and the drift-pass logic.
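
As a starting point for the drift-pass logic, here is one possible heuristic, offered as a sketch rather than a design. It compares each session's tool-usage distribution against the pooled baseline for the same variant and flags sessions whose total variation distance exceeds a threshold. Tool usage is only a cheap proxy for persona or role drift; the threshold of 0.5 and the input shape are assumptions, and a semantic pass (the synthesis pattern above) would still be needed to judge what a flagged session actually did.

```python
from collections import Counter


def distribution(tools):
    """Relative frequency of each tool name in one list of tool calls."""
    counts = Counter(tools)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (0 to 1)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def flag_drift(sessions, threshold=0.5):
    """sessions: {session_id: [tool names]} for ONE variant.

    Returns the sorted session ids whose tool mix is far from the pooled
    baseline. The baseline includes the session under test; a leave-one-out
    baseline would be stricter.
    """
    pooled = [t for tools in sessions.values() for t in tools]
    baseline = distribution(pooled)
    scores = {
        sid: total_variation(distribution(tools), baseline)
        for sid, tools in sessions.items()
    }
    return sorted(sid for sid, score in scores.items() if score > threshold)


if __name__ == "__main__":
    stark = {
        "a": ["Read", "Edit", "Read", "Edit"],
        "b": ["Read", "Edit", "Edit", "Read"],
        "c": ["Bash", "Bash", "Bash", "Bash"],
    }
    print(flag_drift(stark))
```

Fed with per-variant groups produced by the variant-identity join, this gives a first ranked list of sessions to hand to the synthesis step.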