

AI Coding Tool Conversation History — Comparative Research


Goal: figure out how mature AI coding tools store, search, and export their session/chat history so we can borrow good patterns (and avoid the dumb ones) when designing the Claude Code .jsonl indexer.

Date: 2026-05-11. Sources cited inline.


1. Cursor (via the SpecStory extension)

What it is: Cursor itself does not expose a clean on-disk chat history. The app stores chats inside ~/Library/Application Support/Cursor/ as opaque SQLite blobs (state.vscdb). The community workaround is the SpecStory VS Code/Cursor extension, which captures every conversation as markdown alongside the project.

  • Path: <repo>/.specstory/history/<timestamp>-<slug>.md (one markdown file per conversation, written into the project root next to .git).
  • Format: Plain markdown. Preserves the full conversation including code blocks and diffs. Auto-saved by default; manual “Save AI Chat History” command from the palette can cherry-pick and combine.
  • Built-in search UI: No structured UI — you grep the directory or rely on your editor’s file search. The point is “your IDE already searches markdown, use that.”
  • Format documented: Loosely. Docs describe the location and “preserves conversation including code blocks and diffs” but never publish a schema — intentionally, because it’s “just markdown.”
  • Per-event schema: Headings demarcate user/assistant turns; no timestamps, model names, or tool-call structure exposed in the markdown itself. Tool calls render as fenced code blocks. Token counts not stored.
  • Third-party tooling: The extension itself is the third-party tool for Cursor. The output then plugs into the rest of your stack (Obsidian, grep, Datasette, etc.) because it’s markdown.
  • Clever: Local-first, version-controllable (committed into the repo), zero lock-in, zero schema. Becomes part of the project’s institutional memory. Auto-save with no friction.
  • Bad: No timestamps inside the file, no tool-call metadata, no token counts, no branching. You cannot reconstruct what the agent actually did — only what was shown in the chat panel. Worthless for analytics.



2. aider

What it is: Terminal coding assistant. Logs chats to flat markdown by default; has an optional SQLite/Datasette mode.

  • Path:
    • <cwd>/.aider.chat.history.md — the long-running markdown log
    • <cwd>/.aider.input.history — user prompt history with ISO timestamps
    • <cwd>/.aider/ — caches and analytics JSON
    • $AIDER_CHAT_HISTORY_FILE env var overrides location.
  • Format: Markdown for chat. Each session starts with a banner # aider chat started at <ISO timestamp>. User turns are lines prefixed #### (rendered as h4 headings), assistant turns are plain markdown, and aider’s own console output is blockquoted with >. Token/cost is logged inline: > Tokens: 2.3k sent, 13 received. Cost: $0.00024 message.
  • Input history format: Plain text; each entry is a #-prefixed timestamp line followed by +-prefixed prompt lines. Looks like:
    # 2026-01-11 09:30:57.492882
    +y
  • Built-in search UI: No. There’s a /datasette slash command that boots Datasette against the SQLite history if you opt in.
  • Format documented: Markdown format is conventional, not specified. SQLite/Datasette integration came from PR #1860 and is loosely documented; the schema lives in the source, not in docs.
  • Per-event schema (markdown): No structured event metadata — timestamps only at session-start, token/cost in console lines. Tool calls are not a concept in aider; everything is search/replace blocks in the markdown.
  • Third-party tooling: Aider’s own /datasette is the main offering. Otherwise grep/awk.
  • Clever: Two files split intent (input.history is just what the human typed, timestamped, perfect for an editor history scrollback). Easy to skim by eye. The cost line is gold for usage analytics (see the parsing sketch after this list).
  • Bad: Per-cwd siloing means a history file per project; nothing global. No structured tool-call log. Markdown is lossy — round-tripping back into an LLM context is painful. The SQLite mode is opt-in, undocumented, and rarely used.
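
A hedged sketch of mining those cost lines, using the sample line above as the only ground truth; the regex is ours and may need loosening against real logs:

```python
# Pull token/cost lines out of .aider.chat.history.md.
import re

COST_RE = re.compile(
    r"^> Tokens: ([\d.,]+k?) sent, ([\d.,]+k?) received\. "
    r"Cost: \$([\d.]+) message",
    re.M,
)

def usage_lines(markdown: str):
    for sent, received, cost in COST_RE.findall(markdown):
        yield {"sent": sent, "received": received, "cost_usd": float(cost)}

sample = "> Tokens: 2.3k sent, 13 received. Cost: $0.00024 message"
print(list(usage_lines(sample)))
# [{'sent': '2.3k', 'received': '13', 'cost_usd': 0.00024}]
```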


Sample observed locally at ~/.aider.chat.history.md and ~/.aider.input.history (Jan 2026).


3. Cline (saoudrizwan.claude-dev, VS Code)


What it is: Heavyweight Claude-in-VS-Code extension. Most disciplined on-disk schema of any tool reviewed.

  • Path (macOS): ~/Library/Application Support/Code/User/globalStorage/saoudrizwan.claude-dev/
    • state/taskHistory.json — flat index of all tasks (id, title, ts).
    • tasks/<task-id>/ — one folder per task with three files:
      • api_conversation_history.json — full Anthropic Messages API payload (the array of {role, content[]} exactly as sent to Claude).
      • ui_messages.json — what was rendered in the panel (richer, includes api_req_started events with token counts streamed as updates).
      • task_metadata.json — title, created/updated timestamps, model.
    • checkpoints/ — workspace-scoped git-style checkpoints of file state.
  • Format: JSON files, one per concern. Anthropic-native message format for the actual transcript.
  • Built-in search UI: Task list inside the extension panel, no full-text search across tasks.
  • Format documented: Reverse-engineered, not formally published. DeepWiki pages and a docs.cline.bot “task history recovery” page describe the layout enough to back up and restore.
  • Per-event schema: Mixed model.
    • api_conversation_history.json is the canonical Anthropic Messages schema: role: user|assistant, content: [{type, text|tool_use|tool_result|...}].
    • ui_messages.json adds Cline-specific event types: api_req_started with cost/token counts that update as the stream returns, conversationHistoryDeletedRange tuples tracking truncation windows.
  • Third-party tooling: None broadly known; community mostly hand-rolls scripts to scrape the task folders.
  • Clever: Splitting “what was sent to the API” from “what was shown in UI” is the right architecture — one is a perfect replay of LLM context, the other is rich for analytics/visualisation. conversationHistoryDeletedRange is a smart way to preserve the original transcript while still tracking what got pruned. Per-task folders trivialize delete/export.
  • Bad: Three files per task and a per-task folder explodes the filesystem (thousands of files for power users); no SQLite index means search is “load every JSON and grep.” Cline issue #3784 documents the performance death-spiral from cramming all history into VS Code’s globalState.
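
To illustrate the N+1 cost, a sketch of a cross-task grep under this layout (macOS path as above; the content-block handling is our assumption about the Messages format):

```python
# Every cross-task query opens and parses every task folder.
import json
from pathlib import Path

ROOT = (Path.home() / "Library/Application Support/Code/User"
        / "globalStorage/saoudrizwan.claude-dev")

def grep_tasks(needle: str):
    for task_dir in (ROOT / "tasks").iterdir():   # N+1: one read per task
        api_file = task_dir / "api_conversation_history.json"
        if not api_file.exists():
            continue
        for msg in json.loads(api_file.read_text()):
            content = msg.get("content")
            blocks = (content if isinstance(content, list)
                      else [{"type": "text", "text": content or ""}])
            for block in blocks:
                if needle in (block.get("text") or ""):
                    yield task_dir.name, msg.get("role")
```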



4. Continue

What it is: Open-source IDE extension (VS Code + JetBrains). Treats session history as opt-in “development data” telemetry — that framing matters because the data is more analytics-oriented than transcript-oriented.

  • Path: ~/.continue/dev_data/ for the development-data event stream. Optional override via the data section of config.yaml (local dir or remote HTTP endpoint).
  • Format: JSONL files of typed event objects. Schemas are versioned and live under packages/config-yaml/src/schemas/data/ in the continuedev/continue repo.
  • Session transcripts: Separate from dev_data. The /share slash command exports the current chat to markdown at ~/.continue/session-transcripts/ (configurable). Continue does not expose a full history browser by default.
  • Built-in search UI: Recent chats appear in the sidebar dropdown. No full-text search.
  • Format documented: Yes for dev_data (versioned JSON schemas in the repo). Sparsely for session-transcripts markdown.
  • Per-event schema: dev_data has typed events for completions, acceptances, tokens, edits, model usage — explicitly designed for org-level analytics. Transcripts are flat markdown like SpecStory.
  • Third-party tooling: None notable; the project’s stance is “pipe dev_data to your own HTTP endpoint and build whatever you want.”
  • Clever: Two-tier model — event log for analytics (JSONL, versioned schemas, pipeable) and transcript for humans (markdown). Versioned event schemas mean external consumers don’t break across updates.
  • Bad: The two tiers don’t fully reconnect — you can’t easily go from a dev_data event back to the markdown transcript where it happened. Session-transcripts is manual (/share), not auto.



5. Zed

What it is: Native macOS/Linux editor’s built-in AI panel. History storage migrated from per-conversation JSON to a binary LMDB-style database over the past year.

  • Path (current, “Agent threads”):
    • Standard macOS/Linux: ~/.local/share/zed/threads/
    • Flatpak: ~/.var/app/dev.zed.Zed/data/zed/threads/threads-db.1.mdb/
  • Path (legacy “Assistant Context”): ~/.config/zed/conversations/*.json — one JSON per saved conversation.
  • Format: Currently an LMDB-style binary store (threads-db.1.mdb). Legacy assistant-context was JSON. The “Open as Markdown” button on a thread is the official export path.
  • Built-in search UI: Threads pane in the agent panel lists threads; search is limited to title/recency. Zed issue #41240 documents threads leaking across workspaces — the scoping story is still being worked out.
  • Format documented: No. Binary store is essentially undocumented; community had to dig to find the path (discussion #32335).
  • Per-event schema: Not externally documented. The markdown export shows role-tagged turns and inline tool calls/edits as fenced blocks.
  • Third-party tooling: None of note. The opaque binary store actively discourages it.
  • Clever: Native binary store should scale better than JSON-per-file.
  • Bad: Going from JSON-per-conversation to an opaque binary blob is a regression for power users — no inspectable schema, no grep, third-party tooling collapses, and “Open as Markdown” is per-thread and manual. The cross-workspace bleed bug suggests the storage rebuild was rushed.



6. OpenAI Codex CLI (the new TypeScript/Rust one, not the deprecated v1)


What it is: OpenAI’s official agentic CLI. Best-shaped storage of the non-Claude bunch.

  • Path: $CODEX_HOME/sessions/YYYY/MM/DD/rollout-<id>.jsonl (default CODEX_HOME=~/.codex). Each session is one JSONL file under a date-partitioned directory. There is also a ~/.codex/history.jsonl for prompt history with history.max_bytes cap.
  • Format: JSON Lines. Each line is an “event” — ThreadStarted, TurnStarted, TurnCompleted (with Usage tokens), TurnFailed (with ThreadError), ItemStarted/ItemUpdated/ItemCompleted (wrapping a ThreadItem of type agent message, reasoning, command exec, file change, MCP tool call, web search, or plan update). The same stream is emitted to stdout when --json is passed to codex exec (see the consumer sketch after this list).
  • Built-in search UI: codex resume opens a picker of recent sessions; codex resume --last jumps to most recent; codex resume <SESSION_ID> by ID. No full-text search.
  • Format documented: Partially. Non-interactive docs spell out the event types and shape; the desktop App Server doc also references it. Schema not formally versioned in docs the way Continue’s is.
  • Per-event schema (rough): {type, thread_id, ts, item?: ThreadItem, usage?, error?}. ThreadItem types cover messages, reasoning, exec, file_change, mcp_call, web_search, plan_update.
  • Third-party tooling:
    • cass / coding_agent_session_search (https://github.com/Dicklesworthstone/coding_agent_session_search) — multi-provider TUI/CLI indexer covering Codex, Claude Code, Cline, Cursor, Aider, Copilot, and ~13 others.
    • GitHub issue #20864 — Codex’s own desktop app gets laggy because it scans every rollout file rather than maintaining its own index. That is the bug we are designing around.
  • Clever: Date-partitioned directories (YYYY/MM/DD) — trivial to ls and scope to a time range without touching content. Event-stream model (not message-stream) captures reasoning, plan updates, exec, MCP calls as first-class items. --json on codex exec emits the same event stream to stdout, so log-shipping is free.
  • Bad: No internal index, so the desktop app scans every rollout — same N+1 problem Claude Code has. JSONL files can grow large for long sessions (no chunking). Schema is event-typed but not formally versioned.
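
A minimal consumer sketch, assuming the rough event shape above; the exact wire spellings of the type tags (TurnCompleted vs turn.completed) are an assumption to verify against the docs:

```python
# Consume the codex exec --json event stream from stdout.
import json
import subprocess

proc = subprocess.Popen(
    ["codex", "exec", "--json", "summarize recent changes"],
    stdout=subprocess.PIPE,
    text=True,
)

for line in proc.stdout:
    if not line.strip():
        continue
    event = json.loads(line)
    etype = event.get("type")
    if etype == "TurnCompleted":            # spelling per the doc's event names
        print("usage:", event.get("usage"))  # per-turn token counts
    elif etype == "ItemCompleted":
        item = event.get("item", {})
        print("item:", item.get("type"))     # exec, file_change, mcp_call, ...
```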



7. GitHub Copilot Chat

What it is: Microsoft/GitHub’s chat panel. Storage is workspace-scoped and obfuscated by VS Code’s storage layer.

  • Path: <vscode>/User/workspaceStorage/<hash>/chatSessions/*.json — one JSON per chat session, under a workspace-hash folder. There is no cross-workspace global history. macOS path is ~/Library/Application Support/Code/User/workspaceStorage/.
  • Format: JSON, structure not officially documented. Each session captures prompts, responses, and code-context references.
  • Built-in search UI: Recent chats list in the Copilot panel. No full-text search across history. Chat: Export Chat... palette command exports a single session to JSON.
  • Format documented: No. Community reverse-engineered via the discussion threads below.
  • Per-event schema: Roughly {requestId, message{role,content}, references[]} — model name, timestamp, and context references included.
  • Third-party tooling:
    • Copilot Chat History extension (arbuzov.copilot-chat-history) — UI for browsing.
    • GitHub Copilot Chat Exporter (fengzehan.vscode-copilot-exporter) — bulk export.
  • Clever: Workspace-scoped storage prevents cross-project context leak (the bug Zed has). Native Chat: Export Chat... palette command is a nice affordance.
  • Bad: Workspace hash makes finding your own data hostile; no global history; community has built two separate extensions just to make the data accessible, which tells you everything.



8. Sourcegraph Cody

What it is: Sourcegraph’s IDE assistant. Was the first to ship a built-in “Export chat history as JSON” button.

  • Path: Stored inside the IDE extension’s workspace state (VS Code: workspaceStorage, JetBrains: IDE-specific). No documented filesystem path; the supported access pattern is the in-app History panel.
  • Format: Internal storage opaque. Exposed as JSON via the explicit “Export” button per-chat or in bulk.
  • Built-in search UI: History panel lists chats; no full-text search.
  • Format documented: Export-side documented loosely as “the full conversation between Cody and the LLM”; no schema spec.
  • Per-event schema: Conversation array of role-tagged messages, contextual code references attached per message.
  • Third-party tooling: None that I could find — Cody’s own export button is enough for what most users want.
  • Clever: Shipped export before anyone else; treats history as the user’s data and gives them a one-click way out. The web/browser variant warns about local-storage quota — they actually thought about that corner case.
  • Bad: No filesystem access without exporting; export is per-chat or bulk-zip and not incremental. No documented schema means downstream tools have to re-derive it for every export.



9. Roo Code (RooVeterinaryInc.roo-cline — Cline fork)


What it is: Cline fork that addressed the globalState blowup with a proper file-based store.

  • Path: Per-task JSON files under VS Code’s globalStorageUri.fsPath for the extension (i.e. same globalStorage neighbourhood as Cline). A legacy mirror lives in globalState["taskHistory"] for downgrade compatibility.
  • Format: JSON. TaskHistoryStore writes one file per task on first modification. initializeTaskHistoryStore() runs once to migrate from the legacy globalState array, then sets a "taskHistoryMigratedToFiles": true flag.
  • Built-in search UI: Task list in the panel; no full-text search. Settings UI ships import/export for global state.
  • Format documented: Architecture documented on DeepWiki; the migration logic and store class names are openly referenced in their issue threads.
  • Per-event schema: Same Cline lineage — Anthropic Messages format for the API history, separate UI messages with token/cost annotations.
  • Third-party tooling: Inherits from Cline ecosystem; no Roo-specific indexers seen.
  • Clever: Explicit migration with a one-shot flag (vs blind try/catch). Two-tier store (file primary, globalState mirror) means a downgrade doesn’t lose history. The fact that this fork exists primarily to fix the perf issue is itself a lesson: never put your whole history into a framework-managed key-value store with no on-disk affordance.
  • Bad: Same per-task folder explosion problem Cline has. Mirror to globalState is wasted I/O once everyone is on the new version.



10. Existing multi-provider indexers (the gift)


There are at least two projects worth treating as competitive prior art — not for stealing UI, but for stealing schema design and provider connectors.

  • cass / coding_agent_session_search (https://github.com/Dicklesworthstone/coding_agent_session_search):
    • Indexes 19+ providers into a unified Conversation → Message → Snippet SQLite schema.
    • SQLite as source of truth. Derived assets (lexical index, semantic vectors) are rebuildable from SQLite.
    • Two-layer index: Tantivy BM25 with edge n-grams for lexical (sub-60ms, search-as-you-type), optional ONNX MiniLM/Arctic/Nomic embeddings in .fsvi vector files for semantic, combined via Reciprocal Rank Fusion when both are warm. Graceful degrade to lexical-only when embeddings aren’t ready.
    • Connector pattern: each provider has a thin extractor that maps its native format (JSONL / JSON / SQLite / markdown) to the canonical schema.

Claude-Code-specific indexers (the direct competition)


Field summary cited in https://dev.to/gonewx/i-tested-4-tools-for-browsing-claude-code-session-history-17ie.


Claude Code’s own JSONL schema (for reference)


So the comparison has a baseline. Per https://databunny.medium.com/inside-claude-code-the-session-file-format-and-how-to-inspect-it-b9998e66d56b:

{
  "type": "assistant",
  "uuid": "3fa85f64-...",
  "parentUuid": "1a2b3c4d-...",
  "timestamp": "2025-02-20T09:14:32.441Z",
  "sessionId": "abc123",
  "cwd": "/home/user/myapp",
  "message": { "role": "assistant", "content": [ ... ], "usage": { ... } }
}

Key shape facts:

  • type ∈ {user, assistant, tool_result, system, summary, result, file-history-snapshot}
  • parentUuid makes it a DAG, not a list — branches/sidechains exist.
  • message.content[] blocks: text, tool_use {id,name,input}, thinking.
  • tool_result records carry toolUseResult{tool_use_id, content, is_error}.
  • assistant.message.usage carries input/output tokens and cache metrics.
  • Files live at ~/.claude/projects/<cwd-with-dashes>/<session-uuid>.jsonl.
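
To keep the DAG point concrete, here is a minimal sketch (using only the fields shown above) that loads one session file and finds its branch points:

```python
# Load one Claude Code session file and expose the parentUuid DAG
# instead of flattening it.
import json
from collections import defaultdict
from pathlib import Path

def load_session(path: Path) -> dict:
    records = {}                  # uuid -> record
    children = defaultdict(list)  # parentUuid -> [child uuid, ...]
    for line in path.read_text().splitlines():
        if not line.strip():
            continue
        rec = json.loads(line)
        uid = rec.get("uuid")
        if uid:
            records[uid] = rec
            children[rec.get("parentUuid")].append(uid)
    return {"records": records, "children": children}

def branch_points(session: dict) -> list[str]:
    # Any node with >1 child is a branch/sidechain fork: evidence that
    # linearizing at ingest would lose structure.
    return [p for p, kids in session["children"].items() if p and len(kids) > 1]
```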

Patterns worth borrowing

  1. SQLite as source of truth, derived indexes rebuildable. From cass. Don’t try to keep the JSONLs and a search index in lock-step at write-time — ingest JSONL → SQLite once, then derive everything (FTS5, embeddings, topics, tool counts) from the SQLite. Re-ingest is cheap and idempotent; the JSONLs are append-only, so a last_offset_indexed per file lets you incrementally ingest in O(new-bytes). A sketch follows.
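
A sketch of that watermark ingest, assuming a files bookkeeping table with a unique path column and a messages table keyed on uuid (names are illustrative, not a spec):

```python
# Incremental, idempotent ingest: seek past already-indexed bytes,
# insert-or-ignore on uuid, then advance the per-file watermark.
import json
import sqlite3
from pathlib import Path

def ingest_file(db: sqlite3.Connection, path: Path) -> None:
    row = db.execute(
        "SELECT last_offset_indexed FROM files WHERE path = ?", (str(path),)
    ).fetchone()
    offset = row[0] if row else 0
    with open(path, "rb") as f:
        f.seek(offset)               # O(new-bytes): skip what's indexed
        for line in f:               # production code would guard against
            rec = json.loads(line)   # a partially written last line
            db.execute(
                "INSERT OR IGNORE INTO messages (uuid, session_id, ts, type, raw) "
                "VALUES (?, ?, ?, ?, ?)",
                (rec.get("uuid"), rec.get("sessionId"),
                 rec.get("timestamp"), rec.get("type"), json.dumps(rec)),
            )
        new_offset = f.tell()
    db.execute(
        "INSERT INTO files (path, last_offset_indexed) VALUES (?, ?) "
        "ON CONFLICT(path) DO UPDATE SET "
        "last_offset_indexed = excluded.last_offset_indexed",
        (str(path), new_offset),
    )
    db.commit()
```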

  2. Two-table model: api_messages (canonical replay) + ui_events (rich analytics). From Cline. The Anthropic Messages format gives you a perfect LLM-replay payload; a separate event stream gives you per-tool-call timings, token usage, and thinking blocks for analytics without mangling the replay table.
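
In SQLite terms the split might look like the following; table and column names are illustrative, not Cline’s:

```python
# Two-table split: exact API replay vs rich UI/analytics events,
# joined on session_id (and optionally message_uuid).
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS api_messages (       -- canonical LLM replay
    uuid         TEXT PRIMARY KEY,
    session_id   TEXT NOT NULL,
    ts           TEXT NOT NULL,
    role         TEXT NOT NULL,                 -- user | assistant
    content      TEXT NOT NULL                  -- raw message.content[] JSON
);
CREATE TABLE IF NOT EXISTS ui_events (          -- analytics stream
    event_id     INTEGER PRIMARY KEY,
    session_id   TEXT NOT NULL,                 -- join key back to api_messages
    message_uuid TEXT REFERENCES api_messages(uuid),
    ts           TEXT NOT NULL,
    event_type   TEXT NOT NULL,                 -- api_req_started, tool_call, ...
    payload      TEXT                           -- tokens, cost, timings as JSON
);
CREATE INDEX IF NOT EXISTS ix_events_session ON ui_events(session_id, ts);
"""

def init(db_path: str = "history.db") -> sqlite3.Connection:
    db = sqlite3.connect(db_path)
    db.executescript(SCHEMA)
    return db
```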

  3. FTS5 for lexical + optional local embeddings for semantic, RRF combine, lexical-degrade when embeddings cold. From cass. Don’t ship embeddings as a hard dependency; ship FTS5 day one, embeddings as a background opt-in.
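
A sketch of the degrade-and-fuse behaviour; the messages_fts virtual table and the source of the semantic ranking are assumptions:

```python
# Reciprocal Rank Fusion over a lexical (FTS5) ranking and an optional
# semantic ranking; falls back to lexical-only when embeddings are cold.
import sqlite3

def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

def search(db: sqlite3.Connection, query: str,
           semantic_ids: list[str] | None = None) -> list[str]:
    lexical = [r[0] for r in db.execute(
        "SELECT uuid FROM messages_fts WHERE messages_fts MATCH ? "
        "ORDER BY rank LIMIT 50", (query,))]
    if not semantic_ids:        # embeddings not ready: lexical-only
        return lexical
    return rrf([lexical, semantic_ids])
```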

  4. Event-typed schema, not message-typed. From Codex. A session is a stream of typed events (turn_started, tool_call, file_change, plan_update, reasoning, mcp_call, web_search) — not a list of role-tagged strings. Claude Code’s JSONL already supports this via type + message.content[].type; preserve that fidelity in SQL.

  5. Date-partitioned directory layout for raw artifacts. From Codex (sessions/YYYY/MM/DD/). Trivially scope by time without touching content. Claude Code partitions by cwd instead, which is fine and complementary — store both cwd and started_at as indexed columns. See the sketch below.
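
Why this matters in practice: time-scoping becomes a pure path operation. A tiny sketch against the layout above (default CODEX_HOME assumed):

```python
# Scope to one month of sessions without reading a single file's contents.
from pathlib import Path

sessions = Path.home() / ".codex" / "sessions"
may_2026 = sorted(sessions.glob("2026/05/*/rollout-*.jsonl"))  # day dirs under month
```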

  6. Versioned schemas for any event stream you expose. From Continue. If you export events for downstream tooling (or your own future-self), put a schema_version field on every row and keep old versions parseable.

  7. parent_uuid graph preserved, not flattened. From Claude Code’s own format. Resist the urge to linearize the conversation at ingest — you lose branching, sidechain subagents, and the ability to reconstruct what context the assistant actually saw. Store parent_uuid and is_sidechain; render linear if you need to.

  8. Per-session “what got truncated” record. From Cline (conversationHistoryDeletedRange). When the agent prunes context, record the range — invaluable for debugging “why did it forget X.”

  9. Token/cost tracked as a streamed update, not a one-shot final. From Cline’s api_req_started event. Token counts are updated as the stream returns; persist the final numbers and the streaming series both if you can afford it.

  10. Markdown export as a first-class artifact, not the storage format. From SpecStory and Codex (Open as Markdown). Humans browse markdown, machines query SQLite. Don’t conflate the two.

  11. Built-in --json event stream from the agent itself. From Codex (codex exec --json). Even if Claude Code doesn’t expose this, the indexer can emit a normalized event stream over stdout for log shippers.

  12. MCP-call detection by naming convention (mcp__<server>__<tool>). From claude-code-trace. Render MCP calls as structured server+action, not as a generic tool_use block.
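
The convention decodes with a two-split; a tiny sketch (assuming the convention holds exactly):

```python
# Decode the mcp__<server>__<tool> naming convention; anything else
# falls through as an ordinary tool_use block.
def decode_tool_name(name: str) -> tuple[str | None, str]:
    if name.startswith("mcp__"):
        _, server, tool = name.split("__", 2)
        return server, tool          # e.g. ("github", "create_issue")
    return None, name

assert decode_tool_name("mcp__github__create_issue") == ("github", "create_issue")
assert decode_tool_name("Bash") == (None, "Bash")
```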

  13. One-shot migration flag, not blind detection. From Roo Code’s taskHistoryMigratedToFiles. If you ever change schema, gate the upgrade on a stored flag rather than re-checking on every boot.
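
A minimal sketch of the same idea using SQLite’s user_version pragma as the stored flag; the migration steps themselves are hypothetical:

```python
# One-shot, flag-gated migrations: each step runs at most once because
# the stored version is bumped after the steps succeed.
import sqlite3

TARGET_VERSION = 2

def migrate(db: sqlite3.Connection) -> None:
    version = db.execute("PRAGMA user_version").fetchone()[0]
    if version < 1:
        # hypothetical step: add the cwd column recommended elsewhere here
        db.executescript("ALTER TABLE messages ADD COLUMN cwd TEXT;")
    if version < 2:
        db.executescript("CREATE INDEX IF NOT EXISTS ix_messages_ts ON messages(ts);")
    db.execute(f"PRAGMA user_version = {TARGET_VERSION}")  # int constant, safe inline
    db.commit()
```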


Anti-patterns to avoid

  1. Putting all history into the framework’s KV store (globalState). Cline and Roo both crashed VS Code doing this. If you persist per-event data, persist it to a real database, not to your harness’s state bag.

  2. Opaque binary storage with no schema doc. Zed’s move from ~/.config/zed/conversations/*.json to threads-db.1.mdb. Once you go binary-without-spec, your third-party ecosystem dies and your own users can’t grep their own data. Issue #41240 (cross-project bleed) is the tell that they don’t fully understand the new store either.

  3. Per-workspace siloing with no global index. Copilot Chat. Users want both “search across everything I’ve ever done” and “scope to this project.” Build the global index, expose the scope filter.

  4. One-file-per-task on disk with no SQLite layer. Cline. tasks/<id>/{api_conversation_history,ui_messages,task_metadata}.json means N+1 reads to do anything cross-task. Mirror them to SQLite at ingest.

  5. Scanning every JSONL on every search. Codex’s own desktop app and Claude Code’s own UX both do this. The whole point of our project is to not.

  6. No timestamps in the human-readable format. SpecStory and aider both bury or omit timestamps in the markdown body. If markdown is your format, put an ISO timestamp on every turn.

  7. Storing tool calls as fenced code blocks in markdown. SpecStory. You lose tool name, input args, success/failure, duration — everything you’d want for analytics or debugging.

  8. Two-tier event-vs-transcript with no join key. Continue. If you keep an analytics event log and a session transcript, they must share a session_id (and ideally message_id / event_id) so you can hop between them.

  9. Manual export as the only way out. Cody and Zed both make the user click an Export button per chat. Continuous tail-and-index beats manual export.

  10. No cwd on events. Useless for project-scoped search. Claude Code already has it; preserve it.

  11. Re-flowing the conversation at ingest (linearization). Loses the parent_uuid DAG, loses sidechains, loses what context the assistant actually saw.

  12. Stuffing token-usage updates into ephemeral streaming events with no final snapshot row. If the stream dies, you’ve lost the usage data. Always persist a final turn_completed row with the consolidated numbers.


Recommendation summary for the Claude Code indexer


Given the prior art:

  • Backend: SQLite with FTS5. Tables: sessions, messages, events (typed), tool_calls, topics (derived), embeddings (optional). One DB, one source of truth.
  • Schema fidelity: Preserve uuid, parent_uuid, is_sidechain, cwd, session_id, timestamp, type, full message.content[] JSON, usage.*. Don’t flatten.
  • Ingest: Incremental tail of ~/.claude/projects/*.jsonl via watermark per file (last_offset_indexed). Re-ingest is idempotent on uuid.
  • Output surfaces: CLI for search, show <session>, tail. Markdown export per session. Optional Datasette/web view of the SQLite (nearly free once everything lives in SQLite).
  • Schema version: Tag every row with a schema_version column. Migrations gated by a flag.
  • MCP rendering: Decode mcp__<server>__<tool> for display.
  • Don’t: ship embeddings as a hard requirement, store anything in Claude Code’s own state, write binary blobs, or linearize the DAG.
