dp_006 q08 100 files format_spanning Codex

Using the benchmark table in the v0.5.0 release notes, extract the complete performance comparison between v0.4.2 and v0.5.0 for all five corpus sizes. Present the data as a markdown table including: corpus name, file count, total size, v0.4.2 time, v0.5.0 time, and speedup.

Gold answer

| Corpus | Files | Total size | v0.4.2 | v0.5.0 | Speedup |
|---|---|---|---|---|---|
| Tiny (synthetic) | 10 | 500 KB | 48 ms | 12 ms | 4.0× |
| Small (personal wiki) | 120 | 8 MB | 410 ms | 88 ms | 4.7× |
| Medium (project docs) | 480 | 35 MB | 2,940 ms | 580 ms | 5.1× |
| Large (Wikipedia subset) | 1,200 | 60 MB | 7,400 ms | 280 ms | 26.4× |
| XL (synthetic, 1 GB) | 8,000 | 1,024 MB | OOM / killed | 4,100 ms | — |

Hardware: M2 Max, macOS 14.5, --max-concurrency=8. Times are median of 5 runs, hyperfine --warmup 2.
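The Speedup column in the gold answer is consistent with dividing the v0.4.2 time by the v0.5.0 time and rounding to one decimal place. The release notes do not state that formula, so treat it as an assumption. A minimal sketch that rechecks the arithmetic, with the timings copied from the gold table and the names `TIMINGS_MS`, `speedup`, and `SPEEDUPS` chosen here for illustration only:

```python
# Timings in milliseconds, copied from the gold-answer table.
# None marks the XL run that v0.4.2 did not finish (OOM / killed).
TIMINGS_MS = {
    "Tiny (synthetic)": (48, 12),
    "Small (personal wiki)": (410, 88),
    "Medium (project docs)": (2940, 580),
    "Large (Wikipedia subset)": (7400, 280),
    "XL (synthetic, 1 GB)": (None, 4100),
}


def speedup(old_ms, new_ms):
    """Return old/new rounded to one decimal, or None if a timing is missing.

    Assumption: the table's Speedup column is this ratio; the source
    does not spell out the formula.
    """
    if old_ms is None or new_ms is None:
        return None
    return round(old_ms / new_ms, 1)


SPEEDUPS = {name: speedup(old, new) for name, (old, new) in TIMINGS_MS.items()}

for name, value in SPEEDUPS.items():
    label = "n/a" if value is None else f"{value}x"
    print(f"{name}: {label}")
```

The XL row has no ratio because the v0.4.2 run never completed, which matches the empty Speedup cell in the table.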

Filesystem
Input: 129k
Output: 2k
Steps: 1
Tools: 3
Latency: 29.6s
Cost: $1.60
#1 Ran rg -n "v0\.5\.0|0\.5\.0|benchmark|v0\.4\.2|0\.4\.2|speedup|corpus" agent_memory 75,853 chars
Command command_execution
rg -n "v0\.5\.0|0\.5\.0|benchmark|v0\.4\.2|0\.4\.2|speedup|corpus" agent_memory
Result 4 lines
agent_memory/releases/release-v0.5.0-2026-06-08.md:1:# kitabi v0.5.0 (2026-06-08) — Concurrent file processing
agent_memory/releases/release-v0.5.0-2026-06-08.md:24:cargo install kitabi --version 0.5.0
agent_memory/releases/release-v0.5.0-2026-06-08.md:31:| `x86_64-unknown-linux-gnu` | `kitabi-0.5.0-x86_64-unknown-linux-gnu.tar.gz` | `<sha256-placeholder-linux-x86_64>` |
agent_memory/releases/rele…
#1 Ran rg --files agent_memory 5,104 chars
Command command_execution
rg --files agent_memory
Result 8 lines
agent_memory/email/sponsor-outreach-bytebase.md
agent_memory/email/sponsor-confirm-astral.md
agent_memory/email/security-disclosure-cve-2026-31418.md
agent_memory/email/sponsor-outreach-rejected-tbdcorp.md
agent_memory/email/conf-invite-fosdem-acceptance.md
agent_memory/email/user-direct-thanks-newcomer.md
agent_memory/email/user-direct-question-licensing.md
agent_memory/email/user-direct-typo-rep…
#1 Ran sed -n '130,146p' agent_memory/releases/release-v0.5.0-2026-06-08.md 812 chars
Command command_execution
sed -n '130,146p' agent_memory/releases/release-v0.5.0-2026-06-08.md
Result 11 lines
## Benchmark detail

<details>
<summary>Full bench numbers by corpus size (M2 Max, macOS 14.5, --max-concurrency=8)</summary>

| Corpus | Files | Total size | v0.4.2 | v0.5.0 | Speedup |
|---|---|---|---|---|---|
| Tiny (synthetic) | 10 | 500 KB | 48 ms | 12 ms | 4.0× |
| Small (personal wiki) | 120 | 8 MB | 410 ms | 88 ms | 4.7× |
| Medium (project docs) | 480 | 35 MB | 2,940 ms | 580 ms | 5.1× …
Answer
From `agent_memory/releases/release-v0.5.0-2026-06-08.md`:

| Corpus name | File count | Total size | v0.4.2 time | v0.5.0 time | Speedup |
|---|---:|---:|---:|---:|---:|
| Tiny (synthetic) | 10 | 500 KB | 48 ms | 12 ms | 4.0× |
| Small (personal wiki) | 120 | 8 MB | 410 ms | 88 ms | 4.7× |
| Medium (project docs) | 480 | 35 MB | 2,940 ms | 580 ms | 5.1× |
| Large (Wikipedia subset) | 1,200 | 60 MB | 7,400 ms | 280 ms | 26.4× |
| XL (synthetic, 1 GB) | 8,000 | 1,024 MB | OOM / killed | 4,100 ms …
Judge: covers all asked facts
SMFS Won
Input: 111k
Output: 2k
Steps: 1
Tools: 4
Latency: 1.3m
Cost: $1.38
#1 Ran pwd && rg --files 3,923 chars
Command command_execution
pwd && rg --files
Result 10 lines
/Users/prasanna/Documents/supermemoryai/supermemoryfs/.scratch/eval/eval/workspace/smfs/dp_006/agent_memory
profile.md
releases/release-v0.6.2-prep-2026-06-30.md
releases/release-v0.6.0-2026-06-22.md
releases/release-v0.5.0-2026-06-08.md
releases/release-v0.4.0-2026-05-15.md
releases/release-v0.3.5-hotfix-2026-04-28.md
releases/release-v0.3.0-2026-03-31.md
releases/release-v0.2.0-2026-02-28.md
rel…
#1 Ran smfs grep "v0.5.0 release notes benchmark table v0.4.2 performance comparison corpus sizes" 6,515 chars
Command command_execution
smfs grep "v0.5.0 release notes benchmark table v0.4.2 performance comparison corpus sizes"
Result 5 lines
# supermemory semantic search — 10 results for "v0.5.0 release notes benchmark table v0.4.2 performance comparison corpus sizes"
# searches by meaning across files in this container. usage:
#   grep "natural language query"          search all files
#   grep "query" path/to/dir/              search within directory
# output: <filepath>:<line_start>-<line_end>:<chunk> (chunk text is verbatim from t…
#1 Ran nl -ba releases/release-v0.5.0-2026-06-08.md | sed -n '125,150p' 2,148 chars
Command command_execution
nl -ba releases/release-v0.5.0-2026-06-08.md | sed -n '125,150p'
Result 2 lines
   125	
   126	- **#41** — Performance regression on files >50 MB. Hiroshi reported in early April that processing a 60 MB file had gotten ~4× slower between v0.3.0 and v0.4.0, tracing to a quadratic re-allocation in the AST flattening pass. The concurrent scheduler naturally exposed this path and it was fixed as part of PR #84. See [issue #41](../../issues/issue-041-performance-regression-50mb.md…
#1 Ran rg -n "Full bench numbers|\| Corpus \||\| Tiny|\| Small|\| Medium|\| Large|\| XL|Times are median|XL corpus" releases/release-v0.5.… 781 chars
Command command_execution
rg -n "Full bench numbers|\| Corpus \||\| Tiny|\| Small|\| Medium|\| Large|\| XL|Times are median|XL corpus" releases/release-v0.5.0-2026-06-08.md
Result 6 lines
134:<summary>Full bench numbers by corpus size (M2 Max, macOS 14.5, --max-concurrency=8)</summary>
136:| Corpus | Files | Total size | v0.4.2 | v0.5.0 | Speedup |
138:| Tiny (synthetic) | 10 | 500 KB | 48 ms | 12 ms | 4.0× |
139:| Small (personal wiki) | 120 | 8 MB | 410 ms | 88 ms | 4.7× |
140:| Medium (project docs) | 480 | 35 MB | 2,940 ms | 580 ms | 5.1× |
141:| Large (Wikipedia subset) | 1,20…
Answer
From [release-v0.5.0-2026-06-08.md](/Users/prasanna/Documents/supermemoryai/supermemoryfs/.scratch/eval/eval/workspace/smfs/dp_006/agent_memory/releases/release-v0.5.0-2026-06-08.md:136):

| Corpus name | File count | Total size | v0.4.2 time | v0.5.0 time | Speedup |
|---|---:|---:|---:|---:|---:|
| Tiny (synthetic) | 10 | 500 KB | 48 ms | 12 ms | 4.0× |
| Small (personal wiki) | 120 | 8 MB | 410 ms | 88 ms | 4.7× |
| Medium (project docs) | 480 | 35 MB | 2,940 ms | 580 ms | 5.1× |
| Large (Wik…
Judge: covers all asked facts
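Both answers rename the header cells ("Corpus name" for "Corpus", "File count" for "Files") and add alignment colons to the separator row, so a plain string comparison against the gold table would fail even when every data cell agrees. This log does not show how the judge actually decides. As a rough sketch of one way to compare data rows only, assuming a hypothetical helper `parse_table` and using just the first two rows, which are fully visible in both tables:

```python
def parse_table(markdown: str) -> list[list[str]]:
    """Return the data rows of a pipe table as lists of stripped cell strings.

    Hypothetical helper for illustration; not part of any evaluation harness.
    """
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # ignore prose and blank lines around the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip separator rows such as |---|---:|, which carry no data.
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    return rows[1:]  # drop the header row so renamed headers do not matter


GOLD = """
| Corpus | Files | Total size | v0.4.2 | v0.5.0 | Speedup |
|---|---|---|---|---|---|
| Tiny (synthetic) | 10 | 500 KB | 48 ms | 12 ms | 4.0× |
| Small (personal wiki) | 120 | 8 MB | 410 ms | 88 ms | 4.7× |
"""

ANSWER = """
| Corpus name | File count | Total size | v0.4.2 time | v0.5.0 time | Speedup |
|---|---:|---:|---:|---:|---:|
| Tiny (synthetic) | 10 | 500 KB | 48 ms | 12 ms | 4.0× |
| Small (personal wiki) | 120 | 8 MB | 410 ms | 88 ms | 4.7× |
"""

ROWS_MATCH = parse_table(GOLD) == parse_table(ANSWER)
print("data rows match:", ROWS_MATCH)
```

Dropping the header and separator rows keeps the comparison focused on the extracted facts, which is what the question asks for.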