Indexed regex search for codebases. 6–25× faster than ripgrep on the Linux kernel benchmark (97–394 ms per query, under 200 ms for three of the five patterns), git-independent. Built so coding agents stop waiting for grep.
Coding agents call grep on every reasoning step. Each call blocks the next thought. Across a session, slow search compounds into minutes of dead time per task — the single biggest latency leak in any agent harness.
Cursor's blog on indexed regex search lit the spark. They tied their index to git commits. We didn't: fast-grep watches any directory, kept fresh by an FS daemon, and works on non-git repos, generated artifacts, and unstaged changes. It's the fastest open-source indexed grep we've measured.
Five techniques combine to eliminate >99% of file I/O before the regex engine runs.
Sparse n-grams: variable-length substrings whose boundaries fall on rare bigram pairs (computed from the corpus itself). They yield fewer, longer, more selective posting lists than fixed trigrams.
Position-mask Bloom filters: two 8-bit Bloom filters per (n-gram, document) pair encode position-mod-8 and successor character. This drops the false-positive rate to 0.42% before any I/O.
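A minimal sketch of how two such masks can reject a document without reading it. The names `Masks`, `next_bit`, `masks_for`, `may_be_at_gap` and `may_be_followed_by` are hypothetical, and the one-bit successor hash (`c % 8`) is an assumption chosen for illustration; the on-disk layout used by fast-grep may differ.

```rust
/// Two 8-bit masks kept for one (n-gram, document) pair.
#[derive(Clone, Copy, Default, Debug, PartialEq)]
struct Masks {
    loc: u8,  // bit (pos % 8) is set for every position where the n-gram starts
    next: u8, // one hashed bit per byte seen immediately after the n-gram
}

/// Hypothetical one-bit hash of the successor byte (an assumption of this sketch).
fn next_bit(c: u8) -> u8 {
    1u8 << (c % 8)
}

/// Index time: scan `doc` and build the masks for a non-empty `gram`.
fn masks_for(doc: &[u8], gram: &[u8]) -> Masks {
    let mut m = Masks::default();
    for (pos, window) in doc.windows(gram.len()).enumerate() {
        if window == gram {
            m.loc |= 1u8 << (pos % 8);
            if let Some(&c) = doc.get(pos + gram.len()) {
                m.next |= next_bit(c);
            }
        }
    }
    m
}

/// Query time: can some occurrence of `b` start exactly `gap` bytes after
/// some occurrence of `a`? False means "definitely not"; true means "maybe".
fn may_be_at_gap(a: Masks, b: Masks, gap: usize) -> bool {
    a.loc.rotate_left((gap % 8) as u32) & b.loc != 0
}

/// Query time: was byte `c` ever seen right after the n-gram in this document?
fn may_be_followed_by(a: Masks, c: u8) -> bool {
    a.next & next_bit(c) != 0
}
```

For the query `foo_bar`, the n-grams `foo` and `bar` must sit 4 bytes apart and `foo` must be followed by `_`. A document containing `foo = 1; bar` has both n-grams in its posting lists, yet fails both mask checks and is dropped before any file read. Like any Bloom filter, the masks can report a false "maybe" but never a false "no".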
Memory-mapped index: binary index files are memory-mapped at query time. Cold load takes 17 ms regardless of corpus size, because the OS pages in only the lists you touch.
Line-level offsets: the index stores byte offsets of candidate lines, not just file IDs. Verification jumps directly to the candidate line instead of scanning the whole file.
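As an illustration of offset-based verification, the sketch below recovers the line around a stored byte offset and tests only that line. `line_at` and `verify` are hypothetical names, and the `is_match` closure stands in for the regex engine; none of this is taken from the fast-grep source.

```rust
/// Returns the line that contains byte `offset`, without its trailing newline.
fn line_at(buf: &[u8], offset: usize) -> Option<&[u8]> {
    if offset >= buf.len() {
        return None;
    }
    // Walk back to the previous newline and forward to the next one.
    let start = buf[..offset]
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    let end = buf[offset..]
        .iter()
        .position(|&b| b == b'\n')
        .map_or(buf.len(), |i| offset + i);
    Some(&buf[start..end])
}

/// Checks only the candidate lines and returns the offsets that really match.
/// `is_match` is a stand-in for the regex engine.
fn verify<F>(buf: &[u8], candidates: &[usize], is_match: F) -> Vec<usize>
where
    F: Fn(&[u8]) -> bool,
{
    candidates
        .iter()
        .copied()
        .filter(|&off| line_at(buf, off).map_or(false, |line| is_match(line)))
        .collect()
}
```

With two candidate offsets in a file, the matcher runs on two lines rather than on the whole buffer, which is where the saving over a per-file scan comes from.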
4-byte content prefix filter: before invoking the regex engine, fast-grep checks a 4-byte content prefix per candidate line. This eliminates >95% of the remaining I/O.
Type a regex (or paste a real one) and see how fast-grep would decompose it into trigrams to look up in the index. Falls back to a full scan when there's no usable literal run.
This demo uses fixed 3-grams for clarity. The real engine uses corpus-adaptive sparse n-grams (variable length, bigram-rarity weighted) which produce fewer, more selective lookups — but the decomposition pattern matching shown here (literal vs alternation vs regex) is exactly what src/searcher.rs implements.
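The fixed-trigram decomposition described here can be sketched as follows. This is a simplified illustration, not the code in src/searcher.rs: `Plan`, `flush`, `literal_runs` and `plan` are hypothetical names, and the sketch deliberately falls back to a full scan for groups and alternation, which the real decomposer is said to handle. Character classes are skipped naively up to the first `]`.

```rust
/// What the searcher should do for one pattern.
#[derive(Debug, PartialEq)]
enum Plan {
    /// Look these trigrams up in the index and intersect their posting lists.
    Trigrams(Vec<String>),
    /// No usable literal run: scan every file.
    FullScan,
}

/// Moves a finished literal run into `runs`; empty runs are dropped.
fn flush(cur: &mut String, runs: &mut Vec<String>) {
    if !cur.is_empty() {
        runs.push(std::mem::take(cur));
    }
}

/// Extracts literal runs that every match must contain.
/// Returns None for groups and alternation, which this sketch does not handle.
fn literal_runs(pattern: &str) -> Option<Vec<String>> {
    let mut runs = Vec::new();
    let mut cur = String::new();
    let mut chars = pattern.chars();
    while let Some(c) = chars.next() {
        match c {
            '(' | ')' | '|' => return None,
            '\\' => match chars.next()? {
                // \d, \w, \b and friends are classes or anchors, not literals.
                e if e.is_ascii_alphanumeric() => flush(&mut cur, &mut runs),
                // Escaped punctuation such as \. is a literal character.
                e => cur.push(e),
            },
            '*' | '?' | '{' => {
                cur.pop(); // the preceding character may be absent
                flush(&mut cur, &mut runs);
                if c == '{' {
                    for d in chars.by_ref() {
                        if d == '}' {
                            break;
                        }
                    }
                }
            }
            '[' => {
                flush(&mut cur, &mut runs);
                while let Some(d) = chars.next() {
                    match d {
                        '\\' => {
                            chars.next();
                        }
                        ']' => break,
                        _ => {}
                    }
                }
            }
            '+' | '.' | '^' | '$' => flush(&mut cur, &mut runs),
            _ => cur.push(c),
        }
    }
    flush(&mut cur, &mut runs);
    Some(runs)
}

/// Turns a pattern into trigram lookups, or a full scan if no run has 3+ characters.
fn plan(pattern: &str) -> Plan {
    let runs = match literal_runs(pattern) {
        Some(r) => r,
        None => return Plan::FullScan,
    };
    let mut grams: Vec<String> = Vec::new();
    for run in runs {
        let cs: Vec<char> = run.chars().collect();
        for w in cs.windows(3) {
            let g: String = w.iter().collect();
            if !grams.contains(&g) {
                grams.push(g);
            }
        }
    }
    if grams.is_empty() {
        Plan::FullScan
    } else {
        Plan::Trigrams(grams)
    }
}
```

Under these rules `static.*inline` yields the trigrams of `static` and `inline`, while `a.b` has no literal run of three characters and falls back to a full scan.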
The 5-stage path a query takes through fast-grep, with realistic numbers from the Linux kernel benchmark (EXPORT_SYMBOL over 81 690 files):
Steps 1–4 take ≈10 ms. Step 5 (the only stage that touches actual file bytes) is where the 4-byte prefix filter eliminates >95% of the I/O. End-to-end this query returns 197 matches in 197 ms — vs 1 553 ms for ripgrep.
Three zoom levels, from "where does this fit" to "what runs in the search hot path". Mermaid renders these in your browser.
Who uses fast-grep, and what does it talk to?
graph TB
  dev["👤 Developer<br/>shell user"]
  agent["🤖 Coding agent<br/>(Cursor, Claude Code, Aider)"]
  fgr(("fast-grep<br/>CLI + optional daemon"))
  fs[("File system<br/>any directory")]
  dev -- "fgr 'pattern' /path" --> fgr
  agent -- "tool-call: search" --> fgr
  fgr -- "mmap reads" --> fs
  fgr -- "watch + read" --> fs
Inside fast-grep there are two cooperating processes plus the on-disk index.
graph LR
  cli["fgr CLI<br/>(Rust binary, single static)"]
  daemon["fgr daemon<br/>(optional, FS-watcher)"]
  index[("Index files<br/>mmap'd: postings, bitmaps,<br/>lookup, meta.json")]
  fs[("File system")]
  cli -- "build / load" --> index
  cli -- "TCP localhost<br/>(flush before search)" --> daemon
  daemon -- "incremental<br/>updates" --> index
  daemon -- "notify::Watcher<br/>(debounced 3s)" --> fs
  cli -- "verify reads" --> fs
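The debounced watcher edge in the diagram above can be illustrated with a small batching loop. This is a sketch under assumptions: a standard-library channel stands in for the events delivered by `notify::Watcher`, and `next_batch` is a hypothetical helper, not a function from the fast-grep daemon.

```rust
use std::collections::BTreeSet;
use std::path::PathBuf;
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Blocks until one changed path arrives, then keeps collecting until the
/// channel has been quiet for `quiet`. Repeated paths collapse into one entry.
/// Returns None once the sender is gone and nothing is pending.
fn next_batch(rx: &Receiver<PathBuf>, quiet: Duration) -> Option<BTreeSet<PathBuf>> {
    let mut batch = BTreeSet::new();
    batch.insert(rx.recv().ok()?);
    loop {
        match rx.recv_timeout(quiet) {
            Ok(path) => {
                batch.insert(path);
            }
            // Quiet period elapsed, or the watcher shut down: hand over the batch.
            Err(RecvTimeoutError::Timeout) | Err(RecvTimeoutError::Disconnected) => {
                return Some(batch);
            }
        }
    }
}
```

A daemon loop would call `next_batch(&rx, Duration::from_secs(3))` and run one incremental index update per returned batch, so a burst of editor saves or a branch switch triggers a single update instead of one per event.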
The internals of one indexed query, top-to-bottom in execution order.
graph TB
  cli["CLI / argument parser"]
  pat["Pattern decomposer<br/>(extract literal runs &<br/>alternation branches)"]
  load["Index loader<br/>(mmap meta.json + postings)"]
  ngram["Sparse n-gram extractor"]
  posting["Posting list intersection"]
  bloom["Position-mask Bloom filter<br/>(Blackbird locMask + nextMask)"]
  prefix["4-byte content prefix filter"]
  verify["Line-level regex verify"]
  out["Output renderer<br/>(grouped/colored or piped)"]
  cli --> pat --> ngram --> posting --> bloom --> prefix --> verify --> out
  load -.-> posting
  load -.-> bloom
The optional Metal GPU pre-filter (macOS, opt-in via FGR_METAL=1) lives between "prefix" and "verify" and runs literal scans on candidate lines as a Metal compute kernel.
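The "Posting list intersection" stage in the diagram is the step that turns per-n-gram document lists into a candidate set. A minimal sketch, assuming each posting list is an ascending, duplicate-free slice of `u32` document IDs (the actual on-disk encoding is not described here, and `intersect` and `candidates` are hypothetical names):

```rust
use std::cmp::Ordering;

/// Intersects two ascending, duplicate-free posting lists with a two-pointer merge.
fn intersect(a: &[u32], b: &[u32]) -> Vec<u32> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::new();
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                out.push(a[i]);
                i += 1;
                j += 1;
            }
        }
    }
    out
}

/// A candidate document must appear in every n-gram's posting list.
/// Starting from the shortest list keeps the intermediate results small.
fn candidates(mut lists: Vec<&[u32]>) -> Vec<u32> {
    lists.sort_by_key(|l| l.len());
    let mut iter = lists.into_iter();
    let first = match iter.next() {
        Some(l) => l.to_vec(),
        None => return Vec::new(),
    };
    iter.fold(first, |acc, l| intersect(&acc, l))
}
```

Only documents that survive this intersection reach the Bloom-mask and prefix filters, which is why more selective n-grams (shorter posting lists) pay off directly in query time.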
Linux kernel 6.6 (81 690 files), Apple M1 Pro, warm cache. Numbers from the project's reproducible bench script.
| Pattern | fast-grep | ripgrep | Speedup |
|---|---|---|---|
| TODO | 97 ms | 2 463 ms | 25× |
| printk | 172 ms | 2 492 ms | 14× |
| EXPORT_SYMBOL | 197 ms | 1 553 ms | 8× |
| container_of | 344 ms | 2 440 ms | 7× |
| static.*inline | 394 ms | 2 369 ms | 6× |
| Pattern | fast-grep | ugrep | Speedup |
|---|---|---|---|
| EXPORT_SYMBOL | 197 ms | 1 898 ms | 9.6× |
| TODO | 97 ms | 599 ms | 6.2× |
| static.*inline | 394 ms | 1 595 ms | 4.0× |
| printk | 172 ms | 645 ms | 3.8× |
| container_of | 344 ms | 656 ms | 1.9× |
| Index metric | Value |
|---|---|
| Full build | ~60 s (one-time) |
| Incremental update | <1 s for 10–100 files (75× faster than rebuild) |
| Index load (mmap) | 17 ms |
| Index size | 775 MB postings + 161 MB bitmaps |
Where the project sits today, and where we want it to go. Numbers reflect cargo test --release output as of the latest release.
What a healthy pyramid for this project looks like — gaps to fill, ranked by ROI:
| Tier | Now | Target | Gap (why it matters) |
|---|---|---|---|
| Unit | 33 | ~80 | Coverage holes in persist.rs, daemon.rs, the output_matches/highlight_into renderer (the recent v0.3.1 work has zero unit coverage). |
| Property | 0 | ~10 | Invariants of the regex decomposer: literal runs of length ≥3 should always intersect-match the input; alternation splits should round-trip. proptest is the right tool. |
| Integration | 9 | ~25 | Daemon lifecycle (start → fs change → flush before search), incremental update against committed-then-modified files, --type filter combinations. |
| CLI snapshot | 0 | ~15 | Lock the user-facing TTY output (grouped/colored) and piped output (path:line:content) with insta snapshots so we don't silently regress the rendering. |
| Fuzz | 0 | 1 target | cargo-fuzz on the verifier: random pattern + random byte buffer should never panic and should match the regex crate's own behavior. |
| Bench (regression) | baseline JSON | CI gate | Wire scripts/bench.sh into a GitHub Action that compares against benches/baseline-v0.3.1.json and fails the build if any pattern regresses by >15%. |
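The regression gate in the last row could be as small as the comparison below. It is a sketch under assumptions: timings are taken to be already parsed into maps from pattern to milliseconds (the schema of benches/baseline-v0.3.1.json is not shown here), and `regressions` is a hypothetical helper.

```rust
use std::collections::BTreeMap;

/// Returns (pattern, slowdown ratio) for every pattern whose current timing
/// exceeds the baseline by more than `tolerance` (0.15 means 15%).
/// Patterns missing from `current` are skipped in this sketch.
fn regressions(
    baseline: &BTreeMap<String, f64>,
    current: &BTreeMap<String, f64>,
    tolerance: f64,
) -> Vec<(String, f64)> {
    baseline
        .iter()
        .filter_map(|(pattern, &base_ms)| {
            let now_ms = *current.get(pattern)?;
            let ratio = now_ms / base_ms;
            (ratio > 1.0 + tolerance).then(|| (pattern.clone(), ratio))
        })
        .collect()
}
```

A CI step would exit non-zero whenever the returned list is non-empty. With a 97 ms baseline for `TODO`, a run at 100 ms passes, while `printk` going from 172 ms to 200 ms (about 16% slower) fails the build.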
All channels we control. Community packaging (apt, Fedora, Arch, MacPorts, Chocolatey…) welcome — see README.
# Cargo (any platform with a Rust toolchain)
cargo install fast-grep
# Prebuilt binary via cargo-binstall
cargo binstall fast-grep
# Homebrew (macOS / Linux)
brew install gmilano/fast-grep/fast-grep
# Scoop (Windows)
scoop bucket add fast-grep https://github.com/gmilano/scoop-fast-grep
scoop install fast-grep
# Debian / Ubuntu — .deb attached to every release
curl -LO https://github.com/gmilano/fast-grep-rust/releases/latest/download/fast-grep_0.3.1-1_amd64.deb
sudo dpkg -i fast-grep_*_amd64.deb
Then:
# Build the index once (auto-built on first search if missing)
fgr index /path/to/repo --output .fgr
# Search — sub-200ms on cached queries
fgr "EXPORT_SYMBOL" /path/to/repo --index .fgr
# Watch + auto-update on file changes
fgr daemon start /path/to/repo --output .fgr