vecdb-bench

A Bun + TypeScript benchmark that evaluates three embedded vector database stacks for hybrid semantic search, targeting two RAG use cases: code indexing and fantasy-book retrieval.

Stacks Evaluated

| Stack | Components |
|---|---|
| LanceDB | Native vector search + FTS + RRF hybrid fusion |
| DuckDB + VSS + FTS | DuckDB with the vector similarity search and full-text search extensions |
| SQLite + FTS5 + sqlite-vec | bun:sqlite with sqlite-vec for vectors + FTS5 for full-text |
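All three stacks fuse the vector and full-text rankings with Reciprocal Rank Fusion (RRF) in hybrid mode. A minimal sketch of RRF, assuming the damping constant k = 60 that is common in the literature (the value each engine uses internally may differ):

```typescript
// Reciprocal Rank Fusion: a document's fused score is the sum of
// 1 / (k + rank) over every ranked list it appears in. Documents that
// rank well in both lists float to the top; k damps the dominance of
// the very first positions.
function rrfFuse(rankedLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Sort by fused score, highest first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Overlapping results from the two retrievers:
const vectorHits = ["a", "b", "c"];
const ftsHits = ["b", "d", "a"];
console.log(rrfFuse([vectorHits, ftsHits])); // ["b", "a", "d", "c"]
```

"b" wins because it appears near the top of both lists, even though neither retriever ranked it first.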

Environment Variables

Copy .env.example to .env and fill in your API keys and model configuration. See .env.example for all required variables.

Methodology

  • Datasets: 20 code snippets + 20 fantasy book passages, each with 20 queries and ground-truth relevance labels
  • Search methods: vector-only, FTS-only, hybrid (RRF fusion), hybrid + reranker
  • Metrics: Precision@5, Recall@5, MRR, NDCG@5, average latency, P95 latency, indexing time
  • Latency measurement: Only DB query time is measured — embedding and reranker API latency is excluded so results reflect pure database performance
  • Scoring: Combined score = 70% quality (weighted MRR/NDCG/Recall/Precision) + 30% performance (inverse latency)
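The scoring step above could be sketched as follows. The 70/30 split and the inverse-latency performance term are from the methodology; the equal weighting inside the quality term is an assumption (the methodology only says "weighted MRR/NDCG/Recall/Precision"), as is normalizing latency against the fastest engine:

```typescript
interface EngineResult {
  precision5: number;
  recall5: number;
  mrr: number;
  ndcg5: number;
  avgLatencyMs: number;
}

// Combined score = 70% quality + 30% performance (inverse latency).
// Quality weights are assumed equal here; bench.ts may weight them differently.
function combinedScore(r: EngineResult, bestLatencyMs: number): number {
  const quality = (r.precision5 + r.recall5 + r.mrr + r.ndcg5) / 4;
  // Inverse latency, normalized so the fastest engine scores 1.0.
  const perf = bestLatencyMs / r.avgLatencyMs;
  return 0.7 * quality + 0.3 * perf;
}

// Perfect quality at 2x the best engine's latency:
// 0.7 * 1 + 0.3 * (1 / 2) ≈ 0.85
console.log(
  combinedScore(
    { precision5: 1, recall5: 1, mrr: 1, ndcg5: 1, avgLatencyMs: 2 },
    1,
  ),
);
```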

Setup

```sh
git clone https://github.com/lemon07r/vecdb-bench.git
cd vecdb-bench
bun install
cp .env.example .env  # add your API keys and model config
bun run src/bench.ts
```

Options

| Flag | Short | Default | Description |
|---|---|---|---|
| --dimensions | -d | 1024 | Embedding vector dimensions. Must be supported by your embedding model. |

Example with custom dimensions:

```sh
bun run src/bench.ts --dimensions 512
bun run src/bench.ts -d 256
```
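The `--dimensions`/`-d` flag with its `1024` default could be parsed like this, using `util.parseArgs`, which Bun supports; the actual implementation in `src/bench.ts` may differ:

```typescript
import { parseArgs } from "node:util";

// Parse --dimensions / -d with a default of 1024 (hypothetical sketch).
// In the real CLI the args would come from process.argv.slice(2).
const { values } = parseArgs({
  args: ["--dimensions", "512"],
  options: {
    dimensions: { type: "string", short: "d", default: "1024" },
  },
});
const dimensions = Number(values.dimensions);
console.log(dimensions); // 512
```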

Results

Models Used

  • Embedding: Qwen/Qwen3-Embedding-0.6B (1024 dimensions) via SiliconFlow API
  • Reranker: Qwen/Qwen3-Reranker-0.6B via SiliconFlow API

Code Search Dataset (20 docs, 20 queries)

| Engine | Method | P@5 | R@5 | MRR | NDCG@5 | Avg ms | P95 ms | Index ms |
|---|---|---|---|---|---|---|---|---|
| LanceDB | vector | 0.230 | 0.975 | 1.000 | 0.981 | 1.7 | 6.3 | 24 |
| LanceDB | fts | 0.220 | 0.925 | 0.925 | 0.912 | 1.2 | 2.3 | 24 |
| LanceDB | hybrid | 0.230 | 0.975 | 0.950 | 0.944 | 2.2 | 3.3 | 24 |
| LanceDB | hybrid+rerank | 0.230 | 0.975 | 1.000 | 0.981 | 2.9 | 3.2 | 24 |
| DuckDB + VSS + FTS | vector | 0.230 | 0.975 | 1.000 | 0.981 | 12.3 | 15.1 | 230 |
| DuckDB + VSS + FTS | fts | 0.220 | 0.925 | 0.925 | 0.912 | 11.2 | 18.8 | 230 |
| DuckDB + VSS + FTS | hybrid | 0.230 | 0.975 | 0.967 | 0.956 | 23.5 | 26.2 | 230 |
| DuckDB + VSS + FTS | hybrid+rerank | 0.230 | 0.975 | 1.000 | 0.981 | 27.5 | 32.2 | 230 |
| SQLite + FTS5 + sqlite-vec | vector | 0.230 | 0.975 | 1.000 | 0.981 | 0.3 | 2.1 | 5 |
| SQLite + FTS5 + sqlite-vec | fts | 0.220 | 0.925 | 0.925 | 0.912 | 0.1 | 0.5 | 5 |
| SQLite + FTS5 + sqlite-vec | hybrid | 0.230 | 0.975 | 0.967 | 0.956 | 0.5 | 2.2 | 5 |
| SQLite + FTS5 + sqlite-vec | hybrid+rerank | 0.230 | 0.975 | 1.000 | 0.981 | 0.9 | 2.3 | 5 |

Fantasy Books Dataset (20 docs, 20 queries)

| Engine | Method | P@5 | R@5 | MRR | NDCG@5 | Avg ms | P95 ms | Index ms |
|---|---|---|---|---|---|---|---|---|
| LanceDB | vector | 0.230 | 0.975 | 0.975 | 0.958 | 1.2 | 2.0 | 11 |
| LanceDB | fts | 0.240 | 1.000 | 0.950 | 0.963 | 1.1 | 1.7 | 11 |
| LanceDB | hybrid | 0.240 | 1.000 | 1.000 | 1.000 | 2.0 | 2.9 | 11 |
| LanceDB | hybrid+rerank | 0.240 | 1.000 | 1.000 | 0.996 | 2.9 | 4.6 | 11 |
| DuckDB + VSS + FTS | vector | 0.230 | 0.975 | 0.975 | 0.958 | 12.2 | 13.8 | 211 |
| DuckDB + VSS + FTS | fts | 0.240 | 1.000 | 0.950 | 0.963 | 12.4 | 28.2 | 211 |
| DuckDB + VSS + FTS | hybrid | 0.240 | 1.000 | 0.975 | 0.982 | 24.4 | 29.0 | 211 |
| DuckDB + VSS + FTS | hybrid+rerank | 0.240 | 1.000 | 1.000 | 0.996 | 28.2 | 31.6 | 211 |
| SQLite + FTS5 + sqlite-vec | vector | 0.230 | 0.975 | 0.975 | 0.958 | 0.3 | 1.9 | 4 |
| SQLite + FTS5 + sqlite-vec | fts | 0.240 | 1.000 | 0.975 | 0.974 | 0.1 | 0.2 | 4 |
| SQLite + FTS5 + sqlite-vec | hybrid | 0.240 | 1.000 | 0.975 | 0.982 | 0.5 | 2.2 | 4 |
| SQLite + FTS5 + sqlite-vec | hybrid+rerank | 0.240 | 1.000 | 1.000 | 0.996 | 1.0 | 2.6 | 4 |

Aggregate Scores

| Engine | P@5 | R@5 | MRR | NDCG@5 | Avg Latency | Avg Indexing | Quality | Perf | Combined |
|---|---|---|---|---|---|---|---|---|---|
| SQLite + FTS5 + sqlite-vec | 0.233 | 0.978 | 0.977 | 0.967 | 0.5ms | 5ms | 82.5% | 99.5% | 87.6% |
| LanceDB | 0.233 | 0.978 | 0.975 | 0.967 | 1.9ms | 17ms | 82.5% | 98.1% | 87.2% |
| DuckDB + VSS + FTS | 0.233 | 0.978 | 0.974 | 0.966 | 19.0ms | 221ms | 82.4% | 84.1% | 82.9% |

Results with 8B Models (4096 dimensions)

Models Used

  • Embedding: Qwen/Qwen3-Embedding-8B (4096 dimensions) via Nebius API
  • Reranker: Qwen/Qwen3-Reranker-8B via SiliconFlow API

Code Search Dataset (20 docs, 20 queries)

| Engine | Method | P@5 | R@5 | MRR | NDCG@5 | Avg ms | P95 ms | Index ms |
|---|---|---|---|---|---|---|---|---|
| LanceDB | vector | 0.230 | 0.975 | 1.000 | 0.975 | 1.8 | 6.7 | 26 |
| LanceDB | fts | 0.220 | 0.925 | 0.925 | 0.912 | 1.2 | 2.1 | 26 |
| LanceDB | hybrid | 0.230 | 0.975 | 0.967 | 0.956 | 2.4 | 3.8 | 26 |
| LanceDB | hybrid+rerank | 0.240 | 1.000 | 1.000 | 1.000 | 3.0 | 4.3 | 26 |
| DuckDB + VSS + FTS | vector | 0.230 | 0.975 | 1.000 | 0.975 | 49.5 | 54.4 | 659 |
| DuckDB + VSS + FTS | fts | 0.220 | 0.925 | 0.925 | 0.912 | 12.5 | 24.9 | 659 |
| DuckDB + VSS + FTS | hybrid | 0.230 | 0.975 | 0.975 | 0.962 | 58.1 | 66.5 | 659 |
| DuckDB + VSS + FTS | hybrid+rerank | 0.240 | 1.000 | 1.000 | 1.000 | 69.7 | 77.4 | 659 |
| SQLite + FTS5 + sqlite-vec | vector | 0.230 | 0.975 | 1.000 | 0.975 | 1.5 | 6.4 | 10 |
| SQLite + FTS5 + sqlite-vec | fts | 0.220 | 0.925 | 0.925 | 0.912 | 0.1 | 0.5 | 10 |
| SQLite + FTS5 + sqlite-vec | hybrid | 0.230 | 0.975 | 0.975 | 0.962 | 1.3 | 1.7 | 10 |
| SQLite + FTS5 + sqlite-vec | hybrid+rerank | 0.240 | 1.000 | 1.000 | 1.000 | 5.3 | 8.2 | 10 |

Fantasy Books Dataset (20 docs, 20 queries)

| Engine | Method | P@5 | R@5 | MRR | NDCG@5 | Avg ms | P95 ms | Index ms |
|---|---|---|---|---|---|---|---|---|
| LanceDB | vector | 0.240 | 1.000 | 1.000 | 0.996 | 1.5 | 5.4 | 16 |
| LanceDB | fts | 0.240 | 1.000 | 0.950 | 0.963 | 1.2 | 2.4 | 16 |
| LanceDB | hybrid | 0.240 | 1.000 | 0.975 | 0.982 | 2.3 | 3.0 | 16 |
| LanceDB | hybrid+rerank | 0.240 | 1.000 | 1.000 | 0.994 | 3.0 | 4.6 | 16 |
| DuckDB + VSS + FTS | vector | 0.240 | 1.000 | 1.000 | 0.996 | 47.9 | 62.5 | 656 |
| DuckDB + VSS + FTS | fts | 0.240 | 1.000 | 0.950 | 0.963 | 12.0 | 18.6 | 656 |
| DuckDB + VSS + FTS | hybrid | 0.240 | 1.000 | 1.000 | 0.996 | 57.3 | 64.4 | 656 |
| DuckDB + VSS + FTS | hybrid+rerank | 0.240 | 1.000 | 1.000 | 0.994 | 69.8 | 78.3 | 656 |
| SQLite + FTS5 + sqlite-vec | vector | 0.240 | 1.000 | 1.000 | 0.996 | 1.3 | 6.6 | 5 |
| SQLite + FTS5 + sqlite-vec | fts | 0.240 | 1.000 | 0.975 | 0.974 | 0.1 | 0.3 | 5 |
| SQLite + FTS5 + sqlite-vec | hybrid | 0.240 | 1.000 | 1.000 | 0.996 | 1.7 | 3.8 | 5 |
| SQLite + FTS5 + sqlite-vec | hybrid+rerank | 0.240 | 1.000 | 1.000 | 0.994 | 5.3 | 7.8 | 5 |

Aggregate Scores

| Engine | P@5 | R@5 | MRR | NDCG@5 | Avg Latency | Avg Indexing | Quality | Perf | Combined |
|---|---|---|---|---|---|---|---|---|---|
| SQLite + FTS5 + sqlite-vec | 0.235 | 0.984 | 0.984 | 0.976 | 2.1ms | 8ms | 83.2% | 98.0% | 87.6% |
| LanceDB | 0.235 | 0.984 | 0.977 | 0.972 | 2.0ms | 21ms | 82.9% | 98.0% | 87.4% |
| DuckDB + VSS + FTS | 0.235 | 0.984 | 0.981 | 0.975 | 47.1ms | 658ms | 83.1% | 68.0% | 78.5% |

8B vs 0.6B Comparison

  • 8B models improve hybrid+rerank quality — Code Search achieves perfect 1.000 NDCG@5 across all engines (vs 0.981 with 0.6B)
  • Fantasy Books vector search improves — 0.996 NDCG@5 with 8B vs 0.958 with 0.6B
  • DuckDB latency increases significantly with 4096d vectors — ~50ms vector search vs ~12ms at 1024d
  • SQLite and LanceDB handle 4× larger vectors well — latency increase is minimal (1–5ms range)
  • Overall ranking unchanged — SQLite > LanceDB > DuckDB regardless of model size

Reranking Impact Analysis

Comparing hybrid search quality with and without the reranker, and across model sizes (averaged across all three engines):

| Configuration | Code MRR | Code NDCG@5 | Fantasy MRR | Fantasy NDCG@5 |
|---|---|---|---|---|
| 0.6B hybrid (no rerank) | 0.961 | 0.952 | 0.983 | 0.988 |
| 0.6B hybrid+rerank | 1.000 | 0.981 | 1.000 | 0.996 |
| 8B hybrid (no rerank) | 0.972 | 0.960 | 0.992 | 0.991 |
| 8B hybrid+rerank | 1.000 | 1.000 | 1.000 | 0.994 |

Key observations:

  • Reranking consistently improves quality — MRR jumps to perfect 1.000 in all cases, meaning the correct document is always ranked first after reranking.
  • 8B without reranker ≈ 0.6B with reranker on Code Search — 8B hybrid achieves 0.960 NDCG@5 vs 0.6B hybrid+rerank at 0.981. The gap is small enough that for latency-sensitive applications, using a larger embedding model without reranking can be a viable alternative to a smaller model with reranking.
  • Fantasy Books benefits less from reranking — even 0.6B hybrid without reranking already scores 0.988 NDCG@5, leaving little room for improvement. The natural language in fiction is easier to match than code.
  • The biggest quality gain from reranking is on Code Search with 0.6B — NDCG@5 jumps from 0.952 → 0.981 (+3.0%), while 8B reranking pushes it to a perfect 1.000.
  • Diminishing returns at 8B — the reranker adds less value with 8B embeddings since the base retrieval is already stronger, especially on Fantasy Books where 8B hybrid+rerank (0.994) actually scores marginally lower than 8B hybrid alone on some engines.

Key Takeaways

  • Quality is virtually identical across all three engines (~0.97 MRR, ~0.97 NDCG@5 on hybrid+rerank). The reranker equalizes any quality differences.
  • SQLite is the fastest — 0.1–1.0ms per query, 4–5ms indexing. ~40× faster than DuckDB, ~4× faster than LanceDB.
  • LanceDB has the best hybrid search quality on Fantasy Books — perfect 1.000 NDCG@5 on hybrid without reranker.
  • DuckDB is slowest — 12–28ms per query and 211–230ms indexing with the 0.6B models (and roughly 2–3× slower again at 4096 dimensions), but still fast in absolute terms.

Recommendations

| Use Case | Recommendation |
|---|---|
| Maximum raw speed | SQLite + FTS5 + sqlite-vec |
| Best developer ergonomics | LanceDB (simplest API, built-in hybrid) |
| Analytics + search combo | DuckDB + VSS + FTS |
| Production RAG pipeline | Any — quality is identical with reranker; pick based on ecosystem fit |