# Benchmark results

## Validation set

| Model | Accuracy | Mean n-gram overlap |
| --- | --- | --- |
| **MixedBread** | | |
| Baseline | 46.24% | 0.5093 |
| **20% overlap** | | |
| Chunk Size 100 / Overlap 20 | 33.08% | 0.3881 |
| Chunk Size 200 / Overlap 40 | 47.37% | 0.4970 |
| Chunk Size 400 / Overlap 80 | 54.89% | 0.5516 |
| Chunk Size 800 / Overlap 160 | 62.41% | 0.6137 |
| **40% overlap** | | |
| Chunk Size 100 / Overlap 40 | 34.59% | 0.4024 |
| Chunk Size 200 / Overlap 80 | 47.37% | 0.5027 |
| Chunk Size 400 / Overlap 160 | 56.39% | 0.5665 |
| Chunk Size 800 / Overlap 320 | 63.91% | 0.6279 |
| **60% overlap** | | |
| Chunk Size 100 / Overlap 60 | 34.59% | 0.4140 |
| Chunk Size 200 / Overlap 120 | 50.38% | 0.5255 |
| Chunk Size 400 / Overlap 240 | 53.76% | 0.5514 |
| Chunk Size 800 / Overlap 480 | 55.26% | 0.5686 |
| **80% overlap** | | |
| Chunk Size 100 / Overlap 80 | 30.83% | 0.3734 |
| Chunk Size 200 / Overlap 160 | 45.11% | 0.4819 |
| Chunk Size 400 / Overlap 320 | 57.14% | 0.5738 |
| Chunk Size 800 / Overlap 640 | 56.02% | 0.5575 |
| **Focus search** | | |
| Chunk Size 500 / Overlap 200 | 56.39% | 0.5751 |
| Chunk Size 600 / Overlap 240 | 57.52% | 0.5838 |
| Chunk Size 500 / Overlap 250 | 57.52% | 0.5753 |
| Chunk Size 600 / Overlap 300 | 60.53% | 0.6074 |
| Chunk Size 500 / Overlap 300 | 57.14% | 0.5797 |
| Chunk Size 600 / Overlap 360 | 62.41% | 0.6216 |
| **Stella** | | |
| Chunk Size 600 / Overlap 240 | 61.65% | |
| Chunk Size 500 / Overlap 200 | 60.90% | |
| Chunk Size 600 / Overlap 360 | 60.53% | |
| Chunk Size 500 / Overlap 250 | 59.77% | |
| Chunk Size 600 / Overlap 300 | 58.27% | |
| Chunk Size 500 / Overlap 300 | 55.26% | |
| **GTE (Embedding dimension 1024)** | | |
| Chunk Size 600 / Overlap 360 | 63.16% | |
| Chunk Size 600 / Overlap 300 | 60.90% | |
| Chunk Size 500 / Overlap 250 | 59.40% | |
| Chunk Size 500 / Overlap 300 | 57.52% | |
| Chunk Size 600 / Overlap 240 | 57.14% | |
| Chunk Size 500 / Overlap 200 | 56.02% | |
| **MixedBread (Embedding dimension 1024)** | | |
| Chunk Size 600 / Overlap 360 | 64.29% | |
| **BM25** | | |
| Top-k = 5 | 60.03% | |
| Top-k = 30 | 76.46% | |
| **Hybrid Search Sequential** | | |
| Top-k = 5 | 65.79% | |
| **Hybrid Search RRF** | | |
| Top-k = 5 | 67.67% | |
| Top-k = 20 | 79.70% | |
| Top-k = 5, Top-l = 50 | 61.65% | |
| **Hybrid Search Linear Combination** | | |
| Top-k = 5 | 64.29% | |
| Top-k = 20 | 80.83% | |
| Top-k = 5, Top-l = 100 | 70.68% | |
| **BM25 + Re-ranker** | | |
| Top-k = 5, Top-l = 40 | 72.5% | |
| **Hybrid RRF + Re-ranker** | | |
| Top-k = 5, Top-l = 20 | 72.5% | |
| **Hybrid LC + Re-ranker** | | |
| Top-k = 5, Top-l = 20 | 72.18% | |
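
For readers unfamiliar with the row labels, the sketch below illustrates one common reading of "Chunk Size / Overlap", "Hybrid Search RRF", and "Hybrid Search Linear Combination". It is not the code that produced these numbers. Everything in it is an assumption: the function names `chunk_with_overlap`, `rrf_fuse`, and `linear_fuse` are hypothetical, chunk size is treated as a token count (the unit is not stated here), the RRF constant `k=60` is the conventional default rather than a value taken from this benchmark, and the linear combination uses min-max normalisation with a hypothetical weight `alpha`.

```python
from collections import defaultdict


def chunk_with_overlap(tokens, chunk_size, overlap):
    """Split a token list into windows of `chunk_size` sharing `overlap` tokens.

    Assumption: sizes are counted in tokens; the benchmark may use another unit.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the text
    return chunks


def rrf_fuse(rankings, k=60, top_k=5):
    """Reciprocal rank fusion: score(d) = sum over rankings of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_k]


def linear_fuse(sparse_scores, dense_scores, alpha=0.5, top_k=5):
    """Weighted sum of min-max normalised sparse and dense scores.

    Assumption: `alpha` weights the sparse (e.g. BM25) side; a document
    missing from one retriever contributes 0 for that side.
    """
    def normalise(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: (s - lo) / span if span else 0.0 for d, s in scores.items()}

    sparse, dense = normalise(sparse_scores), normalise(dense_scores)
    combined = {
        doc: alpha * sparse.get(doc, 0.0) + (1 - alpha) * dense.get(doc, 0.0)
        for doc in set(sparse) | set(dense)
    }
    return sorted(combined, key=lambda d: (-combined[d], d))[:top_k]
```

Under this reading, "Chunk Size 600 / Overlap 360" would correspond to `chunk_with_overlap(tokens, 600, 360)`, and a "Top-k = 5" hybrid row to fusing a BM25 ranking with an embedding ranking and keeping the first five documents.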

## Test set

| Model | Accuracy | Mean n-gram overlap |
| --- | --- | --- |
| Baseline | 0.46 % | 0.4706 |