🚀 Describe the new functionality needed

faiss is purely in-memory, while sqlite-vec persists to disk. Understanding and documenting the latency/memory tradeoffs on a single machine is likely sufficient to give users a sense of the pros and cons of each VectorDB.
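A benchmark harness for this mainly needs to time the index-build and query phases and record peak memory. Here is a minimal sketch using a brute-force NumPy scan as a stand-in backend; the `bench` helper and its `build`/`search` callables are illustrative placeholders (not part of the faiss or sqlite-vec APIs) that a real adapter for either library would slot into.

```python
import time
import tracemalloc

import numpy as np

DIM = 64


def bench(build, search, n_queries=100):
    """Time index build and per-query latency; report peak Python-heap memory.

    `build()` returns an index object; `search(index, query)` returns
    neighbor ids. Both are placeholders for a real faiss / sqlite-vec adapter.
    """
    tracemalloc.start()
    t0 = time.perf_counter()
    index = build()
    build_s = time.perf_counter() - t0

    rng = np.random.default_rng(0)
    queries = rng.standard_normal((n_queries, DIM)).astype(np.float32)
    t0 = time.perf_counter()
    for q in queries:
        search(index, q)
    query_ms = (time.perf_counter() - t0) * 1000 / n_queries
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {"build_s": build_s, "query_ms": query_ms, "peak_mb": peak / 2**20}


# Brute-force stand-in for an actual vector index: the "index" is just
# the raw matrix, and search is an exact L2 scan over all rows.
VECTORS = np.random.default_rng(1).standard_normal((10_000, DIM)).astype(np.float32)

stats = bench(
    build=lambda: VECTORS,
    search=lambda idx, q: np.argsort(((idx - q) ** 2).sum(axis=1))[:10],
)
print(stats)
```

Note that `tracemalloc` only sees Python-heap allocations; for faiss's native memory a process-level measure (e.g. RSS) would be needed.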
Additionally, as sqlite-vec gains new functionality (e.g., #1158), it would be useful to have a benchmark dataset for evaluating retrieval efficacy. We could use, for example, the CISI dataset for Information Retrieval.
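For efficacy on a dataset like CISI, the core metric is simple: for each query, compare the ids returned by the VectorDB against the relevance judgments. A sketch of recall@k (the function name, toy qrels, and result lists below are illustrative, not from any of the libraries above):

```python
def recall_at_k(retrieved, relevant, k=10):
    """Fraction of relevant doc ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


# Toy example: relevance judgments (qrels) for two queries, plus the
# ranked ids a hypothetical VectorDB returned for each.
qrels = {"q1": {3, 7}, "q2": {5}}
results = {"q1": [7, 1, 3, 9], "q2": [2, 4, 8]}

scores = {q: recall_at_k(results[q], qrels[q], k=3) for q in qrels}
print(scores)  # → {'q1': 1.0, 'q2': 0.0}
```

The same loop would run unchanged over faiss and sqlite-vec results, making the efficacy comparison backend-agnostic.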
💡 Why is this needed? What if we don't build it?
This is needed to give users more options when using RAG. Some users may have a large set of documents, and documenting the tradeoffs may help them make a better-informed decision.
Other thoughts
No response
Thanks! It would be great to have a comprehensive comparison of the available VectorDBs. I think starting off with faiss vs. sqlite-vec benchmarks would also guide our decision on which vector_db to provide in the default templates.
For anyone planning to work on this, I would note that there is an open issue (#1082) to add inline Qdrant support too. I think that could also be a serious contender for inclusion in the default templates, but I agree that we'd want head-to-head benchmarks for speed and scalability to make an informed decision.