Introduce Hybrid Search API using SQLite FTS5 + Vector search #1158

Open
varshaprasad96 opened this issue Feb 19, 2025 · 3 comments
Labels
enhancement New feature or request

varshaprasad96 commented Feb 19, 2025

🚀 Describe the new functionality needed

Llama-Stack currently supports optimized chunked writes (PR #1094) for efficient SQLite-based storage. However, it has no built-in Hybrid Search API that combines FTS5 (lexical retrieval) with sqlite-vec (semantic retrieval).

This issue proposes the addition of a Hybrid Search API that allows users to:

  1. Store text documents with both a full-text index and vector embeddings.
  2. Perform hybrid search that ranks results by combining BM25-based text relevance and vector similarity.
  3. Use chunked writes (from PR #1094, "feat: Chunk sqlite-vec writes") to optimize insertions for large datasets.
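
As a rough illustration of the lexical half of the list above, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes the SQLite build has FTS5 compiled in. The table name `docs_fts`, the helper names, and the batch size are hypothetical and not taken from Llama-Stack or PR #1094. The vector side is left out because it depends on a loadable extension.

```python
import sqlite3


def batched(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_index(docs, batch_size=2):
    """Create an in-memory FTS5 table and insert (doc_id, content) rows in chunks."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: doc_id is stored but not tokenized.
    conn.execute(
        "CREATE VIRTUAL TABLE docs_fts USING fts5(doc_id UNINDEXED, content)"
    )
    for batch in batched(docs, batch_size):
        with conn:  # one transaction per chunk
            conn.executemany(
                "INSERT INTO docs_fts (doc_id, content) VALUES (?, ?)", batch
            )
    return conn


def keyword_search(conn, query, k=5):
    """Return doc_ids matching `query`, best BM25 match first."""
    rows = conn.execute(
        # bm25() returns smaller values for better matches, so sort ascending.
        "SELECT doc_id FROM docs_fts WHERE docs_fts MATCH ? "
        "ORDER BY bm25(docs_fts) LIMIT ?",
        (query, k),
    ).fetchall()
    return [row[0] for row in rows]


docs = [
    ("a", "sqlite full text search with fts5"),
    ("b", "vector embeddings for semantic retrieval"),
    ("c", "hybrid search combines text and vector results"),
]
conn = build_index(docs)
print(keyword_search(conn, "vector"))
```

Committing once per chunk keeps each transaction small for large datasets, which is the same motivation as the chunked writes mentioned in item 3.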

Ref: https://github.com/liamca/sqlite-hybrid-search/tree/main. The idea would be to apply Reciprocal Rank Fusion (RRF) to the FTS5 and vector-based search results, so that documents ranked highly in both lists are prioritized.
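
To make the fusion step concrete, here is a small sketch of Reciprocal Rank Fusion over two ranked lists of document IDs. Each document scores the sum of `1 / (k + rank)` across the lists it appears in. The function name and the constant `k=60` (a commonly used default) are assumptions for illustration, not something the referenced repository or Llama-Stack prescribes.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list, best first.

    rankings: iterable of lists, each ordered from most to least relevant.
    k: smoothing constant; larger values flatten the gap between ranks.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by doc ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


fts_ranking = ["a", "b", "c"]     # e.g. BM25 order from FTS5
vector_ranking = ["b", "c", "d"]  # e.g. nearest-neighbour order
print(reciprocal_rank_fusion([fts_ranking, vector_ranking]))
# → ['b', 'c', 'a', 'd']
```

Documents "b" and "c" appear in both lists, so they outrank "a" even though "a" is first in the FTS5 list.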

💡 Why is this needed? What if we don't build it?

Hybrid search with RRF should give more accurate and more relevant results from Llama-Stack's current SQLite vector DB implementation.

Other thoughts

No response

@varshaprasad96 varshaprasad96 added the enhancement New feature or request label Feb 19, 2025
@varshaprasad96
Author

cc: @franciscojavierarceo

@franciscojavierarceo
Contributor

We could probably make the choice between these configurable.

@varshaprasad96
Author

/assign @varshaprasad96
