feat: check document store and retriever dimensions before calculating embeddings for all documents #7357
Related Issues
Proposed Changes:
When `document_store.update_embeddings()` gets called, the embeddings are calculated for all the documents and then saved to the document store. If the embedding dimension set in the document store differs from that of the retriever model, a runtime error is raised.
How did you test it?
- `test_faiss.py`
- `basic_faq_pipeline.py` to test the change.
  - Used `FAISSDocumentStore` instead of `ElasticsearchDocumentStore`. I instantiated the document store without the `embedding_dim` parameter, so the default value `768` is set in the document store, making the embedding dimensions of the retriever and the document store different.
Checklist
`fix:`, `feat:`, `build:`, `chore:`, `ci:`, `docs:`, `style:`, `refactor:`, `perf:`, `test:`.
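For reviewers who want the idea in isolation: the check named in the PR title (compare the document store and retriever dimensions before embedding anything) might look roughly like the sketch below. `ToyDocumentStore` and `ToyRetriever` are hypothetical stand-ins, not the library's classes, and comparing two `embedding_dim` attributes is an assumption about how the check could be done, not a description of the actual diff. Only the default of `768` and the `update_embeddings()` name come from the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ToyRetriever:
    """Hypothetical retriever that produces fixed-size dummy embeddings."""
    embedding_dim: int
    calls: int = 0  # counts how many documents were actually embedded

    def embed_documents(self, docs: List[str]) -> List[List[float]]:
        self.calls += len(docs)
        return [[0.0] * self.embedding_dim for _ in docs]


@dataclass
class ToyDocumentStore:
    """Hypothetical in-memory store; 768 mirrors the default mentioned above."""
    embedding_dim: int = 768
    documents: List[str] = field(default_factory=list)
    embeddings: Dict[int, List[float]] = field(default_factory=dict)

    def update_embeddings(self, retriever: ToyRetriever) -> None:
        # Fail fast: compare dimensions before embedding any document,
        # so a mismatch costs nothing instead of a full embedding pass.
        if retriever.embedding_dim != self.embedding_dim:
            raise RuntimeError(
                f"Embedding dimension mismatch: document store expects "
                f"{self.embedding_dim}, retriever produces {retriever.embedding_dim}."
            )
        vectors = retriever.embed_documents(self.documents)
        for index, vector in enumerate(vectors):
            self.embeddings[index] = vector


if __name__ == "__main__":
    store = ToyDocumentStore(documents=["first doc", "second doc"])
    try:
        store.update_embeddings(ToyRetriever(embedding_dim=384))
    except RuntimeError as err:
        print(err)
```

With the early check, a mismatched retriever is rejected before `embed_documents` runs, which is the point of the change for large document sets.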