feat: add Milvus vectorDB #1171
base: main
Conversation
Hi @zc277584121! Thank you for your pull request and welcome to our community.

Action Required: In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process: In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA. Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!
@zc277584121 Thank you! @franciscojavierarceo given your prior experience on this, would you like to help review this PR?
yeah of course!
@@ -25,7 +25,7 @@ We are working on adding a few more APIs to complete the application lifecycle.

The goal of Llama Stack is to build an ecosystem where users can easily swap out different implementations for the same API. Examples for these include:
- LLM inference providers (e.g., Fireworks, Together, AWS Bedrock, Groq, Cerebras, SambaNova, etc.),
- Vector databases (e.g., ChromaDB, Weaviate, Qdrant, FAISS, PGVector, etc.),
Please also update the docs section under https://github.com/meta-llama/llama-stack/tree/main/docs/source/providers/vector_io/. See this PR as a reference: #1195
async def get_adapter_impl(config: MilvusVectorIOConfig, deps: Dict[Api, ProviderSpec]):
    from .milvus import MilvusVectorIOAdapter
I think the inline and remote implementations are equivalent. If that is indeed the case, can you follow the pattern used by Chroma? In short, they just import the remote implementation in the __init__.py and don't have an inline chromadb.py file, as in the sketch below.
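For reference, a minimal sketch of that Chroma-style pattern, where the inline `__init__.py` simply delegates to the remote adapter; the exact import paths, constructor arguments, and `initialize()` call are assumptions for illustration, not the actual implementation:

```python
# inline/vector_io/milvus/__init__.py -- sketch of the Chroma-style delegation pattern.
from typing import Dict

from llama_stack.providers.datatypes import Api, ProviderSpec  # assumed import path

from .config import MilvusVectorIOConfig  # assumed config module


async def get_adapter_impl(config: MilvusVectorIOConfig, deps: Dict[Api, ProviderSpec]):
    # Reuse the remote implementation instead of keeping a separate inline milvus.py.
    from llama_stack.providers.remote.vector_io.milvus.milvus import MilvusVectorIOAdapter

    impl = MilvusVectorIOAdapter(config, deps.get(Api.inference))  # assumed constructor signature
    await impl.initialize()
    return impl
```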
data = []
for i, (chunk, embedding) in enumerate(zip(chunks, embeddings, strict=False)):
    chunk_id = f"{chunk.metadata['document_id']}:chunk-{i}"
It's probably better to generate the chunk ID from the text and the document ID, in case the order of insertion changes. See this function: https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/inline/vector_io/sqlite_vec/sqlite_vec.py#L234
Maybe that can be moved to a utils folder.
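A deterministic helper along those lines might look like this; a sketch assuming the same hash-of-document-ID-and-text approach as the referenced sqlite_vec function, not a verbatim copy of it:

```python
import hashlib
import uuid


def generate_chunk_id(document_id: str, chunk_text: str) -> str:
    # Derive a stable chunk ID from the document ID and chunk text,
    # so the ID does not depend on insertion order.
    hash_input = f"{document_id}:{chunk_text}".encode("utf-8")
    return str(uuid.UUID(hashlib.md5(hash_input).hexdigest()))
```

Usage would then be `chunk_id = generate_chunk_id(chunk.metadata["document_id"], chunk.content)` instead of the index-based f-string above.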
"chunk_content": chunk.model_dump(), | ||
} | ||
) | ||
self.client.insert( |
Please add some error handling and log the collection name if the insertion fails.
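For example, a minimal sketch of wrapping the insert call, assuming a module-level `logger` is available in the adapter:

```python
try:
    self.client.insert(self.collection_name, data=data)
except Exception as e:
    # Surface which collection failed so the error can be debugged.
    logger.error(f"Error inserting chunks into Milvus collection {self.collection_name}: {e}")
    raise
```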
Thank you for this! Very excited to see Milvus. I left some initial feedback; could you make some changes?
Also, could you update your PR description with the output of your test run?
f"Chunk length {len(chunks)} does not match embedding length {len(embeddings)}" | ||
) | ||
if not self.client.has_collection(self.collection_name): | ||
self.client.create_collection(self.collection_name, dimension=len(embeddings[0]), auto_id=True) |
I know bounded staleness is the OOTB consistency configuration. It may be useful to make that explicit. Alternatively, I could see strong consistency being a useful default.
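For instance, a sketch of making it explicit at collection creation, assuming pymilvus's `MilvusClient.create_collection` accepts a `consistency_level` argument ("Bounded" is Milvus's out-of-the-box default; "Strong" is the stricter alternative):

```python
if not self.client.has_collection(self.collection_name):
    self.client.create_collection(
        self.collection_name,
        dimension=len(embeddings[0]),
        auto_id=True,
        consistency_level="Strong",  # or "Bounded" to make the default explicit
    )
```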
Sure, I will fix them soon.
Signed-off-by: ChengZi <[email protected]>
@franciscojavierarceo I have updated this PR to address all the problems you mentioned; please check again, thanks.
I have updated the PR description and uploaded my end-to-end test logs to it; please check.
What does this PR do?
feat: add Milvus vectorDB
note: I use MilvusClient to implement it instead of AsyncMilvusClient, because when I tested AsyncMilvusClient it raised event-loop issues; I think the AsyncMilvusClient SDK is not yet robust enough to be compatible with the llama_stack framework.
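For context, one common way to use the synchronous client from async adapter code without blocking the event loop is to offload calls to a worker thread; a minimal sketch, where the adapter shape and `insert_rows` method name are assumptions for illustration:

```python
import asyncio

from pymilvus import MilvusClient


class MilvusVectorIOAdapter:
    def __init__(self, uri: str, collection_name: str) -> None:
        self.client = MilvusClient(uri=uri)
        self.collection_name = collection_name

    async def insert_rows(self, data: list[dict]) -> None:
        # Run the blocking pymilvus call in a worker thread so the event loop stays free.
        await asyncio.to_thread(self.client.insert, self.collection_name, data)
```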
Test Plan
Passed the unit tests and end-to-end tests.
Here are my end-to-end test logs, including the client code, client log, and server logs from both inline and remote settings:
test_end2end_logs.zip