-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Weaviate embeddings rework #77
Conversation
…mb. processing methods
examples/utils/torch_utils.py
Outdated
return model | ||
|
||
|
||
def load_model_resnet50() -> nn.Module: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nicer to merge these two functions together:
def load_model_resnet50(discard_last_layer: bool):
...
I'll review later but it would be nice to add type hints to all the function definitions - I noticed them occasionally but not consistently. |
@@ -83,7 +78,7 @@ def _plot_kde(xs, s, density, maxima, minima): | |||
plt.show() | |||
|
|||
|
|||
def kde_peaks(data, bandwidth="scott", plot=False): | |||
def kde_peaks(data: np.ndarray, bandwidth: Union[str, float] = "scott", plot: bool = False) -> tuple[np.ndarray, np.ndarray, int, float]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK luxonis-ml has python3.8>= requirement and for python3.8 tuple
type annotation should be switched with Tuple
or it will throw an error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to this, it would be nice to create at least a very dummy test case for each added module that at least imports everything, so it catches errors like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - I like the processing example!
Requesting minor change due to a typing error which breaks the import in the notebooks. And then not sure if my Weaviate issue is unique to me or not.
plot=False, | ||
): | ||
def find_similar( | ||
reference_embeddings: Union[str, List[str], List[List[float], np.ndarray]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reference_embeddings: Union[str, List[str], List[List[float], np.ndarray]], | |
reference_embeddings: Union[str, List[str], List[List[float]], np.ndarray] |
I'm not sure if this should be the type or not, but the current typing throws an error at import in the example notebook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get this error once I get to the weaviate section of the notebook. Not sure if it would be a simple fix or that something could be wrong in my environment. If there is additional weaviate setup besides the embeddings requirements we should try to document it somewhere.
WeaviateStartUpError: Weaviate did not start up in 5 seconds. Either the Weaviate URL http://localhost:8080/ is wrong or Weaviate did not start up in the interval given in 'startup_period'.
|
||
return img_transposed | ||
|
||
def get_image_tensors_from_LFS( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Minor): Maybe get_image_tensors_from_remote
or even get_image_tensors_from_lfs
would be a better function name
within a given set of embeddings. The removal process uses Kernel Density | ||
Estimation (KDE) on embeddings cosine similarity for optimal split, making | ||
this approach particularly suited for embeddings in high dimensional spaces. | ||
**Overview:** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use B{Overview:}
for bold text instead of asterisks, see epytext documentation for more details.
|
||
- For Weaviate-specific examples, refer to the provided code examples. | ||
- The `search_vectordb` function can be used with either Qdrant or Weaviate, depending on the provided `vectordb_api` object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, C{search_vectordb}
instead of backticks.
Pre-commit hooks passed, but pre-commit CI test still failsRan the pre-commits as described here. And got the following output when commiting:
But the CI/ call-reusable-workflow/pre-commit (pull_request) test still fails. |
The |
Test Results17 tests 17 ✅ 0s ⏱️ Results for commit 7b63170. ♻️ This comment has been updated with latest results. |
☂️ Python Coverage
Overall Coverage
New Files
Modified Files
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good now - the import issue is fixed
This rework contains:
Note: currently supports only cosine similarity as a distance score, as it is nicely limited between 0 and 1.