Replies: 3 comments
-
Yes, it would be interesting to try it out.
-
They say that a large variance between the norms affects retrieval performance, so it is more an argument about how large variance degrades codebook performance. Also, another paper on assessing the performance of compressed embeddings (https://arxiv.org/pdf/1909.01264) advocates uniform quantization over k-means as a coarse quantizer. So I was wondering whether uniformly quantizing the scalar component (as in multiscale quantization) and then using that bucket to get the quantized residual could lead to better retrieval. I will test it out with what is currently available in FAISS, and if it improves retrieval, I will open a thread on how to implement it in FAISS for better runtime performance. Thanks!
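For concreteness, here is a minimal sketch of that bucketed-norm idea in NumPy plus the FAISS `ProductQuantizer`. It omits the learned rotation from the MSQ paper, assumes nonzero norms, and the parameter names (`n_scale_buckets`, `pq_m`) are illustrative, not from any FAISS API:

```python
import numpy as np
import faiss

def train(x, n_scale_buckets=16, pq_m=8):
    # Factor each vector into a scalar norm and a unit-norm direction.
    norms = np.linalg.norm(x, axis=1)
    directions = np.ascontiguousarray(x / norms[:, None], dtype='float32')

    # Uniform (equal-width) quantization grid over the norms.
    edges = np.linspace(norms.min(), norms.max(), n_scale_buckets + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Product-quantize the directions; their statistics no longer
    # depend on the magnitudes of the original vectors.
    pq = faiss.ProductQuantizer(x.shape[1], pq_m, 8)
    pq.train(directions)
    return edges, centers, pq

def encode(x, edges, centers, pq):
    norms = np.linalg.norm(x, axis=1)
    directions = np.ascontiguousarray(x / norms[:, None], dtype='float32')
    # Bucket each norm on the uniform grid, clipping to valid indices.
    buckets = np.clip(np.digitize(norms, edges) - 1, 0, len(centers) - 1)
    return buckets, pq.compute_codes(directions)

def decode(buckets, codes, centers, pq):
    # Reconstruction = quantized scale * quantized direction.
    return centers[buckets][:, None] * pq.decode(codes)
```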
-
Sure. NB that many clustering variants can be implemented in Python without much performance impact; see e.g. the k-means implementation in https://github.com/facebookresearch/faiss/blob/main/contrib/clustering.py#L330
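A sketch of how that contrib module can be driven, assuming the `DatasetAssign`/`kmeans` API as shipped in FAISS 1.7.x:

```python
import numpy as np
from faiss.contrib import clustering

xt = np.random.rand(10000, 64).astype('float32')

# DatasetAssign keeps the expensive assignment step in C++, while the
# k-means iteration loop stays in Python and is easy to modify.
data = clustering.DatasetAssign(xt)
centroids = clustering.kmeans(16, data, niter=20)
```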
Summary
Is multiscale quantization (https://papers.nips.cc/paper_files/paper/2017/hash/b6617980ce90f637e68c3ebe8b9be745-Abstract.html) supported? I have been reading the FAISS code, but so far it seems that it is not, and there doesn't seem to be a straightforward way to implement it in Python without significantly affecting performance.
Any suggestions on the fastest way to add support for it (if it is not supported)? Are there alternative solutions that deal with the problem of large variance in the norms of the data points? If it is not supported, why not?
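For reference, the closest workaround I can see with the current API is to normalize away the norm variance before PQ, e.g. via `IndexPreTransform` with a `NormalizationTransform` (a sketch; the norms would have to be stored and re-applied separately):

```python
import faiss

d = 128
# L2-normalize before PQ so the codebooks only model directions;
# the per-vector norms must be kept and re-applied outside the index.
transform = faiss.NormalizationTransform(d, 2.0)
index = faiss.IndexPreTransform(transform, faiss.IndexPQ(d, 16, 8))
```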
Platform
Faiss version: 1.7.4
Running on:
Interface: