Replies: 3 comments
-
Yes, it would be interesting to try it out.
-
They say that a large variance between the norms affects retrieval performance, so it is more an argument about how large variance degrades codebook performance. Also, another paper on assessing the performance of compressed embeddings (https://arxiv.org/pdf/1909.01264) advocates uniform quantization over k-means as a coarse quantizer. So I was wondering whether uniformly quantizing the scalar component (as in multiscale quantization) and then using that bucket to get the quantized residual could lead to better retrieval. I will test it out with what is currently available in FAISS, and if it improves retrieval, I will open a thread on how to implement it in FAISS for better runtime performance. Thanks!
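For concreteness, here is a minimal sketch of that bucketed-norm idea in NumPy plus the FAISS `ProductQuantizer`. It omits the learned rotation from the MSQ paper, assumes nonzero norms, and the parameter names (`n_scale_buckets`, `pq_m`) are illustrative, not from any FAISS API:

```python
import numpy as np
import faiss

def train(x, n_scale_buckets=16, pq_m=8):
    # Factor each vector into a scalar norm and a unit-norm direction.
    norms = np.linalg.norm(x, axis=1)
    directions = np.ascontiguousarray(x / norms[:, None], dtype='float32')

    # Uniform (equal-width) quantization grid over the norms.
    edges = np.linspace(norms.min(), norms.max(), n_scale_buckets + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Product-quantize the directions; their statistics no longer
    # depend on the magnitudes of the original vectors.
    pq = faiss.ProductQuantizer(x.shape[1], pq_m, 8)
    pq.train(directions)
    return edges, centers, pq

def encode(x, edges, centers, pq):
    norms = np.linalg.norm(x, axis=1)
    directions = np.ascontiguousarray(x / norms[:, None], dtype='float32')
    # Bucket each norm on the uniform grid, clipping to valid indices.
    buckets = np.clip(np.digitize(norms, edges) - 1, 0, len(centers) - 1)
    return buckets, pq.compute_codes(directions)

def decode(buckets, codes, centers, pq):
    # Reconstruction = quantized scale * quantized direction.
    return centers[buckets][:, None] * pq.decode(codes)
```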
-
Sure. NB that many clustering variants can be implemented in Python without much performance impact; see e.g. the k-means implementation in https://github.com/facebookresearch/faiss/blob/main/contrib/clustering.py#L330
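A sketch of how that contrib module can be driven, assuming the `DatasetAssign`/`kmeans` API as shipped in FAISS 1.7.x:

```python
import numpy as np
from faiss.contrib import clustering

xt = np.random.rand(10000, 64).astype('float32')

# DatasetAssign keeps the expensive assignment step in C++, while the
# k-means iteration loop stays in Python and is easy to modify.
data = clustering.DatasetAssign(xt)
centroids = clustering.kmeans(16, data, niter=20)
```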
Summary
Is multiscale quantization (https://papers.nips.cc/paper_files/paper/2017/hash/b6617980ce90f637e68c3ebe8b9be745-Abstract.html) supported? I have been reading the FAISS code, but so far it seems that it is not, and there doesn't seem to be a straightforward way to implement it in Python without significantly affecting performance.
Any suggestions on the fastest way to add support for it (if it is not supported)? Are there alternative solutions that deal with the problem of large variance in the norms of the data points? If it is not supported, why not?
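For reference, the closest workaround I can see with the current API is to normalize away the norm variance before PQ, e.g. via `IndexPreTransform` with a `NormalizationTransform` (a sketch; the norms would have to be stored and re-applied separately):

```python
import faiss

d = 128
# L2-normalize before PQ so the codebooks only model directions;
# the per-vector norms must be kept and re-applied outside the index.
transform = faiss.NormalizationTransform(d, 2.0)
index = faiss.IndexPreTransform(transform, faiss.IndexPQ(d, 16, 8))
```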
Platform
Faiss version: 1.7.4
Running on:
Interface: