The current indexer uses the full set of vectors to generate the trees. The advantage is that the split planes represent the vectors in the database. The downside is that it takes a lot of memory, even though the vectors are memory-mapped.
So, to reduce the memory used to index the vectors, we should use only a subset of the vectors to generate the tree, and then add the remaining vectors (those not used to create the planes) back into the tree using incremental insertion. I don't know whether the planes should be refined afterwards, but probably not.
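As a rough illustration of the idea above, here is a minimal sketch in Python. It is not the indexer's actual code: the helper names (`two_means`, `split_plane`, `side`, `build_split`) and the simplified two-means loop are assumptions made for the example. The plane is computed from a sample only, and every vector, sampled or not, is then routed to one side of that plane without refining it.

```python
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def dist2(u, v):
    return sum((x - y) ** 2 for x, y in zip(u, v))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def two_means(sample, rng, iterations=10):
    """Simplified two-means (assumption): returns two centroids of the sample."""
    a, b = (list(v) for v in rng.sample(sample, 2))
    for _ in range(iterations):
        left, right = [], []
        for v in sample:
            (left if dist2(v, a) <= dist2(v, b) else right).append(v)
        if left:
            a = mean(left)
        if right:
            b = mean(right)
    return a, b

def split_plane(a, b):
    """Hyperplane halfway between the two centroids, as (normal, offset)."""
    normal = [x - y for x, y in zip(a, b)]
    midpoint = [(x + y) / 2 for x, y in zip(a, b)]
    return normal, -dot(normal, midpoint)

def side(plane, v):
    """True if v falls on the side of the first centroid."""
    normal, offset = plane
    return dot(normal, v) + offset >= 0

def build_split(vectors, sample_size, rng):
    """Compute the plane from a sample, then route every vector through it."""
    ids = sorted(vectors)
    k = min(max(sample_size, 2), len(ids))
    sample = [vectors[i] for i in rng.sample(ids, k)]
    plane = split_plane(*two_means(sample, rng))
    left, right = [], []
    # Incremental insertion: the plane is not refined, vectors are only routed.
    for i in ids:
        (left if side(plane, vectors[i]) else right).append(i)
    return plane, left, right
```

Only the sample needs to be resident while the plane is computed; the remaining vectors can be streamed one at a time during the routing loop.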
Here's the list of steps we're going to take to solve this issue:
1. Reproduce the issue.
2. Try multiple strategies to mitigate the issue by reducing the set of items used in two_means, from the easiest to the most complex:
   - Naively select a random set of candidates. We might not get as many vectors as we want; we should log that.
   - Load a contiguous set of candidates (e.g. all ids between rand and rand+nb_vectors_that_fits_in_RAM). This might decrease the relevancy.
   - Load a random set of vectors and, for each vector, load all the contiguous vectors on the same page. This should increase the number of vectors we can load without increasing the RAM or read consumption.
3. If these mitigations are not enough, we need to meet up again; another solution may be to do multiple incremental updates instead of one large update.
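The three candidate-selection strategies listed above could be sketched as follows. This is a hedged Python illustration, not the project's implementation: the function names, the `budget` parameter (how many vectors fit in RAM), and `vectors_per_page` are all hypothetical, and it assumes vectors are stored contiguously in id order so that a page holds a fixed number of consecutive items.

```python
import logging
import random

log = logging.getLogger("indexer-sketch")

def random_candidates(ids, budget, rng):
    """Strategy 1: uniform random sample of item ids."""
    chosen = rng.sample(ids, min(budget, len(ids)))
    if len(chosen) < budget:
        log.warning("wanted %d candidates, got %d", budget, len(chosen))
    return sorted(chosen)

def contiguous_candidates(ids, budget, rng):
    """Strategy 2: one contiguous run of ids starting at a random offset."""
    ids = sorted(ids)
    if len(ids) <= budget:
        return ids
    start = rng.randrange(len(ids) - budget + 1)
    return ids[start:start + budget]

def page_candidates(ids, budget, vectors_per_page, rng):
    """Strategy 3: pick random vectors and take every vector on their page."""
    ids = sorted(ids)
    positions = list(range(len(ids)))
    rng.shuffle(positions)
    seen_pages, chosen = set(), []
    for pos in positions:
        page = pos // vectors_per_page
        if page in seen_pages:
            continue  # this page is already loaded, its vectors are free
        first = page * vectors_per_page
        items = ids[first:first + vectors_per_page]
        if len(chosen) + len(items) > budget:
            break  # the next page would exceed the memory budget
        seen_pages.add(page)
        chosen.extend(items)
    return sorted(chosen)
```

For the same budget, the page-based strategy touches far fewer pages than the purely random one, while still spreading candidates across the id space more than a single contiguous run does.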