Enhancement request: be able to do hubness analysis with different metrics #68

Open
ivan-marroquin opened this issue Jun 8, 2021 · 5 comments

@ivan-marroquin

Hi,

From issue , I learned that the package should be able to conduct hubness analysis with several metrics (including fractional norms).

So, I tried to use a fractional norm with the following code:

from skhubness.data import load_dexter
from skhubness import Hubness

X, y = load_dexter()  # load the example data set
hub = Hubness(k=10, return_value='all', metric='minkowski', algorithm='hnsw', algorithm_params={'p': 0.1}, hubness='local_scaling', random_state=1969, n_jobs=-1)
hub.fit(X)

which gave the error below:

Traceback (most recent call last):
File "", line 1, in
File "C:\Users\IMarroquin\Downloads\Important_Python_Libraries_VisualBuildTools\scikit-hubness-master\skhubness\analysis\estimation.py", line 283, in fit
raise ValueError(f"Unknown metric '{metric}'. "
ValueError: Unknown metric 'minkowski'. Must be one of ['euclidean', 'cosine', 'precomputed'].

According to documentation of nmslib, this package is able to support several metrics (including fractional norms).

I think it will be beneficial to run hubness analysis with the choice of metric.

Thanks,

Ivan

@ivan-marroquin
Author

Here is the link to the issue I mentioned above #67

@VarIr
Owner

VarIr commented Jun 8, 2021

IIRC, nmslib's HNSW does not support any metric besides Eucl and cos, but please feel free to point me to documentation that states otherwise.

However, this code seems to fail on a check in skhubness that might not be necessary at this point. It would also fail for algorithm="brute" which it shouldn't... Would need to look into this in detail.

For a work-around, you could calculate fractional distances ahead of time, and use metric="precomputed".
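A minimal sketch of this workaround, assuming a dense feature matrix `X` and a Minkowski-style distance with exponent `p` between 0 and 1. The helper names `fractional_minkowski_distances` and `hubness_with_precomputed` are hypothetical and not part of skhubness. The second helper assumes that `Hubness` accepts a square distance matrix when `metric="precomputed"`; check against your installed version how the zero self-distances on the diagonal are treated.

```python
import numpy as np


def fractional_minkowski_distances(X, p=0.5):
    """Return the full pairwise matrix D[i, j] = (sum_k |x_ik - x_jk|**p) ** (1/p).

    Hypothetical helper: works row by row so that memory stays at
    O(n_samples * n_features) per step instead of O(n_samples**2 * n_features).
    """
    X = np.asarray(X, dtype=float)
    if p <= 0:
        raise ValueError("p must be positive")
    n_samples = X.shape[0]
    D = np.empty((n_samples, n_samples))
    for i in range(n_samples):
        # Distances from sample i to every sample (including itself, which is 0)
        D[i] = (np.abs(X - X[i]) ** p).sum(axis=1) ** (1.0 / p)
    return D


def hubness_with_precomputed(D, k=10):
    """Run hubness analysis on a precomputed distance matrix.

    Assumption: Hubness(metric="precomputed") takes the distance matrix in fit().
    skhubness is imported lazily so the distance helper works without it.
    """
    from skhubness import Hubness

    hub = Hubness(k=k, metric="precomputed")
    hub.fit(D)
    return hub.score()
```

With the data from the original report, this would be used roughly as `D = fractional_minkowski_distances(X, p=0.1)` followed by `hubness_with_precomputed(D, k=10)`. The distance matrix is computed exactly, so this only scales to data sets where an n-by-n matrix fits in memory.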

@ivan-marroquin
Author

Hi @VarIr ,

Thanks for the prompt answer. Regarding the nmslib documentation on distances, see: https://github.com/nmslib/nmslib/blob/master/manual/spaces.md

I will try the proposed workaround.

Ivan

@VarIr
Owner

VarIr commented Jun 9, 2021

Indeed, while optimized indices are only available for Eucl and cos, many more spaces are supported in general.

For personal reference, the detailed list of supported spaces is available in the manual, Table 1, p. 5.

@ivan-marroquin
Author

Thanks for sharing the document
