Support for multiple tokens? #4

Open
mollerhoj opened this issue Aug 16, 2018 · 3 comments
Comments

@mollerhoj

mollerhoj commented Aug 16, 2018

As far as I can tell, this does not support lookup of sentences?

from sympound import sympound
from pyxdameraulevenshtein import damerau_levenshtein_distance
distancefun = damerau_levenshtein_distance

ssc = sympound(distancefun=distancefun, maxDictionaryEditDistance=3)

def test():
    print(ssc.load_dictionary("frequency_dictionary_en_82_765.txt", term_index=0, count_index=1))
    print(ssc.lookup_compound(input_string="whereis th elove hehad dated forImuch of thepast who couqdn'tread in sixthgrade and ins pired him", edit_distance_max=3))
test()
# returns:
# True
# wherewith:202893:88

(This is using the example data and sentence from the official SymSpell.cs repo)

@eroux
Contributor

eroux commented Aug 16, 2018

The code reproduces the C# code, at least the core system. So either there is a bug somewhere, or this functionality lives in another C# file that hasn't been ported... Taking a closer look, I think the Python code is missing this function. Do you have some experience with Python, and could you make a pull request? This shouldn't be too difficult, I think (the variables are named similarly in the Python and C# code).
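For anyone picking this up, here is a rough, self-contained sketch of the simplest form of multi-token lookup: split the sentence on whitespace and correct each token on its own. It does not use sympound at all, and it is not the SymSpell compound algorithm (which, as far as I understand it, also repairs missing and wrongly inserted spaces such as "whereis" or "th elove"). The names `edit_distance`, `correct_token`, `correct_sentence` and `TOY_DICTIONARY` are made up for this illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def correct_token(token, dictionary, max_distance):
    """Return the closest dictionary term; ties go to the higher count.

    Falls back to the unchanged token when nothing is within max_distance.
    """
    best = None
    for term, count in dictionary.items():
        d = edit_distance(token, term)
        if d > max_distance:
            continue
        key = (d, -count)  # smaller distance first, then larger count
        if best is None or key < best[0]:
            best = (key, term)
    return best[1] if best else token


def correct_sentence(sentence, dictionary, max_distance=2):
    """Correct each whitespace-separated token independently."""
    return " ".join(
        correct_token(t, dictionary, max_distance) for t in sentence.split()
    )


# Hypothetical toy data (term -> frequency), not the real frequency dictionary.
TOY_DICTIONARY = {"where": 500, "is": 900, "the": 1000, "love": 300}

print(correct_sentence("wher is th love", TOY_DICTIONARY))
```

This brute-force scan over the whole dictionary is only meant to show the per-token idea; a real port would reuse the existing single-word lookup and add the split/merge handling from the C# code.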

@MukhtarShaima

Does a dictionary without word-frequency counts work in symspell?
My dictionary contains only unique values, so I don't have frequency counts.
When I try to use this dictionary, lookup_compound does not give me any suggestions; it just returns the same value as 'string:0:0'.
Here is the code:

from sympound import sympound
from jellyfish import levenshtein_distance

distancefun = levenshtein_distance
ssc = sympound(distancefun=distancefun, maxDictionaryEditDistance=3)

def test():
    print(ssc.load_dictionary("symspelldict.txt", term_index=0, count_index=1))
    print(ssc.lookup_compound(input_string="سعلوچا", edit_distance_max=3))

test()
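Since the call above passes count_index=1, the loader presumably expects a second column on every line. One thing that might be worth trying is padding a term-only word list with a placeholder count before loading it. The sketch below is only an assumption about a possible workaround: it is not confirmed that the missing count column causes the 'string:0:0' result, the space separator is assumed, and `with_default_counts` is a hypothetical helper, not part of sympound.

```python
def with_default_counts(lines, default_count=1):
    """Return "term count" lines, adding default_count where a line has only a term.

    Blank lines are dropped; lines that already have a second field are kept.
    """
    out = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        if len(fields) == 1:
            fields.append(str(default_count))  # assumed placeholder frequency
        out.append(" ".join(fields))
    return out


# Hypothetical sample lines standing in for a term-only dictionary file.
print(with_default_counts(["alpha", "hello 42", ""]))
```

The padded lines could then be written to a new file and passed to load_dictionary with the same term_index and count_index as above.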

@eroux
Contributor

eroux commented Feb 24, 2019

@MukhtarShaima can you please open a separate issue for that? It doesn't look related to this one...
