Not able to search for just numbers in lunr.de #66
Comments
My Russian docs have the same problem (I use mkdocs, if that matters).
This is related to the trimmer. If I remove the trimmer completely, it works. The word characters defined on line 74 are really strange:
lunr.de.wordCharacters = "A-Za-z\xAA\xBA\xC0-\xD6\xD8-\xF6\xF8-\u02B8\u02E0-\u02E4\u1D00-\u1D25\u1D2C-\u1D5C\u1D62-\u1D65\u1D6B-\u1D77\u1D79-\u1DBE\u1E00-\u1EFF\u2071\u207F\u2090-\u209C\u212A\u212B\u2132\u214E\u2160-\u2188\u2C60-\u2C7F\uA722-\uA787\uA78B-\uA7AD\uA7B0-\uA7B7\uA7F7-\uA7FF\uAB30-\uAB5A\uAB5C-\uAB64\uFB00-\uFB06\uFF21-\uFF3A\uFF41-\uFF5A";
Translates to: ʸˠ
Potential solution: lunr.de.wordCharacters = "A-Za-züÜÄäÖöß0-9";
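To illustrate why a letters-only character class makes numeric tokens disappear, here is a minimal sketch of a trimmer built from a word-character string. It is a simplified stand-in, not the actual lunr-languages code; `makeTrimmer` is a hypothetical name, and the assumption is that the real trimmer strips leading and trailing characters outside the class in roughly this way.

```javascript
// Simplified stand-in for a character-class trimmer.
// makeTrimmer is a hypothetical helper, not part of lunr or lunr-languages.
function makeTrimmer(wordCharacters) {
  // Strip any run of non-word characters from the start and end of a token.
  const leading = new RegExp("^[^" + wordCharacters + "]+");
  const trailing = new RegExp("[^" + wordCharacters + "]+$");
  return (token) => token.replace(leading, "").replace(trailing, "");
}

// Letters only: a purely numeric token is trimmed away entirely.
const lettersOnly = makeTrimmer("A-Za-züÜÄäÖöß");
// Letters plus digits, as in the potential solution above.
const withDigits = makeTrimmer("A-Za-züÜÄäÖöß0-9");

console.log(JSON.stringify(lettersOnly("1234")));  // ""
console.log(JSON.stringify(withDigits("1234")));   // "1234"
console.log(JSON.stringify(withDigits("Größe,"))); // "Größe"
```

If the trimmer behaves like this, a token such as 1234 becomes an empty string before it reaches the index, which would explain why the search finds nothing.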
I noticed the German support was also breaking
I was facing the same issue: no results for numeric searches. Then I found that adding '\0-9' at the end of line 74 enables numeric searching.
I think it could be a config option in the future.
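One caveat on the '\0-9' suggestion, shown as a small sketch: inside a JavaScript string literal, `\0` followed by a non-digit is the NUL character, so the class ends up with the range U+0000 to `9` rather than just the digits. That range does include 0-9, but it also covers the space and punctuation such as `.` and `,`. The base class `A-Za-z` below is a shortened placeholder for the real line 74, and the regex is an assumed simplification of how the trimmer uses it.

```javascript
// Shortened placeholder for the real word-character string on line 74.
const base = "A-Za-z";

// "\0-9" in a string literal is NUL, "-", "9": a range from U+0000 to "9".
const nulRange = new RegExp("[^" + base + "\0-9]+$");
// "0-9" is just the ten ASCII digits.
const digitRange = new RegExp("[^" + base + "0-9]+$");

// With the NUL range, "." counts as a word character and is not trimmed.
console.log("1234.".replace(nulRange, ""));   // 1234.
// With the plain digit range, the trailing "." is trimmed.
console.log("1234.".replace(digitRange, "")); // 1234
```

So a plain `0-9` is probably the safer addition if punctuation should still be trimmed from the ends of tokens.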
As of ES6, regular expressions in JavaScript support the unicode flag, so I'm pretty sure this can be used to simplify the trimmer function for all languages when creating the search index. Some of the language implementations seem to use the trimmer during search too, so it may not work for that. @MihaiValentin I can put this into a PR if you like, but being ES6, it is obviously not as backwards compatible as what is currently there.
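A rough sketch of what such a Unicode-aware trimmer might look like. Note the assumption: this uses the `\p{L}` and `\p{N}` property escapes, which need the `u` flag but were only added in ES2018, so it is even less backwards compatible than plain ES6. `unicodeTrim` is a hypothetical name, not an existing lunr function.

```javascript
// Hypothetical language-independent trimmer: keep letters (\p{L}) and
// numbers (\p{N}), strip everything else from both ends of the token.
// Requires the "u" flag and ES2018 Unicode property escapes.
const leadingNonWord = /^[^\p{L}\p{N}]+/u;
const trailingNonWord = /[^\p{L}\p{N}]+$/u;

function unicodeTrim(token) {
  return token.replace(leadingNonWord, "").replace(trailingNonWord, "");
}

console.log(unicodeTrim("1234"));    // 1234
console.log(unicodeTrim("«Größe»")); // Größe
console.log(unicodeTrim("порт,"));   // порт
```

One class would then cover German, Russian, French and digits alike, at the cost of dropping older JavaScript engines.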
Yes, a lot of the trimmers have this problem (the French one too). You can sometimes get away with just replacing the language-specific one with the default one but as noted above
In #115 this is fixed in a more systematic way than mentioned above, by using the Unicode definitions, so try that out if you like.
Problem
In my German and English test documents I have content with the term Port 1234, but searching for 1234 does not work.
Has someone seen the same or a similar problem? Any ideas?
More tests
this.use(lunr.de) makes it possible to find German umlauts, but numbers can no longer be found.
Test Code
Result