lunr.de demo with unexpected result for umlauts #41
Comments
I also noticed this. In #66 the approach of replacing:
Fixes wildcard support.
workaround with
did not work.
I also wound up changing approaches. I can dig up my code, but I believe what I did was:
Here it is: I basically create a mirror search index without international characters so the user gets success if they use

    // receive a set of text and replace diacritics
    // it's a poor man's multi-lingual
    function normalizeText(searchIndex) {
      function replaceCharacters(string) {
        string = string || "";
        // handle some common international string as fuzzy english
        string = string.replace(/\u00c4/g, "A"); // Ä
        string = string.replace(/\u00dc/g, "U"); // Ü
        string = string.replace(/\u00d6/g, "O"); // Ö
        string = string.replace(/\u00fc/g, "u"); // ü
        string = string.replace(/\u00e4/g, "a"); // ä
        string = string.replace(/\u00f6/g, "o"); // ö
        string = string.replace(/\u00df/g, "s"); // ß
        string = string.replace(/ae/g, "a");
        string = string.replace(/ue/g, "u");
        string = string.replace(/oe/g, "o");
        string = string.replace(/ss/g, "s");
        string = string.replace(/á/g, "a");
        return string;
      }
      for (const item in searchIndex) {
        if (Object.hasOwnProperty.call(searchIndex, item)) {
          searchIndex[item].multilingualAlternate = replaceCharacters(searchIndex[item].lastName);
          searchIndex[item].multilingualAlternate += " " + replaceCharacters(searchIndex[item].firstName);
        }
      }
      return searchIndex;
    }

I'm sure it's terrible for performance, but for our use case the dataset was small enough that it didn't matter.
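For anyone adapting the snippet above, here is a minimal sketch of a more general way to fold diacritics, using the standard `String.prototype.normalize` instead of one `replace` per character. The name `foldDiacritics` is hypothetical and not part of lunr or lunr-languages. Unlike the code above, it maps "ß" to "ss" rather than "s" and does not collapse "ae"/"ue"/"oe", so treat it as an alternative under those assumptions rather than a drop-in replacement.

```javascript
// Hypothetical helper, not part of lunr or lunr-languages.
// Folds accented Latin characters to their base letters.
function foldDiacritics(text) {
  return String(text || "")
    .replace(/\u00df/g, "ss")          // ß has no canonical decomposition, so handle it explicitly
    .normalize("NFD")                  // split e.g. "ü" into "u" + combining diaeresis (U+0308)
    .replace(/[\u0300-\u036f]/g, "");  // drop the combining marks
}

console.log(foldDiacritics("günstige")); // "gunstige"
console.log(foldDiacritics("Straße"));   // "Strasse"
```

The same function would have to be applied to the user's query before searching the mirrored field, so that both sides are folded identically.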
@khawkins98
I tried your demo site "demo-browser-require.html", but I don't understand the results.
tests:

    console.log('Search for günstige: ', idx.search('günstige')); // expected result size: 1, result: 1
    console.log('Search for günstig*: ', idx.search('günstig*')); // expected result size: 1, result: 0
    console.log('Search for g*nstig*: ', idx.search('g*nstig*')); // expected result size: 1, result: 1

source: https://rawgit.com/MihaiValentin/lunr-languages/master/demos/demo-browser-require.html
Did I misunderstand how to search for words with umlauts, or is it not possible to use wildcards when searching for words with umlauts?