Upload form (web) is not able to handle non-Latin characters/accents #166

Open
MeilinaR opened this issue Mar 22, 2021 · 2 comments

@MeilinaR

basic_latin

When uploading a dataset whose author names or description contain non-Latin characters (in this case, Turkish characters in the divorce_predictors dataset found here), the following error occurs. As a temporary workaround, replace the offending characters, which usually appear in the description or the author names.
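For anyone hitting this in the meantime, the replacement workaround could be scripted roughly as below. This is only a sketch: `to_basic_latin` and the `_EXTRA` mapping are hypothetical names I made up, not part of OpenML, and the mapping table is illustrative rather than exhaustive. It is lossy, so keep the original text somewhere.

```python
import unicodedata

# Letters without a Unicode decomposition need an explicit mapping.
# Illustrative only; extend as needed for your data.
_EXTRA = str.maketrans({
    "ı": "i", "ø": "o", "Ø": "O", "ß": "ss",
    "đ": "d", "Đ": "D", "ł": "l", "Ł": "L",
})

def to_basic_latin(text: str) -> str:
    """Hypothetical helper: approximate `text` using ASCII characters only."""
    text = text.translate(_EXTRA)
    # NFKD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping everything non-ASCII removes the combining marks
    # (and any character that could not be mapped).
    return decomposed.encode("ascii", "ignore").decode("ascii")
```

For example, `to_basic_latin("Çağrı Şimşek")` returns `"Cagri Simsek"`. Characters with no mapping and no decomposition are silently dropped, so check the result before uploading.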

@prabhant
Contributor

Seems like some Turkish and Estonian characters are not supported in basic_latin128.
@joaquinvanschoren @janvanrijn
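To see which characters trip the check, something like the following might help. It assumes that "basic latin" here means the Unicode Basic Latin block (U+0000 to U+007F); I have not confirmed that against the XSD. `non_basic_latin` is a hypothetical helper name.

```python
from __future__ import annotations

import unicodedata

def non_basic_latin(text: str) -> list[tuple[int, str, str]]:
    """Return (index, character, Unicode name) for each character above U+007F.

    Assumption: the field only accepts the Unicode Basic Latin block.
    """
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 0x7F
    ]
```

Running it over the author and description fields before upload lists exactly the positions that need replacing.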

@joaquinvanschoren
Contributor

I discussed this with Jan. We think that the description field could be relaxed to allow UTF-8.
Could you look at that? It would require updating the data_upload XSD and thoroughly testing whether this works well.
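As a starting point for the testing part, a round-trip check along these lines could be one of the cases. It is a minimal sketch using only the Python standard library; the element names are placeholders and not the real upload schema, and it covers XML serialization only, not XSD validation or storage on the server.

```python
import xml.etree.ElementTree as ET

def roundtrip_description(description: str) -> str:
    """Serialize a description to UTF-8 XML and parse it back."""
    # Hypothetical element names, not taken from the data_upload XSD.
    root = ET.Element("data_set_description")
    ET.SubElement(root, "description").text = description
    # With encoding="utf-8", tostring returns bytes including an XML declaration.
    payload = ET.tostring(root, encoding="utf-8")
    parsed = ET.fromstring(payload)
    return parsed.findtext("description")
```

A test would then assert that `roundtrip_description(text) == text` for descriptions containing Turkish and Estonian characters.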

@PGijsbers PGijsbers added this to the v2.1+ milestone Mar 25, 2022