Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

impr(britishenglish): Alphabetize britishenglish.json file (@AnnaBuchholz) #6236

Open
wants to merge 29 commits into
base: master
Choose a base branch
from

Conversation

AnnaBuchholz
Copy link
Contributor

Description

I alphabetized the britishenglish.json file. It's obviously not perfect, so I figured alphabetizing it would at least organize it a bit better.

Checks

  • Check if any open issues are related to this PR; if so, be sure to tag them below.
  • Make sure the PR title follows the Conventional Commits standard. (https://www.conventionalcommits.org for more info)
  • Make sure to include your GitHub username prefixed with @ inside parentheses at the end of the PR title.

@monkeytypegeorge monkeytypegeorge added frontend User interface or web stuff assets Languages, themes, layouts, etc. labels Feb 5, 2025
@AnnaBuchholz
Copy link
Contributor Author

I still don't know why there's a long list of previous commits from other files and branches. If anyone knows, I'd love to know why.

I found a few duplicates in the file but haven't deleted them yet. Also, is there any better way to do certain sets of words more efficiently? For example, flavor/flavour also has s/ing/ings/ed endings and so do a bunch of other similar words. Of course, when people manually add words like flavor/flavour they don't always think to add all of the other possible endings even though those derivatives are also likely in the wordlists.

@AnnaBuchholz
Copy link
Contributor Author

AnnaBuchholz commented Feb 5, 2025

Some duplication examples: civilize, emphasize, synchronize, utilization
Other weirdness: customization (same in American, but one is plural in britishenglish), tranquilize has one "l" but the other forms (tranquilliazation and tranquillized) have two "l"s.

Lastly, what is happening with the ["tire", "tyre", ["will"]], line??

@Miodec
Copy link
Member

Miodec commented Feb 6, 2025

Im not sure whats the advantange of having the list in alphabetical order? It adds more friction to adding new pairs because you have to look for the alphabetical spot intead of just adding to the bottom.

Lastly, what is happening with the ["tire", "tyre", ["will"]], line??

The third element is an array of special words that will ignore the replacement. For example tire will not be replaced to tyre if there is any of the exception words before it (this fixed an issue where will tire was replaced with will tyre, which made no gramatical sense).

@Miodec
Copy link
Member

Miodec commented Feb 6, 2025

As for the word variation, we could just do a simple "also check xs, xing, xize, xed" but it miiiiiiight create some non words and cause weird replacements.

@AnnaBuchholz
Copy link
Contributor Author

Honestly, I alphabetized it mostly because at this point it's small enough that it was still feasible. I'd be fine with whoever in the future just appending words randomly to the bottom again.

If it is alphabetized, then whenever I look for "hey does x word already get replaced in the britishenglish file or not" I can pretty easily see if other variations like xing xed xize, etc. are there or not (instead of those variations being all over the place).

Also, I'm thinking I could probably go back through the file and delete a few of the duplicates I found and such. I'd also be willing to go back through and manually add some more of the missing xing xed xize, etc. though that would likely make the file a decent bit longer than it is.

@Miodec
Copy link
Member

Miodec commented Feb 7, 2025

We could also convert this file to a TS file and export a Set instead of an array. It will help with performance and protect against duplicate entries.

@AnnaBuchholz
Copy link
Contributor Author

What all would we need to do for that? Obviously we'd need to change this particular file, but I don't know how much we'd have to change whatever other files that reference this one.

Are the spellings figured out when the test is initially loaded or throughout the test right before the words are on the screen? Would switching to TS make it noticeably different time-wise?

@Miodec
Copy link
Member

Miodec commented Feb 8, 2025

What all would we need to do for that? Obviously we'd need to change this particular file, but I don't know how much we'd have to change whatever other files that reference this one.

Move the list from the json file to the british-english.ts file, assign it to the list const and remove the getList function.

Are the spellings figured out when the test is initially loaded or throughout the test right before the words are on the screen? Would switching to TS make it noticeably different time-wise?

Both when the test initially loads and as more words are generated (if needed). At this list length the performace would probably not be noticable, but still O(1) is better than O(n).

@AnnaBuchholz
Copy link
Contributor Author

Just to double check, would I change the ts/test/british-english.ts or ts/commandline/lists/british-english.ts? I assume I'd change the first one and that you mean line 11 is where I should add the list of BritishEnglish words? Would/Should I just fully delete the britishenglish.json file as well?

@Miodec
Copy link
Member

Miodec commented Feb 10, 2025

First one. Yeah once you move the list you can remove the json file.

@AnnaBuchholz
Copy link
Contributor Author

Alright, sounds good. I'll see if I can get that working then.

@AnnaBuchholz
Copy link
Contributor Author

I just commented some stuff about json out of the ts file and added two of the conversions where I think the list should be. I'm also not sure how to do the syntax for ts or why it's giving me an error. (Should I use single quotes instead?)

Essentially, I just wanted to check if it looks right-ish so far before I add the rest of the list?

@AnnaBuchholz
Copy link
Contributor Author

Also I'm having a bit of trouble trying to run pretty in my environment (not sure that I even have it installed and I keep getting random errors) so I don't know why that check is failing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
assets Languages, themes, layouts, etc. frontend User interface or web stuff
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants