-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bump languagetool to 6.4 #492
Conversation
b5c6434
to
b784560
Compare
b784560
to
eb960f4
Compare
…uble quotes around result word) message aborts the commit. # # On branch rm/bump-languagetool # Your branch is up to date with 'origin/rm/bump-languagetool'. # # Changes to be committed: # modified: apps/checker/test/scala/matchers/LanguageToolMatcherTest.scala # # Changes not staged for commit: # modified: project/plugins.sbt # modified: script/citest # modified: script/start-manager # # Untracked files: # cdk/package-lock.json # output.dict #
…third Language parameter (now required)
169f8ed
to
55636d2
Compare
It looks like this is because the 6.6 versions are in a new dedicated repository, based on the mvnrepository page for 6.6.9. Maven Central agrees with sbt that the newest version there is 6.5. Do we want to use languagetool’s apparent dedicated repository? Is that a thing we generally do? (I suppose it means we don’t get the same third-party checking we get from Maven Central, but I don’t really know the details of what we get from them.) I tried looking at their developer docs, but didn’t find any explanation there about the new repository. |
@@ -218,7 +218,7 @@ class LanguageToolMatcherTest extends AsyncFlatSpec with Matchers { | |||
val eventuallyMatches = instance.check(request) | |||
|
|||
val expectedMatchMessages = | |||
List("Did you mean <suggestion>fewer</suggestion>? The noun tests is countable.") | |||
List("Did you mean <suggestion>fewer</suggestion>? The noun \"tests\" is countable.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m surprised this match text requires quotes but the one below (The noun mistakes…
) doesn’t. What’s the difference?
A quick note to say that we rely on some languagetool rules which may have changed behaviour between releases. You can see the full list by filtering rules by 'LT Built-in' in the rule manager — There's a script here to check the difference between old and new rules at I don't think this is necessarily blocking, as our users will let us know if something's obviously not right, but it might be nice to get ahead of any changes before they land in PROD. |
Thanks, will test this |
I have deployed this to CODE and tested it in this article. I've checked a number of built-in, regex, and dictionary rules, which appear to work correctly (though it would be good to take a look with someone who's more familiar with typerighter). I have had a hard time triggering some specific dictionary rules, e.g. 1,2-dihydroxypropane. But several appear to have been triggered correctly, so maybe this is fine. |
I ran the script and got a slightly inconclusive answer. I downloaded the english grammar file the script points to from v6.0 and from v6.4, wrote a rules.txt based on the LT built-in rules listed in typerighter-CODE, and then ran the script: yarn
node compare-rule-xml.js grammar-6.0.xml grammar-6.4.xml rules.txt Which gave this output:
A few of the rules (UPPERCASE_SENTENCE_START, WHITESPACE_RULE, SENTENCE_WHITESPACE, and DOUBLE_PUNCTUATION) were not found in either file, which suggests they might be somewhere else? I haven't been able to locate a couple of them in xml files by searching the languagetool repository (they do appear in java files, which I assume this script can't check?). |
I think I have been able to test each of those rules in my test article and have seen them correctly flagged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, but it would be good to discuss on a call before merging, to make sure I have tested it well enough.
That's spot on, @emdash-ie, some rules are defined as Java classes in the library. Thanks for checking the others — that gives us some confidence they're working correctly. If the tool is parsing existing user-defined rules and the builtins are working as expected, or unchanged, I think this is fine to merge, but very happy to join a call if you'd like to go over this in more detail. |
Seen on Rule Manager (merged by @rhystmills 9 minutes and 43 seconds ago) Please check your changes! |
Overdue on Checker (merged by @rhystmills 15 minutes and 3 seconds ago) What's gone wrong? |
What does this change?
This bumps
languagetool
to version 6.4. There are a few high priority vulnerabilities in our sbt dependency, and some of them are sub-dependencies oflanguagetool
.The newest
languagetool
version according to maven is 6.6, but sbt wasn't able to find it. I've only bumped to Version6.4
because6.5
introduced a fiddly-looking error (though it compiles successfully) and I'm hoping that6.4
will be sufficient to solve some of our dependency woes.languagetool
does not appear to be semantically versioned as there were breaking changes in6.1
and6.3
as well as6.5
. I've had to make a couple of small changes to address the changes in the6.1
and6.3
, namely:getRules
It's definitely worth testing this PR in CODE Composer and seeing if checks from our rule manage show up successfully!
I'd be especially keen to check some dictionary rules, because they rely on some slightly arcane workarounds, and undocumented changes in the structure of
languagetool
could affect them.How to test
I can advise on any aspects of testing Typerighter if you're not familiar with the tool, please reach out if you require some help