-
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Language Info endpoint #283
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #283 +/- ##
==========================================
- Coverage 70.95% 70.55% -0.40%
==========================================
Files 138 141 +3
Lines 6135 6314 +179
Branches 981 1011 +30
==========================================
+ Hits 4353 4455 +102
- Misses 1335 1395 +60
- Partials 447 464 +17 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 12 files reviewed, 3 unresolved discussions (waiting on @johnml1135)
samples/EchoTranslationEngine/TranslationEngineServiceV1.cs
line 303 at r1 (raw file):
new GetLanguageInfoResponse { ResolvedLanguageCode = "hello",
Given that the Echo engine in general just spits back whatever you give in, would it be worth doing something more akin to that here. I get that it's a bit harder to echo this information since it needs to undergo transformation. If we do keep it this way, we should at least make sure this is documented since this is meant to be used for testing and someone might expect that the response language code, for example, is just the input (?).
src/Serval.Translation/Controllers/TranslationEnginesController.cs
line 1051 at r1 (raw file):
} private QueueDto Map(Queue source) => new() { Size = source.Size, EngineType = source.EngineType };
Is the goal to ultimately reimplement something very much akin to this but that allows you to specify which queue? Or am I misunderstanding? Is it just that you want to be able to specify the queue on creation? I thought one of the rationales was that this could be used to probe queue loads before adding jobs (which just specifying a queue in buildOptions
wouldn't allow you to do. If the plan is to rebuild this slightly differently, then I wonder if it's worth just doing that now in the same PR since the code will be relatively similar and not take much effort. If not, we should probably remove the corresponding code on the Machine side (?).
tests/Serval.ApiServer.IntegrationTests/TranslationEngineTests.cs
line 1424 at r1 (raw file):
} public TranslationClient CreateTranslationClient(IEnumerable<string>? scope = null)
Why make a new client creation method instead of extending the previous one?
Previously, Enkidu93 (Eli C. Lowry) wrote…
Oh, I see - nevermind! |
Previously, Enkidu93 (Eli C. Lowry) wrote…
I do still wonder if it's worth adding in that queue-specification in this PR if that is the ultimate intent. |
Same as with queueDepth above. See this message from Damien on this topic during the initial queue depth PR: #177 (review). This is why it inadvertently ended up being a POST request. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Agreed - that makes more sense. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Can you copy the comment from that review here? I don't know which one you are referring to. |
This also resolves #281. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
I added the reference to #281 above. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Am I able to? The queue and the language info are now in a separate file TranslationController.cs because we needed a different endpoint. I can merge the two files into one file called TranslationController.cs and redo all the "engines" endpoints - what do you think is best? |
Previously, johnml1135 (John Lambert) wrote…
URLs are often treated as case insensitive. In this case, just the engine type part of the URL will be case sensitive. Unfortunately, we never expected to use the engine types in a URL, so they can't easily be converted to Pascal case. We have a couple of options:
|
Previously, Enkidu93 (Eli C. Lowry) wrote…
@ddaspit - what do you think? I am inclined to do the case insensitive route... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 12 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
src/Serval.Grpc/Protos/serval/translation/v1/engine.proto
line 85 at r2 (raw file):
message GetLanguageInfoRequest { string engine_type = 1; string language_code = 2;
This should be named language
to be consistent with the rest of the API.
src/Serval.Translation/Contracts/TranslationInfoDto.cs
line 5 at r2 (raw file):
public class LanguageInfoDto { public string ResolvedLanguageCode { get; set; } = default!;
I think I would prefer a different name. I don't think "resolved" is very clear in this context. Maybe something like InternalCode
.
src/Serval.Translation/Contracts/TranslationInfoDto.cs
line 6 at r2 (raw file):
{ public string ResolvedLanguageCode { get; set; } = default!; public bool NativeLanguageSupport { get; set; } = default!;
To be consistent with the rest of the API, you should use IsSupported
.
src/Serval.Translation/Contracts/TranslationInfoDto.cs
line 7 at r2 (raw file):
public string ResolvedLanguageCode { get; set; } = default!; public bool NativeLanguageSupport { get; set; } = default!; public string CommonLanguageName { get; set; } = default!;
I'm not sure what "common" means in this context. I think simply Name
should be sufficient.
src/Serval.Translation/Controllers/TranslationController.cs
line 63 at r1 (raw file):
Previously, johnml1135 (John Lambert) wrote…
@ddaspit - what do you think? I am inclined to do the case insensitive route...
I think we should go with option 1 with one modification. We still support the engine type in pascal case for existing endpoints for backwards compatibility.
src/Serval.Translation/Controllers/TranslationController.cs
line 6 at r2 (raw file):
[Route("api/v{version:apiVersion}/translation")] [OpenApiTag("Translation Engines")] public class TranslationController(IAuthorizationService authService, IEngineService engineService)
This should be a translation engine types controller. The route would be api/v{version:apiVersion}/translation/engine-types
.
src/Serval.Translation/Controllers/TranslationController.cs
line 57 at r2 (raw file):
/// <response code="503">A necessary service is currently unavailable. Check `/health` for more details. </response> [Authorize(Scopes.ReadTranslationEngines)] [HttpGet("{engineType}/language-info/{languageCode}")]
I would rename the route to languages
.
src/Serval.Translation/Controllers/TranslationController.cs
line 64 at r2 (raw file):
public async Task<ActionResult<LanguageInfoDto>> GetLanguageInfoAsync( [NotNull] string engineType, [NotNull] string languageCode,
This should be simply language
.
Previously, ddaspit (Damien Daspit) wrote…
Replaced in all locations in serval and machine. |
Previously, ddaspit (Damien Daspit) wrote…
ISOLanguage? ISO639_3Language? InternationalLanguage? |
Previously, ddaspit (Damien Daspit) wrote…
All languages are supported - how about |
Previously, ddaspit (Damien Daspit) wrote…
English, etc. They are from the table here: https://github.com/facebookresearch/flores/blob/main/flores200/README.md |
Previously, ddaspit (Damien Daspit) wrote…
Changed and file renamed. |
Previously, ddaspit (Damien Daspit) wrote…
All "codes" renamed. |
Previously, ddaspit (Damien Daspit) wrote…
done |
Previously, ddaspit (Damien Daspit) wrote…
Ok: 1 with Pascal used internally and backwards compatibility maintained. |
Previously, johnml1135 (John Lambert) wrote…
I see. No, I think your refactor makes sense. |
Previously, johnml1135 (John Lambert) wrote…
Where is |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Yes, it says if the language is in the NLLB200, and when MADLAD is supported, it will tell if it is in that model natively. This can be used for determining if a language is supported for source text or if you can do direct inferencing on it. |
Previously, Enkidu93 (Eli C. Lowry) wrote…
I will keep the refactor then and rename the file to TranslationEngineTypesController.cs (as per Damien's comment). |
Previously, johnml1135 (John Lambert) wrote…
ISOLanguageCode? |
Previously, johnml1135 (John Lambert) wrote…
OK, yes, makes sense. I don't see an issue with something like |
Previously, Enkidu93 (Eli C. Lowry) wrote…
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything that caught my attention has been resolved
Reviewed 9 of 9 files at r3, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @ddaspit)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how I did that. You don't need to re-review anything.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
Previously, ddaspit (Damien Daspit) wrote…
Hmm - then the entire endpoint would be meaningless? Would Smt engines have native support? |
Previously, ddaspit (Damien Daspit) wrote…
IsNative. I like it. |
Previously, ddaspit (Damien Daspit) wrote…
Can I update to dotnet standard 2.1? I want nullable types.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 8 of 8 files at r4, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
src/Serval.Translation/Contracts/TranslationInfoDto.cs
line 5 at r2 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Hmm - then the entire endpoint would be meaningless? Would Smt engines have native support?
The main use case for the endpoint is the IsNative
property. The InternalCode
is only relevant for pretrained multilingual NMT models. For SMT, I guess everything is native.
src/Serval.Translation/Contracts/TranslationInfoDto.cs
line 7 at r2 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Can I update to dotnet standard 2.1? I want nullable types..
What do you want to update to .NET Standard 2.1? This assembly is a .NET 8 DLL, so it already supports nullable types.
Previously, ddaspit (Damien Daspit) wrote…
I think I need to update the language support to "latest" and add "nullable" - https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/configure-language-version. I don't need 2.1 - I was wrong. |
Previously, johnml1135 (John Lambert) wrote…
If we update to C# latest, we get 400+ warnings about nullable types. What should we do? Let's have a short meeting tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
src/Serval.Translation/Contracts/TranslationInfoDto.cs
line 7 at r2 (raw file):
Previously, johnml1135 (John Lambert) wrote…
If we update to C# latest, we get 400+ warnings about nullable types. What should we do? Let's have a short meeting tomorrow.
I'm confused. This is already a .NET 8 project, so the default C# language version is 12, which supports nullable types.
src/Serval.Translation/Controllers/TranslationEngineTypesController.cs
line 64 at r5 (raw file):
public async Task<ActionResult<LanguageInfoDto>> GetLanguageInfoAsync( [NotNull] string engineType, [NotNull] string languageCode,
This should be language
like everything else.
Previously, ddaspit (Damien Daspit) wrote…
How about make IsNative an enum? IsNative: True|False|N/A |
Previously, ddaspit (Damien Daspit) wrote…
It's .NET 7.3 - I can show you what I am seeing. |
Previously, ddaspit (Damien Daspit) wrote…
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 11 files at r6, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
src/Serval.Translation/Contracts/TranslationEngineConfigDto.cs
line 33 at r6 (raw file):
{ get => _type; set { _type = Engine.ToPascalCase(value); }
It is not the responsibility of DTOs to convert to pascal case. DTOs typically do not contain any logic. The controller is responsible for interpreting the inputs and mapping them to models that the application/service layer can understand.
src/Serval.Translation/Models/Engine.cs
line 14 at r6 (raw file):
{ get => _type; set { _type = ToPascalCase(value); }
It is not the responsibility of models to convert to pascal case. Models typically do not contain any logic. The controller is responsible for interpreting the inputs and mapping them to models that the application/service layer can understand.
src/Serval.Translation/Models/Engine.cs
line 29 at r6 (raw file):
private static partial Regex SnakeCaseRegex(); public static string ToPascalCase(string name)
Use the CaseExtensions library to perform the conversion.
src/Serval.Translation/Models/LanguageInfo.cs
line 9 at r6 (raw file):
{ get => _engineType; set { _engineType = Engine.ToPascalCase(value); }
It is not the responsibility of models to convert to pascal case. Models typically do not contain any logic. The controller is responsible for interpreting the inputs and mapping them to models that the application/service layer can understand.
src/Serval.Translation/Models/Queue.cs
line 7 at r6 (raw file):
public int Size { get; set; } = default; private string _engineType = default!; public string EngineType
It is not the responsibility of models to convert to pascal case. Models typically do not contain any logic. The controller is responsible for interpreting the inputs and mapping them to models that the application/service layer can understand.
src/Serval.Translation/Controllers/TranslationEngineTypesController.cs
line 57 at r6 (raw file):
/// * **EngineType**: See above /// * **IsNative**: Whether language is in the base NLLB-200 model as per [this list](https://github.com/facebookresearch/flores/blob/main/nllb_seed/README.md) /// * **InternalCode**: The ISO 639-3 code that the language maps to according to [these rules](https://github.com/sillsdev/serval/wiki/FLORES%E2%80%90200-Language-Code-Resolution-for-NMT-Engine)
The FLORES-200 codes are not ISO 639-3 codes. They are a combination of ISO 639-3 codes and ISO 15924 codes. They do not conform to any particular standard. Also, this is only applicable to the NMT engine type using NLLB.
Previously, ddaspit (Damien Daspit) wrote…
Ok - it may be a bit more clunky, but I'll change it. |
Previously, ddaspit (Damien Daspit) wrote…
Updated comment. |
Previously, ddaspit (Damien Daspit) wrote…
I'll fix all. |
Previously, ddaspit (Damien Daspit) wrote…
Done - nice find. |
Previously, ddaspit (Damien Daspit) wrote…
Will fix all. |
Previously, ddaspit (Damien Daspit) wrote…
Will fix all. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 11 of 11 files at r7, all commit messages.
Dismissed @Enkidu93 from a discussion.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
src/Serval.Translation/Models/LanguageInfo.cs
line 9 at r6 (raw file):
Previously, johnml1135 (John Lambert) wrote…
Will fix all.
I think you accidentally turned this into a field instead of a property.
tests/Serval.Translation.Tests/Services/EngineModelTests.cs
line 0 at r7 (raw file):
We don't need to unit test a library. You can delete this test fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @Enkidu93 and @johnml1135)
src/Serval.Translation/Models/LanguageInfo.cs
line 8 at r7 (raw file):
public bool IsNative { get; set; } = default!; public string? InternalCode { get; set; } = default!; public string? Name { get; set; } = default!;
Is there an actual requirement for the language name from Scripture Forge? If not, we should just remove it. We don't need it and it will only make it more difficult to implement this in future engines.
Previously, ddaspit (Damien Daspit) wrote…
GAAAAAAAHHHHH! C# has so many subtle things that have big impacts! Fixed. |
Previously, ddaspit (Damien Daspit) wrote…
Hmm. I see what you mean. I will kill it at the API level but keep the guts of it there. If we do want it in the future it will be easy to add. We can also grab data from the ethnologue or similar if needed. |
Previously, ddaspit (Damien Daspit) wrote…
Your right - I forgot that I didn't need this anymore. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 9 of 9 files at r8, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @Enkidu93)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 11 files at r6, 7 of 11 files at r7, 9 of 9 files at r8, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @johnml1135)
Move queue endpoint Resolve reviewer comments Updated from reviewer comments IsSupportedNatively InternalCode UpdatedNames Optional name and InternalCode Update endpoint Update controllers to kebab case - PascalCase internally Update documentation Add pascal case tests updates from reviewer comments respond to reviewer comments. Strip out name from GRPC call
dfe2459
to
602e305
Compare
Move queue endpoint
This change is