Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added to build properties #632

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

mudiagaobrikisil
Copy link
Collaborator

@mudiagaobrikisil mudiagaobrikisil commented Feb 20, 2025

Added and tested fineTuneProgress and InferenceProgress to the build object. I updated the UpdateStatusAsync test to test it.


This change is Reviewable

@johnml1135
Copy link
Collaborator

Note: The tests are failing - please review.

@johnml1135
Copy link
Collaborator

src/Machine/src/Serval.Machine.Shared/Services/ClearMLMonitorService.cs line 145 at r1 (raw file):

                            if (task.Runtime.TryGetValue("finetune", out string? fineTuneStr))
                                fineTuneProgress = int.Parse(fineTuneStr, CultureInfo.InvariantCulture) / 100.0;
                            if (task.Runtime.TryGetValue("inference", out string? inferenceStr))

We will need to verify the naming and integration with Machine.py before releasing.

@johnml1135
Copy link
Collaborator

src/Machine/src/Serval.Machine.Shared/Services/ClearMLMonitorService.cs line 173 at r1 (raw file):

                                    task.LastIteration ?? 0,
                                    percentCompleted: 1.0,
                                    fineTuneProgress: 0.5,

"complete" would be 1.0, not 0.5.

@johnml1135
Copy link
Collaborator

src/Serval/test/Serval.Translation.Tests/Services/PlatformServiceTests.cs line 95 at r1 (raw file):

        Assert.That(env.Builds.Get("b0").PercentCompleted, Is.EqualTo(0.5));
        Assert.That(env.Builds.Get("b0").FineTuneProgress, Is.EqualTo(0.25));
        Assert.That(env.Builds.Get("b0").InferenceProgress, Is.EqualTo(0.3));

Were you able to confirm that the values are updated in manual test?

Copy link
Collaborator

@johnml1135 johnml1135 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 9 files at r1, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @mudiagaobrikisil)

Copy link
Contributor

@ddaspit ddaspit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 9 of 9 files at r1, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @mudiagaobrikisil)


src/Machine/src/Serval.Machine.Shared/Services/ClearMLMonitorService.cs line 101 at r1 (raw file):

                            step: 0,
                            percentCompleted: 0.0,
                            fineTuneProgress: 0.0,

ProgressStatus is a general-purpose model from Machine, so we shouldn't add Serval-specific properties to it. We should create a new progress status model class. It could inherit from ProgressStatus if that makes sense.


src/Serval/src/Serval.Grpc/Protos/serval/translation/v1/platform.proto line 24 at r1 (raw file):

    int32 step = 2;
    optional double percent_completed = 3;
    optional double fine_tune_progress = 4;

I would rename this to train_progress.


src/Serval/src/Serval.Grpc/Protos/serval/translation/v1/platform.proto line 25 at r1 (raw file):

    optional double percent_completed = 3;
    optional double fine_tune_progress = 4;
    optional double inference_progress = 5;

I would rename this to pretranslate_progress.


src/Serval/src/Serval.Translation/Models/Build.cs line 13 at r1 (raw file):

    public int Step { get; init; }
    public double? PercentCompleted { get; init; }
    public double? FineTuneProgress { get; init; }

I would rename this to TrainProgress.


src/Serval/src/Serval.Translation/Models/Build.cs line 14 at r1 (raw file):

    public double? PercentCompleted { get; init; }
    public double? FineTuneProgress { get; init; }
    public double? InferenceProgress { get; init; }

I would rename this to PretranslateProgress.


src/Serval/src/Serval.Translation/Models/Build.cs line 21 at r1 (raw file):

    public IReadOnlyDictionary<string, object>? Options { get; init; }
    public string? DeploymentVersion { get; init; }
    public IReadOnlyDictionary<string, string> ExecutionData { get; init; } = new Dictionary<string, string>();

@johnml1135 What do we think about putting the train and pretranslate progress in ExecutionData? It could give us more flexibility, but it wouldn't be standardized.

Copy link
Collaborator Author

@mudiagaobrikisil mudiagaobrikisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ddaspit and @johnml1135)


a discussion (no related file):

Previously, johnml1135 (John Lambert) wrote…

Note: The tests are failing - please review.

Checking it. They run very well offline so I am trying to figure out what the issue is.


src/Machine/src/Serval.Machine.Shared/Services/ClearMLMonitorService.cs line 101 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

ProgressStatus is a general-purpose model from Machine, so we shouldn't add Serval-specific properties to it. We should create a new progress status model class. It could inherit from ProgressStatus if that makes sense.

Noted


src/Machine/src/Serval.Machine.Shared/Services/ClearMLMonitorService.cs line 173 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

"complete" would be 1.0, not 0.5.

Thanks, done. Just used 0.5 to test but have corrected it to 1


src/Serval/src/Serval.Grpc/Protos/serval/translation/v1/platform.proto line 25 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I would rename this to pretranslate_progress.

Noted


src/Serval/src/Serval.Translation/Models/Build.cs line 13 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

I would rename this to TrainProgress.

Noted


src/Serval/src/Serval.Translation/Models/Build.cs line 21 at r1 (raw file):

Previously, ddaspit (Damien Daspit) wrote…

@johnml1135 What do we think about putting the train and pretranslate progress in ExecutionData? It could give us more flexibility, but it wouldn't be standardized.

Noted

@johnml1135
Copy link
Collaborator

src/Serval/src/Serval.Translation/Models/Build.cs line 21 at r1 (raw file):

Previously, mudiagaobrikisil wrote…

Noted

ExecutionData should support it fine - I'm all for it.

@johnml1135
Copy link
Collaborator

Previously, mudiagaobrikisil wrote…

Checking it. They run very well offline so I am trying to figure out what the issue is.

The tests may be failing from the updates from Machine - Let's merge those in and then rebase- that may fix it.

@johnml1135
Copy link
Collaborator

Previously, johnml1135 (John Lambert) wrote…

The tests may be failing from the updates from Machine - Let's merge those in and then rebase- that may fix it.

Ah - It may need the Machine updated to be merged in and released first for the tests to pass...

@johnml1135
Copy link
Collaborator

src/Machine/src/Serval.Machine.Shared/Services/ClearMLMonitorService.cs line 173 at r1 (raw file):

Previously, mudiagaobrikisil wrote…

Thanks, done. Just used 0.5 to test but have corrected it to 1

I don't see the updates pushed...

Copy link
Collaborator Author

@mudiagaobrikisil mudiagaobrikisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @ddaspit and @johnml1135)


src/Machine/src/Serval.Machine.Shared/Services/ClearMLMonitorService.cs line 173 at r1 (raw file):

Previously, johnml1135 (John Lambert) wrote…

I don't see the updates pushed...

I have not pushed yet. I will push with other updates. Was essentially waiting for the structure that Damien dropped for the Progress Status, as that will affect a lot

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants