feat: modify failure report #240

Merged: 2 commits, Oct 25, 2023

guilhem-barthes (Contributor) commented Sep 7, 2023

Companion PR

Description

Change the `LogsModal` to take a key, which allows retrieving logs that are not directly linked to a compute task.

How to test

Screenshots

Notes for developers and reviewers:

Changelog updated

oleobal (Contributor) left a comment

Nice 👍

guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Sep 8, 2023
## Companion PR

- Substra/orchestrator#277
- Substra/substra-frontend#240

## Description

The aim is to allow registering failure reports not only for compute
tasks but also for other kinds of assets (for now, functions, which are
no longer built as part of the execution of a compute task).

- Modifies `ComputeTaskFailureReport`:
    - renamed the model to `AssetFailureReport`
    - renamed field `compute_task_key` to `asset_key` (as we can now have a
      function key)
    - added field `asset_type` to provide
- Updates the protobuf definitions to reflect the previous changes
- Refactors `download_file` in `PermissionMixin` to provide more
  flexibility (and decouple it from DRF)
- Creates a new `FailableTask` (Celery task):
    - centralizes the logic to submit asset failures
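
The list above can be sketched roughly as follows. This is a plain-Python stand-in, not the backend's code: the real `AssetFailureReport` is a Django model and the real `FailableTask` is a Celery task. The `AssetType` values, the `logs` field type, the `submit` callback and `BuildFunctionTask` are all hypothetical names used for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class AssetType(Enum):
    # Hypothetical values; the real enum is defined by the backend/orchestrator.
    COMPUTE_TASK = "COMPUTE_TASK"
    FUNCTION = "FUNCTION"


@dataclass(frozen=True)
class AssetFailureReport:
    asset_key: str          # formerly compute_task_key
    asset_type: AssetType   # says which kind of asset the key refers to
    logs: str               # assumption: logs kept as plain text here


class FailableTask:
    """Base class that submits one failure report, whatever the asset type."""

    asset_type: AssetType  # set by each subclass

    def __init__(self, submit: Callable[[AssetFailureReport], None]) -> None:
        self._submit = submit

    def run(self, asset_key: str) -> None:
        raise NotImplementedError

    def __call__(self, asset_key: str) -> None:
        try:
            self.run(asset_key)
        except Exception as exc:
            # Single place where failures are turned into reports.
            self._submit(AssetFailureReport(asset_key, self.asset_type, logs=str(exc)))
            raise


class BuildFunctionTask(FailableTask):
    asset_type = AssetType.FUNCTION

    def run(self, asset_key: str) -> None:
        # Simulated failure of a function image build.
        raise RuntimeError(f"image build failed for function {asset_key}")


reports: list[AssetFailureReport] = []
task = BuildFunctionTask(submit=reports.append)
try:
    task("function-key-1")
except RuntimeError:
    pass  # the error is still propagated after the report is submitted
```

Because the report carries both `asset_key` and `asset_type`, a consumer can tell a function build failure from a compute task failure without a separate model per asset kind.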

## How has this been tested?

Because this will be merged into a branch that is itself going to be
merged into a POC branch, we use MNIST as the baseline of a working
model. We will deal with failing tests on the POC before merging into main.

## Checklist

- [x] [changelog](../CHANGELOG.md) was updated with notable changes
- [ ] documentation was updated

---------

Signed-off-by: Guilhem Barthes <[email protected]>
guilhem-barthes added a commit to Substra/orchestrator that referenced this pull request Sep 8, 2023
## Companion PR

- Substra/substra-backend#727
- Substra/substra-frontend#240 

## Description
Modify `FailureReport`:
- add field `asset_type`, containing the kind of asset the failure report
relates to
- rename `compute_task_key` to `asset_key`, which is a [wire-compatible
change](https://groups.google.com/g/protobuf/c/hX4Mj0P4N0w) (i.e. it does
not need to be declared as a new field)
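
To illustrate why the rename is wire compatible: the protobuf binary encoding identifies a field by its number and wire type, and the field name never appears in the serialized bytes. The sketch below hand-encodes a single string field instead of using the protobuf library. The field number `1` and the key value are assumptions for illustration, not taken from the actual `.proto` file.

```python
def encode_string_field(field_number: int, value: str) -> bytes:
    """Encode one length-delimited field (wire type 2) the way protobuf does."""
    payload = value.encode("utf-8")
    # Keep the sketch to single-byte varints for both the tag and the length.
    assert field_number < 16 and len(payload) < 128
    tag = (field_number << 3) | 2
    return bytes([tag, len(payload)]) + payload


def decode_string_fields(data: bytes, names: dict[int, str]) -> dict[str, str]:
    """Decode a message made only of short string fields, naming them via `names`."""
    decoded: dict[str, str] = {}
    i = 0
    while i < len(data):
        field_number = data[i] >> 3
        length = data[i + 1]
        decoded[names[field_number]] = data[i + 2 : i + 2 + length].decode("utf-8")
        i += 2 + length
    return decoded


# Hypothetical schemas: same field number, different name.
OLD_SCHEMA = {1: "compute_task_key"}
NEW_SCHEMA = {1: "asset_key"}

wire = encode_string_field(1, "a420306f")
old_view = decode_string_fields(wire, OLD_SCHEMA)
new_view = decode_string_fields(wire, NEW_SCHEMA)
```

Both schemas read the same bytes and recover the same value, which is why a rename alone does not require a new field number. Formats that carry field names, such as the protobuf JSON mapping, are a separate matter.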

## How has this been tested?

Because this will be merged into a branch that is itself going to be
merged into a POC branch, we use MNIST as the baseline of a working
model. We will deal with failing tests on the POC before merging into main.

The e2e tests are also broken due to an issue with producing dumps during
release, but they passed locally.

## Checklist

- [x] [changelog](../CHANGELOG.md) was updated with notable changes
- [ ] documentation was updated

---------

Signed-off-by: Guilhem Barthes <[email protected]>
guilhem-barthes added a commit to Substra/orchestrator that referenced this pull request Sep 13, 2023
Signed-off-by: Guilhem Barthes <[email protected]>
Signed-off-by: Guilhem Barthés <[email protected]>
thbcmlowk (Contributor) commented

Do we need to wait for the companion PRs to be merged into their respective main before merging this?

SdgJlbl (Contributor) commented Sep 19, 2023

> Do we need to wait for the companion PRs to be merged into their respective main before merging this?

Yes, we do indeed

guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Sep 26, 2023
guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Oct 6, 2023
guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Oct 6, 2023
guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Oct 6, 2023
guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Oct 6, 2023
@Milouu Milouu force-pushed the feat/modify-failure-report branch from 77b033e to dfac745 Compare October 9, 2023 14:34
@thbcmlowk thbcmlowk force-pushed the feat/modify-failure-report branch from dfac745 to 73ae061 Compare October 11, 2023 13:45
Signed-off-by: Guilhem Barthes <[email protected]>
Signed-off-by: Guilhem Barthes <[email protected]>
Signed-off-by: SdgJlbl <[email protected]>
@SdgJlbl SdgJlbl force-pushed the feat/modify-failure-report branch from 73ae061 to e68e858 Compare October 20, 2023 08:45
SdgJlbl pushed a commit to Substra/orchestrator that referenced this pull request Oct 25, 2023
## Details 
- Substra/substra-backend#714

Add function events, which are needed now that the building of the
function is decoupled from the execution of the compute task. To support
this, it adds a status field on the `Function`. It also includes another
PR (merged here) to get function build logs working again.

Fixes FL-1160

Because this will be merged into a branch that is itself going to be
merged into a POC branch, we use MNIST as the baseline of a working
model. We will deal with failing tests on the POC before merging into main.

## Companion PR

* orchestrator: #310
* backend: Substra/substra-backend#756
* frontend: Substra/substra-frontend#240
* substra-generator: owkin/substra-generator#131

## Misc
- [x] [changelog](../CHANGELOG.md) was updated with notable changes
- [ ] documentation was updated

---------


Signed-off-by: Guilhem Barthes <[email protected]>
Signed-off-by: Guilhem Barthés <[email protected]>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Signed-off-by: thbcmlowk <[email protected]>
Co-authored-by: Guilhem Barthés <[email protected]>
Co-authored-by: guilhem-barthes <[email protected]>
Co-authored-by: thbcmlowk <[email protected]>
@SdgJlbl SdgJlbl merged commit ce8fbd5 into main Oct 25, 2023
@SdgJlbl SdgJlbl deleted the feat/modify-failure-report branch October 25, 2023 13:12
guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Oct 25, 2023
guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Feb 8, 2024
guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Feb 12, 2024
guilhem-barthes added a commit to Substra/substra-backend that referenced this pull request Feb 12, 2024
* feat: decouple image builder from worker

Signed-off-by: SdgJlbl <[email protected]>

* fix: update skaffold config

Signed-off-by: Guilhem Barthes <[email protected]>

* feat: add `ServiceAccount` and modify role

Signed-off-by: Guilhem Barthes <[email protected]>

* feat: build image in new pod

Signed-off-by: Guilhem Barthes <[email protected]>

* chore: rename `deployment-builder.yaml` to `stateful-builder.yaml`

Signed-off-by: Guilhem Barthes <[email protected]>

* chore: rename `stateful-builder.yaml` to `statefulset-builder.yaml`

Signed-off-by: Guilhem Barthes <[email protected]>

* chore: centralize params

Signed-off-by: Guilhem Barthes <[email protected]>

* feat: create `BuildTask`

Signed-off-by: Guilhem Barthes <[email protected]>

* feat: move more code to `builder`

Signed-off-by: Guilhem Barthes <[email protected]>

* fix: remove TaskProfiling as Celery task + save Entrypoint in DB

Signed-off-by: SdgJlbl <[email protected]>

* feat: build function at registration (#707)


- [ ] [changelog](../CHANGELOG.md) was updated with notable changes
- [ ] documentation was updated

---------

Signed-off-by: SdgJlbl <[email protected]>
Signed-off-by: Guilhem Barthes <[email protected]>
Co-authored-by: SdgJlbl <[email protected]>

* feat: share images between backends (#708)



Signed-off-by: SdgJlbl <[email protected]>

* chore: update helm workflow

Signed-off-by: ThibaultFy <[email protected]>

* [sub]fix: add missing migration poc (#728)

## Description

Add a migration missing in the poc. 
This migration alters two things:

-  modify `ComputeTaskFailureReport.logs` 
-  modify `FunctionImage.file`

This migration has been generated automatically with `make migrations`

## How has this been tested?


## Checklist

- [ ] [changelog](../CHANGELOG.md) was updated with notable changes
- [ ] documentation was updated

Signed-off-by: Guilhem Barthes <[email protected]>

* [sub]feat: add function events (#714)

- Substra/orchestrator#263

Add function events, which are needed now that the building of the
function is decoupled from the execution of the compute task. To support
this, it adds a status field on the `Function`. It also includes another
PR (merged here) to get function build logs working again.

In a future PR, we will change the compute task execution to avoid
having to call `wait_for_function_built` in `compute_task()`.

Fixes FL-1160

Because this will be merged into a branch that is itself going to be
merged into a POC branch, we use MNIST as the baseline of a working
model. We will deal with failing tests on the POC before merging into main.

- [x] [changelog](../CHANGELOG.md) was updated with notable changes
- [ ] documentation was updated

---------

Signed-off-by: SdgJlbl <[email protected]>
Signed-off-by: Guilhem Barthes <[email protected]>
Signed-off-by: Guilhem Barthés <[email protected]>
Co-authored-by: SdgJlbl <[email protected]>

* [sub]fix(app/orchestrator/resources): FunctionStatus.FUNCTION_STATUS_CREATED -> FunctionStatus.FUNCTION_STATUS_WAITING (#742)

# Issue

Backend `FunctionStatus` values are not aligned with the [orchestrator
definitions](https://github.com/Substra/orchestrator/blob/poc-decoupled-builder/lib/asset/function.proto#L29-L36).
In particular, the backend still uses `FunctionStatus.FUNCTION_STATUS_CREATED`,
which leads to the following error:

```txt
ValueError: 'FUNCTION_STATUS_WAITING' is not a valid FunctionStatus
```

## Description

FunctionStatus.FUNCTION_STATUS_CREATED ->
FunctionStatus.FUNCTION_STATUS_WAITING
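
A minimal reproduction of this kind of mismatch, assuming the backend status is a standard Python `Enum` that is looked up by value when a status string is received. The two enum classes and the `parse` helper below are hypothetical and only contain the member relevant to the error.

```python
from enum import Enum


class FunctionStatus(Enum):
    # Hypothetical "before" state: the backend still uses the old name.
    FUNCTION_STATUS_CREATED = "FUNCTION_STATUS_CREATED"


class FixedFunctionStatus(Enum):
    # Hypothetical "after" state: the member matches the orchestrator's name.
    FUNCTION_STATUS_WAITING = "FUNCTION_STATUS_WAITING"


def parse(enum_cls, raw):
    """Look a member up by value, as done when reading a status string."""
    return enum_cls(raw)


try:
    parse(FunctionStatus, "FUNCTION_STATUS_WAITING")
    error = None
except ValueError as exc:
    # Lookup by an unknown value raises ValueError.
    error = str(exc)

fixed = parse(FixedFunctionStatus, "FUNCTION_STATUS_WAITING")
```

With the old member name, the lookup raises `ValueError`; once the member is renamed to match the orchestrator, the same status string resolves normally.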

## How has this been tested?

Running Camelyon benchmark on
[poc-builder-flpc](https://substra.org-1.poc-builder-flpc.cg.owkin.tech/compute_plans/a420306f-5719-412b-ab9c-688b7bed9c70/tasks?page=1&ordering=-rank)
environment.

## Checklist

- [ ] [changelog](../CHANGELOG.md) was updated with notable changes
- [ ] documentation was updated

---------

Signed-off-by: Thibault Camalon <[email protected]>

* fix: rebase changelog

Signed-off-by: Guilhem Barthés <[email protected]>

* feat: save status update in orc

Signed-off-by: Guilhem Barthes <[email protected]>

* feat: use status for build waiting

Signed-off-by: Guilhem Barthes <[email protected]>

* fix: re-add `container_image_exists`

Signed-off-by: Guilhem Barthes <[email protected]>

* fix: rebase errors

Signed-off-by: Guilhem Barthes <[email protected]>

* fix: format

Signed-off-by: Guilhem Barthes <[email protected]>

* fix: tests

Signed-off-by: Guilhem Barthes <[email protected]>

* fix: add `si` to building invocations

Signed-off-by: Guilhem Barthes <[email protected]>

* fix: tests

Signed-off-by: Guilhem Barthes <[email protected]>

* fix: apply feedback

Signed-off-by: Guilhem Barthes <[email protected]>

* fix: only import during typing

Signed-off-by: Guilhem Barthes <[email protected]>

* [sub]feat: modify computetask failure report (#727)

* feat: add config to run celery in tests

Signed-off-by: Guilhem Barthés <[email protected]>

* feat: add tests

Signed-off-by: Guilhem Barthés <[email protected]>

* fix: remove rebase duplicate

Signed-off-by: Guilhem Barthés <[email protected]>

* docs: changelog

Signed-off-by: Guilhem Barthés <[email protected]>

* fix: adapt to pydantic 2.x.x

Signed-off-by: Guilhem Barthés <[email protected]>

* fix: remove rebase artifacts

Signed-off-by: Guilhem Barthés <[email protected]>

* fix: update to pydantic 2.x.x

Signed-off-by: Guilhem Barthés <[email protected]>

---------

Signed-off-by: SdgJlbl <[email protected]>
Signed-off-by: Guilhem Barthes <[email protected]>
Signed-off-by: ThibaultFy <[email protected]>
Signed-off-by: Guilhem Barthés <[email protected]>
Signed-off-by: Thibault Camalon <[email protected]>
Co-authored-by: SdgJlbl <[email protected]>
Co-authored-by: ThibaultFy <[email protected]>
Co-authored-by: Thibault Camalon <[email protected]>