Add Test to verify memory logs (#1191)
* Typo fix Signed-off-by: Chaurasiya, Payal <[email protected]>
* Add memory logs test Signed-off-by: Chaurasiya, Payal <[email protected]>
* Add memory test in e2e Signed-off-by: Chaurasiya, Payal <[email protected]>
* Add memory test in e2e Signed-off-by: Chaurasiya, Payal <[email protected]>
* Simplify setup (#1187)
  * Simplify setup.py Signed-off-by: Shah, Karan <[email protected]>
  * Remove tensorboardX and never-used log_metric code Signed-off-by: Shah, Karan <[email protected]>
  * Test reducing requirements Signed-off-by: Shah, Karan <[email protected]>
  * Revert "Remove tensorboardX and never-used log_metric code" and add fn calls Signed-off-by: Shah, Karan <[email protected]>
  * Revert tb removal Signed-off-by: Shah, Karan <[email protected]>
  * Disable tensorboard logging for gramine CI test Signed-off-by: Shah, Karan <[email protected]>
  --------- Signed-off-by: Shah, Karan <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* Fix for TF (2.13) cnn histology workspace 'Adam' object has no attribute 'weights' issue (#1194) Signed-off-by: yes <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* fix(deps): requests min version set to 2.32.0 (#1198) Signed-off-by: Pant, Akshay <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* Pin GaNDLF version to 0.1.1 (#1179)
  * fix(gandlf ci): pinned torchaudio version Signed-off-by: Pant, Akshay <[email protected]>
  * fix(gandlf ci): install torch without cache Signed-off-by: Pant, Akshay <[email protected]>
  * fix(gandlf ci): upgrade torch to 2.5.0 Signed-off-by: Pant, Akshay <[email protected]>
  * fix(gandlf ci): pin gandlf to 0.1.1 Signed-off-by: Pant, Akshay <[email protected]>
  --------- Signed-off-by: Pant, Akshay <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* Migrate `shell/*` to `scripts/*` (#1193)
  * Update distribution scripts Signed-off-by: Shah, Karan <[email protected]>
  * Migrate shell/ to scripts/ Signed-off-by: Shah, Karan <[email protected]>
  * Remove lint test from ubuntu CI Signed-off-by: Shah, Karan <[email protected]>
  --------- Signed-off-by: Shah, Karan <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* Set timeouts for all CI workflows (#1200)
  * Set timeouts for all CI workflows Signed-off-by: Shah, Karan <[email protected]>
  * forgot to add this too Signed-off-by: Shah, Karan <[email protected]>
  --------- Signed-off-by: Shah, Karan <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* Review comments Signed-off-by: Chaurasiya, Payal <[email protected]>
* Add details in file for further use Signed-off-by: Chaurasiya, Payal <[email protected]>
* Fix for tf_3dunet_barts workspace (#1197) Signed-off-by: yes <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* Update task_runner_e2e.yml Signed-off-by: Chaurasiya, Payal <[email protected]>
* OpenFL roadmap update (#1196)
  * OpenFL 1.7 roadmap update Signed-off-by: Teodor Parvanov <[email protected]>
  * Addressing review comments Signed-off-by: Teodor Parvanov <[email protected]>
  --------- Signed-off-by: Teodor Parvanov <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* Update log verbosity (#1202) Signed-off-by: Shah, Karan <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* Restore `openfl-tutorials` as installable package (#1203)
  * Add openfl-tutorials as package Signed-off-by: Shah, Karan <[email protected]>
  * Add __init__.py Signed-off-by: Shah, Karan <[email protected]>
  * Add nbformat pkg Signed-off-by: Shah, Karan <[email protected]>
  * Try localhost Signed-off-by: Shah, Karan <[email protected]>
  * Revert "Try localhost" This reverts commit 44b8304. Signed-off-by: Shah, Karan <[email protected]>
  * Try python3.10 Signed-off-by: Shah, Karan <[email protected]>
  * Try localhost Signed-off-by: Shah, Karan <[email protected]>
  --------- Signed-off-by: Shah, Karan <[email protected]> Signed-off-by: Chaurasiya, Payal <[email protected]>
* Fix for ubuntu Signed-off-by: Chaurasiya, Payal <[email protected]>
* Writing memory details in json Signed-off-by: Chaurasiya, Payal <[email protected]>
* Update task_runner_e2e.yml Signed-off-by: Chaurasiya, Payal <[email protected]>
* E501 fix Signed-off-by: Chaurasiya, Payal <[email protected]>
* Lint changes Signed-off-by: Chaurasiya, Payal <[email protected]>
---------
Signed-off-by: Chaurasiya, Payal <[email protected]>
Signed-off-by: Shah, Karan <[email protected]>
Signed-off-by: yes <[email protected]>
Signed-off-by: Pant, Akshay <[email protected]>
Signed-off-by: Teodor Parvanov <[email protected]>
Co-authored-by: Karan Shah <[email protected]>
Co-authored-by: Shailesh Tanwar <[email protected]>
Co-authored-by: Akshay Pant <[email protected]>
Co-authored-by: teoparvanov <[email protected]>
1 parent 4ac3430, commit 44614b3
Showing 9 changed files with 268 additions and 51 deletions.
@@ -230,3 +230,70 @@ jobs:
        with:
          name: tr_no_client_auth_${{ env.MODEL_NAME }}_python${{ env.PYTHON_VERSION }}_${{ github.run_id }}
          path: result.tar

  test_memory_logs:
    name: tr_tls_memory_logs
    runs-on: ubuntu-22.04
    timeout-minutes: 15
    strategy:
      matrix:
        # Testing memory logs only for the torch_cnn_mnist model and Python 3.10
        # If required, this can be extended to other models and Python versions
        model_name: ["torch_cnn_mnist"]
        python_version: ["3.10"]
      fail-fast: false # do not immediately fail if one of the combinations fails

    env:
      MODEL_NAME: ${{ matrix.model_name }}
      PYTHON_VERSION: ${{ matrix.python_version }}

    steps:
      - name: Checkout OpenFL repository
        id: checkout_openfl
        uses: actions/[email protected]
        with:
          fetch-depth: 2 # needed for detecting changes
          submodules: "true"
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Set up Python
        id: setup_python
        uses: actions/setup-python@v3
        with:
          python-version: ${{ env.PYTHON_VERSION }}

      - name: Install dependencies
        id: install_dependencies
        run: |
          python -m pip install --upgrade pip
          pip install .
          pip install -r test-requirements.txt
      - name: Run Task Runner E2E memory logs tests
        id: run_tests
        run: |
          python -m pytest -s tests/end_to_end/test_suites/memory_logs_tests.py \
            --model_name ${{ env.MODEL_NAME }} --num_rounds ${{ env.NUM_ROUNDS }} \
            --num_collaborators ${{ env.NUM_COLLABORATORS }} --log_memory_usage
          echo "Task runner memory logs test run completed"
      - name: Print test summary
        id: print_test_summary
        if: ${{ always() }}
        run: |
          export PYTHONPATH="$PYTHONPATH:."
          python tests/end_to_end/utils/summary_helper.py
          echo "Test summary printed"
      - name: Tar files
        id: tar_files
        if: ${{ always() }}
        run: tar -cvf result.tar results

      - name: Upload Artifacts
        id: upload_artifacts
        uses: actions/upload-artifact@v4
        if: ${{ always() }}
        with:
          name: tr_tls_memory_logs_${{ env.MODEL_NAME }}_python${{ env.PYTHON_VERSION }}_${{ github.run_id }}
          path: result.tar
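
The `run_tests` step passes `--model_name`, `--num_rounds`, `--num_collaborators` and `--log_memory_usage` to pytest, and the new test reads them back as attributes of `request.config`. The `conftest.py` that wires this up is not shown in this view, so the following is only a sketch of how such flags could be registered with pytest's `pytest_addoption` and `pytest_configure` hooks. The default values, help strings, and marker description are assumptions, not the project's actual settings.

```python
# conftest.py (hypothetical sketch, not the project's actual file)

# Option names taken from the workflow command line; everything else is assumed.
OPTION_NAMES = ("model_name", "num_rounds", "num_collaborators", "log_memory_usage")


def pytest_addoption(parser):
    """Register the command-line flags used by the workflow step."""
    parser.addoption("--model_name", action="store", default="torch_cnn_mnist",
                     help="Model template to run (default is an assumption)")
    parser.addoption("--num_rounds", action="store", type=int, default=5,
                     help="Number of federation rounds (default is an assumption)")
    parser.addoption("--num_collaborators", action="store", type=int, default=2,
                     help="Number of collaborators (default is an assumption)")
    parser.addoption("--log_memory_usage", action="store_true", default=False,
                     help="Enable memory usage logging for the run")


def pytest_configure(config):
    """Expose parsed options as config attributes and register the marker."""
    for name in OPTION_NAMES:
        # Tests can then read e.g. request.config.num_rounds
        setattr(config, name, config.getoption(name))
    config.addinivalue_line(
        "markers", "log_memory_usage: tests that verify memory usage logs"
    )
```

Because `--log_memory_usage` is a `store_true` flag in this sketch, omitting it leaves `request.config.log_memory_usage` as `False`, which is the case the test skips on.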
@@ -0,0 +1,73 @@
# Copyright 2020-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import pytest
import logging
import os
import json

from tests.end_to_end.utils import federation_helper as fed_helper

log = logging.getLogger(__name__)


@pytest.mark.log_memory_usage
def test_log_memory_usage(request, fx_federation):
    """
    Test the memory usage logging functionality in a federated learning setup.
    Parameters:
        - request: The pytest request object containing configuration options.
        - fx_federation: The fixture representing the federated learning setup.
    Steps:
        1. Skip the test if memory usage logging is disabled.
        2. Set up PKI for trusted communication if TLS is enabled.
        3. Start the federation and verify its completion.
        4. Verify the existence of memory usage logs for the aggregator.
        5. Verify the memory usage details for each round.
        6. Verify the existence and details of memory usage logs for each collaborator.
        7. Log the availability of memory usage details for all participants.
    """
    # Skip the test if request.config.log_memory_usage is False
    if not request.config.log_memory_usage:
        pytest.skip("Memory usage logging is disabled")

    # Set up PKI for trusted communication within the federation
    if request.config.use_tls:
        assert fed_helper.setup_pki(fx_federation), "Failed to setup PKI for trusted communication"

    # Start the federation
    results = fed_helper.run_federation(fx_federation)

    # Verify the completion of the federation run
    assert fed_helper.verify_federation_run_completion(
        fx_federation, results, num_rounds=request.config.num_rounds
    ), "Federation completion failed"

    # Verify the aggregator memory logs
    aggregator_memory_usage_file = os.path.join(
        fx_federation.workspace_path, "logs", "aggregator_memory_usage.json"
    )
    assert os.path.exists(aggregator_memory_usage_file), \
        "Aggregator memory usage file is not available"

    # Load the aggregator memory usage details (the context manager closes the file)
    with open(aggregator_memory_usage_file) as f:
        memory_usage_dict = json.load(f)

    # Check that there is one memory usage entry per round
    assert len(memory_usage_dict) == request.config.num_rounds, \
        "Memory usage details are not available for all rounds"

    # Check the memory usage entries for each collaborator
    for collaborator in fx_federation.collaborators:
        collaborator_memory_usage_file = os.path.join(
            fx_federation.workspace_path,
            "logs",
            f"{collaborator.collaborator_name}_memory_usage.json",
        )

        assert os.path.exists(collaborator_memory_usage_file), \
            f"Memory usage file for collaborator {collaborator.collaborator_name} is not available"

        with open(collaborator_memory_usage_file) as f:
            memory_usage_dict = json.load(f)

        assert len(memory_usage_dict) == request.config.num_rounds, \
            f"Memory usage details are not available for all rounds for collaborator {collaborator.collaborator_name}"

    log.info("Memory usage details are available for all participants")
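
The test applies the same three checks to the aggregator log and to every collaborator log: the file exists, it parses as JSON, and it holds one entry per round. One way to avoid the repetition would be a small helper like the one below. This is only a sketch: `check_memory_log` is a hypothetical name that does not appear in the commit, and it assumes, as the assertions in the test do, that each JSON file contains one top-level entry per round.

```python
import json
import os


def check_memory_log(path, num_rounds):
    """Load a memory usage JSON file and verify it has one entry per round.

    Hypothetical helper; returns the parsed entries for further inspection.
    """
    assert os.path.exists(path), f"Memory usage file {path} is not available"

    # Use a context manager so the file handle is closed promptly
    with open(path) as f:
        entries = json.load(f)

    assert len(entries) == num_rounds, (
        f"Expected {num_rounds} memory usage entries in {path}, found {len(entries)}"
    )
    return entries
```

With such a helper, the test body could call it once with the aggregator path and once per collaborator inside the loop, passing `request.config.num_rounds` each time.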