Added config parsing check to GaNDLF task runner #908

Merged
merged 40 commits into from
Feb 2, 2024
c70ef8b
added config parsing check
sarthakpati Jan 16, 2024
9a94403
updated usage
sarthakpati Jan 17, 2024
e70836f
using latest gandlf tag, and updated plan initialization command
sarthakpati Jan 17, 2024
4249d22
updated message
sarthakpati Jan 17, 2024
89d0450
renamed
sarthakpati Jan 17, 2024
daba8c3
minor rename
sarthakpati Jan 17, 2024
600f314
no need for this comment
sarthakpati Jan 17, 2024
a3c1d68
let's see if this works
sarthakpati Jan 17, 2024
9f4ba4e
updated ignore
sarthakpati Jan 17, 2024
3931c22
trying to set the paths
sarthakpati Jan 17, 2024
bbddefc
fixed paths
sarthakpati Jan 17, 2024
f15e7dc
added debug
sarthakpati Jan 17, 2024
0454e8c
added an assert
sarthakpati Jan 17, 2024
149f8cc
better check for empty dict
sarthakpati Jan 17, 2024
d1d8477
no need for this `mkdir` command
sarthakpati Jan 17, 2024
ba7526f
this is not really needed in the workflow
sarthakpati Jan 18, 2024
76ccd07
added the plan initialization in the test
sarthakpati Jan 18, 2024
a3e764a
Add default value for --gandlf_config argument
sarthakpati Jan 18, 2024
d7b970f
checking if this worked
sarthakpati Jan 18, 2024
23685af
trying using `pwd`
sarthakpati Jan 23, 2024
24c4164
should fix lint
sarthakpati Jan 23, 2024
b35c58e
check file presence
sarthakpati Jan 23, 2024
72f6c89
removed trailing whitespace
sarthakpati Jan 23, 2024
2c3fe98
checking another path
sarthakpati Jan 23, 2024
2c81661
checking copy to a known location
sarthakpati Jan 23, 2024
84e73a1
different path
sarthakpati Jan 23, 2024
9f717ad
trying something else
sarthakpati Jan 24, 2024
7790796
Fix plan initialization
psfoley Feb 1, 2024
656f15d
Merge pull request #2 from psfoley/fix_plan_initialization
sarthakpati Feb 1, 2024
9735588
Fix plan initialization for GaNDLF
psfoley Feb 1, 2024
70c5a7b
Attempt to add missing param
psfoley Feb 1, 2024
095c6b7
better way to initialize default
sarthakpati Feb 2, 2024
db8955d
using the 3d patch instead of the default 2d one
sarthakpati Feb 2, 2024
3634fe9
lint fix
sarthakpati Feb 2, 2024
6cc1c9b
this should be there
sarthakpati Feb 2, 2024
915bf2e
lint fix
sarthakpati Feb 2, 2024
9155f32
this should fix it
sarthakpati Feb 2, 2024
8f2a74b
using 2d data for unit test instead of 3d
sarthakpati Feb 2, 2024
25c91b2
trying something else
sarthakpati Feb 2, 2024
30d48aa
added a few comments
sarthakpati Feb 2, 2024
54 changes: 39 additions & 15 deletions .github/workflows/fets-challenge.yml
@@ -1,7 +1,7 @@
# This workflow will install Python dependencies, run tests and lint with a single version of Python
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: FeTS Challenge TaskRunner
name: GaNDLF TaskRunner

on:
pull_request:
@@ -26,25 +26,49 @@ jobs:
python -m pip install --upgrade pip
pip install torch==2.1.0+cpu torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
pip install .
- name: Setup FeTS Challenge Prerequisites
uses: actions/checkout@master
with:
repository: MLCommons/GaNDLF
ref: master
fetch-depth: 1
path: fets_challenge
- name: FeTS Challenge Task Runner Test
- name: Install GaNDLF
run: |
git clone https://github.com/MLCommons/GaNDLF.git ./gandlf
cd gandlf
git fetch --tags
echo "Checkout the latest GaNDLF tag"
latestTag=$(git describe --tags "$(git rev-list --tags --max-count=1)")
git checkout $latestTag
- name: GaNDLF Task Runner Test
run: |
cd fets_challenge
cd gandlf
pwd
pip install -e .
pip uninstall onnx -y
# Download data and Split CSVs into training and validation
cat ./GANDLF/version.py
echo "Download data and Split CSVs into training and validation"
python -c "from testing.test_full import test_generic_download_data, test_generic_constructTrainingCSV; test_generic_download_data(); test_generic_constructTrainingCSV()"
head -n 1 testing/data/train_3d_rad_segmentation.csv > /home/runner/work/openfl/openfl/valid.csv
tail -n +9 testing/data/train_3d_rad_segmentation.csv >> /home/runner/work/openfl/openfl/valid.csv
head -n 8 testing/data/train_3d_rad_segmentation.csv > /home/runner/work/openfl/openfl/train.csv
head -n 1 testing/data/train_2d_rad_segmentation.csv > /home/runner/work/openfl/openfl/valid.csv
tail -n +9 testing/data/train_2d_rad_segmentation.csv >> /home/runner/work/openfl/openfl/valid.csv
head -n 8 testing/data/train_2d_rad_segmentation.csv > /home/runner/work/openfl/openfl/train.csv
cp testing/config_segmentation.yaml /home/runner/work/openfl/openfl/config_segmentation.yaml
echo "DEBUG display the config file"
cat /home/runner/work/openfl/openfl/config_segmentation.yaml
echo "Initialize OpenFL plan"
## from docs
export WORKSPACE_TEMPLATE=gandlf_seg_test
export WORKSPACE_PATH=./my_federation
fx workspace create --prefix ${WORKSPACE_PATH} --template ${WORKSPACE_TEMPLATE}
cd ${WORKSPACE_PATH}
mkdir ./data/one
mkdir ./data/two
cp /home/runner/work/openfl/openfl/*.csv ./data/one/
cp /home/runner/work/openfl/openfl/*.csv ./data/two/
## from docs
# fx plan initialize --gandlf_config ../testing/config_segmentation.yaml
cd /home/runner/work/openfl/openfl
ls
python -m tests.github.test_gandlf --template gandlf_seg_test --fed_workspace aggregator --col1 one --col2 two --rounds-to-train 1
file "/home/runner/work/openfl/openfl/config_segmentation.yaml"
## for 2d data, only a single change is needed in the gandlf config
sed -i 's/# n_channels: 3/num_channels: 3/g' "/home/runner/work/openfl/openfl/config_segmentation.yaml"
## for 3d data, the following changes are needed in the gandlf config -- commented out for now
# sed -i 's/dimension: 2/dimension: 3/g' "/home/runner/work/openfl/openfl/config_segmentation.yaml"
# sed -i 's/0,255/0,1/g' "/home/runner/work/openfl/openfl/config_segmentation.yaml"
# sed -i 's/128,128/32,32,32/g' "/home/runner/work/openfl/openfl/config_segmentation.yaml"
python -m tests.github.test_gandlf --template gandlf_seg_test --fed_workspace aggregator --col1 one --col2 two --rounds-to-train 1 --gandlf_config "/home/runner/work/openfl/openfl/config_segmentation.yaml"
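The `head`/`tail` commands in this workflow step split one GaNDLF CSV into a training file (the header plus the first seven data rows) and a validation file (the header plus everything from line 9 onward). Below is a minimal Python sketch of the same split, for illustration only. The function name `split_train_valid` and the `train_lines` parameter are hypothetical and are not part of OpenFL or GaNDLF.

```python
from typing import List, Tuple


def split_train_valid(lines: List[str], train_lines: int = 8) -> Tuple[List[str], List[str]]:
    """Mirror `head -n 8` and `head -n 1` + `tail -n +9` on a list of CSV lines."""
    # `head -n 8`: the header plus the first seven data rows go to training.
    train = lines[:train_lines]
    # `head -n 1` followed by `tail -n +9`: the header plus lines 9 onward go to validation.
    valid = lines[:1] + lines[train_lines:]
    return train, valid


if __name__ == "__main__":
    rows = ["header"] + [f"row{i}" for i in range(1, 11)]
    train, valid = split_train_valid(rows)
    print(len(train), len(valid))
```

Both output files keep the header row, which is why the workflow writes `head -n 1` to `valid.csv` before appending the tail.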

4 changes: 3 additions & 1 deletion .gitignore
@@ -13,4 +13,6 @@ venv/*
*.jpg
*.crt
*.key
.eggs
.eggs
eggs/*
*.pyi
2 changes: 2 additions & 0 deletions openfl/federated/plan/plan.py
@@ -131,6 +131,8 @@ def parse(plan_config_path: Path, cols_config_path: Path = None,
extra={'markup': True})

gandlf_config = Plan.load(Path(gandlf_config_path))
# check for some defaults
gandlf_config['output_dir'] = gandlf_config.get('output_dir', '.')
plan.config['task_runner']['settings']['gandlf_config'] = gandlf_config

plan.authorized_cols = Plan.load(cols_config_path).get(
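The two lines added to `Plan.parse` give the GaNDLF config an `output_dir` of `'.'` when the key is missing and leave an existing value untouched. Here is a small sketch of that defaulting behaviour, assuming the config is a plain `dict`. The helper name `with_output_dir_default` is hypothetical, and unlike the change in this hunk it returns a copy rather than mutating the loaded config in place.

```python
from typing import Any, Dict


def with_output_dir_default(gandlf_config: Dict[str, Any], default: str = ".") -> Dict[str, Any]:
    """Return a copy of the config with `output_dir` filled in when it is missing."""
    patched = dict(gandlf_config)
    # Same effect as: config['output_dir'] = config.get('output_dir', '.')
    patched.setdefault("output_dir", default)
    return patched


if __name__ == "__main__":
    print(with_output_dir_default({"dimension": 2}))
    print(with_output_dir_default({"output_dir": "/tmp/run"}))
```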
8 changes: 8 additions & 0 deletions openfl/federated/task/runner_gandlf.py
@@ -19,6 +19,7 @@
from GANDLF.compute.generic import create_pytorch_objects
from GANDLF.compute.training_loop import train_network
from GANDLF.compute.forward_pass import validate_network
from GANDLF.parseConfig import parseConfig


class GaNDLFTaskRunner(TaskRunner):
@@ -37,6 +38,8 @@ def __init__(
"""
super().__init__(**kwargs)

assert bool(gandlf_config), "gandlf_config must be specified"

# allow pass-through of a gandlf config as a file or a dict

train_csv = self.data_loader.train_csv
@@ -45,6 +48,11 @@ def __init__(
if isinstance(gandlf_config, str) and os.path.exists(gandlf_config):
gandlf_config = yaml.safe_load(open(gandlf_config, "r"))

try:
gandlf_config = parseConfig(gandlf_config)
except Exception:
self.logger.info("WARNING: GANDLF.parseConfig did not work as expected.")

(
model,
optimizer,
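The runner now rejects an empty config up front, then tries to run it through `parseConfig` and falls back to the unparsed config if that raises. The sketch below shows the same validate-or-fall-back pattern in isolation, under two stated assumptions: the validator is passed in as a callable (standing in for `GANDLF.parseConfig.parseConfig`, which is not imported here), and failures are logged with `logger.warning` and an explicit `ValueError` rather than the `logger.info` call and `assert` used in the diff. The name `validate_or_fallback` is hypothetical.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)


def validate_or_fallback(
    config: Dict[str, Any],
    validator: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Validate a config, keeping the raw config if the validator raises."""
    # Reject a missing or empty config before doing anything else.
    if not config:
        raise ValueError("gandlf_config must be specified")
    try:
        return validator(config)
    except Exception:
        # Best-effort validation: record the failure and continue with the raw config.
        logger.warning("Config validation failed; continuing with the raw config.")
        return config
```

Swallowing the exception keeps the task runner usable with configs the validator cannot handle, at the cost of deferring any real configuration error to a later stage.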
20 changes: 12 additions & 8 deletions openfl/interface/plan.py
@@ -63,6 +63,8 @@ def initialize(context, plan_config, cols_config, data_config,
plan_config = Path(plan_config).absolute()
cols_config = Path(cols_config).absolute()
data_config = Path(data_config).absolute()
if gandlf_config is not None:
gandlf_config = Path(gandlf_config).absolute()

plan = Plan.parse(plan_config_path=plan_config,
cols_config_path=cols_config,
@@ -79,7 +81,6 @@ def initialize(context, plan_config, cols_config, data_config,
# exit('You must specify either a feature
# shape or authorized collaborator
# list in order for the script to determine the input layer shape')
print(plan.cols_data_paths)

collaborator_cname = list(plan.cols_data_paths)[0]

@@ -105,23 +106,26 @@ def initialize(context, plan_config, cols_config, data_config,

utils.dump_proto(model_proto=model_snap, fpath=init_state_path)

plan_origin = Plan.parse(plan_config, resolve=False).config
plan_origin = Plan.parse(plan_config_path=plan_config,
gandlf_config_path=gandlf_config,
resolve=False)

if (plan_origin['network']['settings']['agg_addr'] == 'auto'
if (plan_origin.config['network']['settings']['agg_addr'] == 'auto'
or aggregator_address):
plan_origin['network']['settings']['agg_addr'] = aggregator_address or getfqdn_env()
plan_origin.config['network']['settings']['agg_addr'] = aggregator_address or getfqdn_env()

logger.warn(f'Patching Aggregator Addr in Plan'
f" 🠆 {plan_origin['network']['settings']['agg_addr']}")
f" 🠆 {plan_origin.config['network']['settings']['agg_addr']}")

Plan.dump(plan_config, plan_origin)
Plan.dump(plan_config, plan_origin.config)

plan.config = plan_origin
if gandlf_config is not None:
Plan.dump(plan_config, plan_origin.config)

# Record that plan with this hash has been initialized
if 'plans' not in context.obj:
context.obj['plans'] = []
context.obj['plans'].append(f'{plan_config.stem}_{plan.hash[:8]}')
context.obj['plans'].append(f'{plan_config.stem}_{plan_origin.hash[:8]}')
logger.info(f"{context.obj['plans']}")
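In `initialize`, the optional `gandlf_config` argument is converted to an absolute path only when it is supplied, so `None` still reaches `Plan.parse` unchanged. A minimal sketch of that guard follows; the helper name `absolute_or_none` is hypothetical and does not exist in OpenFL.

```python
from pathlib import Path
from typing import Optional, Union


def absolute_or_none(path: Optional[Union[str, Path]]) -> Optional[Path]:
    """Return an absolute Path, or None when no path was supplied."""
    if path is None:
        return None
    # Path.absolute() anchors a relative path to the current working directory.
    return Path(path).absolute()


if __name__ == "__main__":
    print(absolute_or_none(None))
    print(absolute_or_none("config_segmentation.yaml"))
```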


7 changes: 6 additions & 1 deletion tests/github/test_gandlf.py
@@ -28,6 +28,7 @@ def exec(command, directory):
parser.add_argument('--rounds-to-train')
parser.add_argument('--col1-data-path', default='data/one')
parser.add_argument('--col2-data-path', default='data/two')
parser.add_argument('--gandlf_config', default=None)
parser.add_argument('--ujjwal', action='store_true')

origin_dir = Path().resolve()
@@ -49,7 +50,11 @@ def exec(command, directory):
if re.match(r'.*\.csv$', entry.name):
shutil.copy(entry.path, Path.cwd().resolve() / 'data' / col1)
# Initialize FL plan
check_call(['fx', 'plan', 'initialize', '-a', fqdn])
if args.gandlf_config:
check_call(['fx', 'plan', 'initialize', '-a', fqdn,
'--gandlf_config', str(args.gandlf_config)])
else:
check_call(['fx', 'plan', 'initialize', '-a', fqdn])
plan_path = Path('plan/plan.yaml')
try:
rounds_to_train = int(rounds_to_train)
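The test script now appends `--gandlf_config` to the `fx plan initialize` call only when the argument was given. The two branches can be expressed as one command list that is extended conditionally, as in the sketch below. The function `build_initialize_command` is a hypothetical helper, not part of `tests/github/test_gandlf.py`; the `-a` and `--gandlf_config` flags are the ones used in the diff.

```python
from typing import List, Optional


def build_initialize_command(fqdn: str, gandlf_config: Optional[str] = None) -> List[str]:
    """Build the `fx plan initialize` argument list, optionally with a GaNDLF config."""
    command = ["fx", "plan", "initialize", "-a", fqdn]
    if gandlf_config:
        # Only pass the flag when a config path was actually provided.
        command += ["--gandlf_config", str(gandlf_config)]
    return command


if __name__ == "__main__":
    print(build_initialize_command("localhost"))
    print(build_initialize_command("localhost", "config_segmentation.yaml"))
```

The resulting list can be handed to `subprocess.check_call` exactly as the script does today.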