forked from securefederatedai/openfl
Commit
Examples for use XPU with IPEX (securefederatedai#904)
* Added XPU example for Workflow Interface. This commit introduces an example of using an XPU with the Workflow Interface. The example demonstrates how to leverage the power of XPU to optimize the execution of complex workflows with OpenFL.
* Added XPU example for non-federated_case. This commit introduces an example of using an XPU with the non-federated_case.
* Removed non-federated_case_XPU file
* Updated Workflow_Interface_104_MNIST_XPU file
* Added TinyImagenet example for interactive api and XPU
* Update xpu definition and copyright
* Added link for download xpu driver

Signed-off-by: nammbash <[email protected]>
1 parent 9936202, commit f0bbd78
Showing 12 changed files with 1,513 additions and 0 deletions.
735 changes: 735 additions & 0 deletions
openfl-tutorials/experimental/Workflow_Interface_104_MNIST_XPU.ipynb
Large diffs are not rendered by default.
90 changes: 90 additions & 0 deletions
openfl-tutorials/interactive_api/PyTorch_TinyImageNet_XPU/README.md
@@ -0,0 +1,90 @@
# PyTorch_TinyImageNet_XPU

## **How to run this tutorial (without TLS and locally as a simulation):**
<br/>

Before we dive in, let's clarify some terms. XPU is Intel's umbrella term for its line of computing devices, which includes CPUs, GPUs, FPGAs, and other accelerators. This tutorial focuses on the Intel® Data Center GPU Max Series, a GPU that is part of Intel's XPU lineup.

### 0a. If you haven't done so already, create a virtual environment, install OpenFL, and upgrade pip:
- For help with this step, visit the "Install the Package" section of the [OpenFL installation instructions](https://openfl.readthedocs.io/en/latest/install.html#install-the-package).

<br/>

### 0b. Quick XPU Setup
Throughout this tutorial, "XPU" refers specifically to the Intel® Data Center GPU Max Series. With the Intel® Extension for PyTorch* package installed, selecting the device `'xpu'` targets this GPU.
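
As a rough illustration of what selecting the `'xpu'` device can look like in code, the sketch below picks `'xpu'` when the extension is importable and reports an available device, and falls back to `'cpu'` otherwise. The helper name `select_device` is hypothetical and is not part of this tutorial's notebook.

```python
def select_device() -> str:
    """Return 'xpu' when an Intel GPU is usable through the extension, else 'cpu'."""
    try:
        import torch
        # Importing the extension registers the 'xpu' backend with PyTorch.
        import intel_extension_for_pytorch  # noqa: F401
    except (ImportError, OSError):
        # PyTorch, the extension, or its runtime libraries are missing.
        return "cpu"
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"


device = select_device()
print(f"Selected device: {device}")
```

A model and its input batches would then be moved with the usual PyTorch calls, for example `model.to(device)`.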

For a successful setup, follow the steps in the [Installation Guide](https://intel.github.io/intel-extension-for-pytorch/xpu/2.1.10+xpu/tutorials/installation.html), which covers system requirements and the installation process for the Intel® Extension for PyTorch*. For a deeper understanding of features, APIs, and technical details, refer to the [Intel® Extension for PyTorch* Documentation](https://intel.github.io/intel-extension-for-pytorch/xpu/2.1.10+xpu/index.html).

Hardware Prerequisite: Intel® Data Center GPU Max Series.

This Jupyter Notebook has been tested and confirmed to work with the following versions:

- intel-extension-for-pytorch==2.0.120 (xpu)
- pytorch==2.0.1
- torchvision==0.15.2

These versions were obtained from official Intel® channels.

Additionally, the XPU driver version used in testing was:

- [XPU_Driver==803](https://dgpu-docs.intel.com/driver/installation.html)
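
To compare an environment against the tested versions above, a small check along these lines may help. It is only a sketch: the distribution names are assumptions (PyTorch is published on pip as `torch`), and the comparison ignores any local suffix after `+`, which XPU builds may carry.

```python
from importlib import metadata

# Versions listed in this README. The pip distribution names are assumptions.
TESTED_VERSIONS = {
    "intel-extension-for-pytorch": "2.0.120",
    "torch": "2.0.1",
    "torchvision": "0.15.2",
}


def compare_versions(expected: dict) -> dict:
    """Map each package name to (installed_or_None, expected, matches)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Compare only the public version; drop a local suffix such as "+xpu".
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (installed, wanted, matches)
    return report


for name, (installed, wanted, ok) in compare_versions(TESTED_VERSIONS).items():
    status = "ok" if ok else "differs"
    print(f"{name}: installed={installed} tested={wanted} [{status}]")
```

A "differs" result is not necessarily a problem; it only means the environment is not the exact combination that was tested.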

<br/>

### 1. Split terminal into 3 (1 terminal for the director, 1 for the envoy, and 1 for the experiment)

<br/>

### 2. Do the following in each terminal:
- Activate the virtual environment from step 0:

```sh
source venv/bin/activate
```
- If you are in a network environment with a proxy, ensure proxy environment variables are set in each of your terminals.
- Navigate to the tutorial:

```sh
cd openfl/openfl-tutorials/interactive_api/PyTorch_TinyImageNet_XPU
```

<br/>

### 3. In the first terminal, run the director:

```sh
cd director
./start_director.sh
```

<br/>

### 4. In the second terminal, install requirements and run the envoy:

```sh
cd envoy
pip install -r requirements.txt
./start_envoy.sh env_one envoy_config.yaml
```

Optional: Run a second envoy in an additional terminal:
- Ensure step 2 is complete for this terminal as well.
- Run the second envoy:
```sh
cd envoy
./start_envoy.sh env_two envoy_config.yaml
```

<br/>

### 5. Now that your director and envoy terminals are set up, run the Jupyter Notebook in your experiment terminal:

```sh
cd workspace
jupyter lab pytorch_tinyimagenet_XPU.ipynb
```
- A Jupyter Server URL will appear in your terminal. In your browser, proceed to that link. Once the webpage loads, click on the pytorch_tinyimagenet_XPU.ipynb file.
- To run the experiment, select the icon that looks like two triangles to "Restart Kernel and Run All Cells".
- You will notice activity in your terminals as the experiment runs. When the experiment is finished, the director terminal will display a message that the experiment has finished successfully.
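
If the notebook cannot reach the director, a quick connectivity check can narrow down the cause. The sketch below only tests whether something accepts TCP connections on the host and port from `director_config.yaml` (`localhost:50051`); the function name is hypothetical and the check says nothing about whether the listener is actually an OpenFL director.

```python
import socket


def director_is_listening(host: str = "localhost", port: int = 50051,
                          timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host could not be resolved.
        return False


if not director_is_listening():
    print("Nothing is listening on localhost:50051; start the director first (step 3).")
```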

5 changes: 5 additions & 0 deletions
openfl-tutorials/interactive_api/PyTorch_TinyImageNet_XPU/director/director_config.yaml
@@ -0,0 +1,5 @@
settings:
  listen_host: localhost
  listen_port: 50051
  sample_shape: ['64', '64', '3']
  target_shape: ['64', '64']
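
The `sample_shape` and `target_shape` values in this config are the same lists of strings that the shard descriptor's `sample_shape` and `target_shape` properties return in `envoy/tinyimagenet_shard_descriptor.py`. A minimal sketch of a consistency check follows, under the assumption (not confirmed by this commit) that the director compares the two as plain lists of strings; `shapes_match` is a hypothetical helper, not an OpenFL function.

```python
# Values copied from director_config.yaml in this commit.
director_settings = {
    "sample_shape": ["64", "64", "3"],
    "target_shape": ["64", "64"],
}


def shapes_match(settings: dict, sample_shape: list, target_shape: list) -> bool:
    """Return True when both shapes equal the director's configured shapes."""
    return (settings["sample_shape"] == list(sample_shape)
            and settings["target_shape"] == list(target_shape))


# Shapes as returned by TinyImageNetShardDescriptor in this commit.
print(shapes_match(director_settings, ["64", "64", "3"], ["64", "64"]))  # True
# A channels-first shape would not match the configured one.
print(shapes_match(director_settings, ["3", "64", "64"], ["64", "64"]))  # False
```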
4 changes: 4 additions & 0 deletions
openfl-tutorials/interactive_api/PyTorch_TinyImageNet_XPU/director/start_director.sh
@@ -0,0 +1,4 @@
#!/bin/bash
set -e

fx director start --disable-tls -c director_config.yaml
4 changes: 4 additions & 0 deletions
...fl-tutorials/interactive_api/PyTorch_TinyImageNet_XPU/director/start_director_with_tls.sh
@@ -0,0 +1,4 @@
#!/bin/bash
set -e
FQDN=$1
fx director start -c director_config.yaml -rc cert/root_ca.crt -pk cert/"${FQDN}".key -oc cert/"${FQDN}".crt
10 changes: 10 additions & 0 deletions
openfl-tutorials/interactive_api/PyTorch_TinyImageNet_XPU/envoy/envoy_config.yaml
@@ -0,0 +1,10 @@
params:
  cuda_devices: []

optional_plugin_components: {}

shard_descriptor:
  template: tinyimagenet_shard_descriptor.TinyImageNetShardDescriptor
  params:
    data_folder: tinyimagenet_data
    rank_worldsize: 1,1
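
The `rank_worldsize` string decides which slice of the image files an envoy serves. The sketch below reproduces the parsing and the `[rank - 1::worldsize]` slice used in `tinyimagenet_shard_descriptor.py` on a made-up file list; the helper names and file names are hypothetical.

```python
def parse_rank_worldsize(value: str) -> tuple:
    """Split a string such as '1,2' into integers (rank, worldsize)."""
    rank, worldsize = (int(num) for num in value.split(","))
    return rank, worldsize


def shard(items: list, rank: int, worldsize: int) -> list:
    """Keep every worldsize-th item of the sorted list, starting at rank - 1."""
    return sorted(items)[rank - 1::worldsize]


files = [f"img_{i}.JPEG" for i in range(6)]  # made-up file names

print(shard(files, *parse_rank_worldsize("1,1")))  # all six files
print(shard(files, *parse_rank_worldsize("1,2")))  # img_0, img_2, img_4
print(shard(files, *parse_rank_worldsize("2,2")))  # img_1, img_3, img_5
```

With the value `1,1` from this config, an envoy serves the whole dataset. Two envoys started with the same config would therefore hold identical data; giving them separate configs with `1,2` and `2,2` would split the files between them.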
1 change: 1 addition & 0 deletions
openfl-tutorials/interactive_api/PyTorch_TinyImageNet_XPU/envoy/requirements.txt
@@ -0,0 +1 @@
Pillow==10.0.1
4 changes: 4 additions & 0 deletions
openfl-tutorials/interactive_api/PyTorch_TinyImageNet_XPU/envoy/start_envoy.sh
@@ -0,0 +1,4 @@
#!/bin/bash
set -e
# Envoy name ($1) and config path ($2) default to the values used in the README.
fx envoy start -n "${1:-env_one}" --disable-tls --envoy-config-path "${2:-envoy_config.yaml}" -dh localhost -dp 50051
6 changes: 6 additions & 0 deletions
openfl-tutorials/interactive_api/PyTorch_TinyImageNet_XPU/envoy/start_envoy_with_tls.sh
@@ -0,0 +1,6 @@
#!/bin/bash
set -e
ENVOY_NAME=$1
DIRECTOR_FQDN=$2

fx envoy start -n "$ENVOY_NAME" --envoy-config-path envoy_config.yaml -dh "$DIRECTOR_FQDN" -dp 50051 -rc cert/root_ca.crt -pk cert/"$ENVOY_NAME".key -oc cert/"$ENVOY_NAME".crt
120 changes: 120 additions & 0 deletions
...tutorials/interactive_api/PyTorch_TinyImageNet_XPU/envoy/tinyimagenet_shard_descriptor.py
@@ -0,0 +1,120 @@
# Copyright (C) 2020-2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

"""TinyImageNet Shard Descriptor."""

import glob
import logging
import os
import shutil
from pathlib import Path
from typing import Tuple

from PIL import Image

from openfl.interface.interactive_api.shard_descriptor import ShardDataset
from openfl.interface.interactive_api.shard_descriptor import ShardDescriptor

logger = logging.getLogger(__name__)


class TinyImageNetDataset(ShardDataset):
    """TinyImageNet shard dataset class."""

    NUM_IMAGES_PER_CLASS = 500

    def __init__(self, data_folder: Path, data_type='train', rank=1, worldsize=1):
        """Initialize TinyImageNetDataset."""
        self.data_type = data_type
        self._common_data_folder = data_folder
        self._data_folder = os.path.join(data_folder, data_type)
        self.labels = {}  # fname - label number mapping
        self.image_paths = sorted(
            glob.iglob(
                os.path.join(self._data_folder, '**', '*.JPEG'),
                recursive=True
            )
        )[rank - 1::worldsize]  # every worldsize-th file, starting at index rank - 1
        wnids_path = os.path.join(self._common_data_folder, 'wnids.txt')
        with open(wnids_path, 'r', encoding='utf-8') as fp:
            self.label_texts = sorted([text.strip() for text in fp.readlines()])
        self.label_text_to_number = {text: i for i, text in enumerate(self.label_texts)}
        self.fill_labels()

    def __len__(self) -> int:
        """Return the len of the shard dataset."""
        return len(self.image_paths)

    def __getitem__(self, index: int) -> Tuple[Image.Image, int]:
        """Return an item by the index."""
        file_path = self.image_paths[index]
        label = self.labels[os.path.basename(file_path)]
        return self.read_image(file_path), label

    def read_image(self, path: Path) -> Image.Image:
        """Read the image."""
        img = Image.open(path)
        return img

    def fill_labels(self) -> None:
        """Fill labels."""
        if self.data_type == 'train':
            for label_text, i in self.label_text_to_number.items():
                for cnt in range(self.NUM_IMAGES_PER_CLASS):
                    self.labels[f'{label_text}_{cnt}.JPEG'] = i
        elif self.data_type == 'val':
            val_annotations_path = os.path.join(self._data_folder, 'val_annotations.txt')
            with open(val_annotations_path, 'r', encoding='utf-8') as fp:
                for line in fp.readlines():
                    terms = line.split('\t')
                    file_name, label_text = terms[0], terms[1]
                    self.labels[file_name] = self.label_text_to_number[label_text]


class TinyImageNetShardDescriptor(ShardDescriptor):
    """Shard descriptor class."""

    def __init__(
            self,
            data_folder: str = 'data',
            rank_worldsize: str = '1,1',
            **kwargs
    ):
        """Initialize TinyImageNetShardDescriptor."""
        self.common_data_folder = Path.cwd() / data_folder
        self.data_folder = Path.cwd() / data_folder / 'tiny-imagenet-200'
        self.download_data()
        self.rank, self.worldsize = tuple(int(num) for num in rank_worldsize.split(','))

    def download_data(self):
        """Download prepared shard dataset."""
        zip_file_path = self.common_data_folder / 'tiny-imagenet-200.zip'
        os.makedirs(self.common_data_folder, exist_ok=True)
        os.system(f'wget --no-clobber http://cs231n.stanford.edu/tiny-imagenet-200.zip'
                  f' -O {zip_file_path}')
        shutil.unpack_archive(str(zip_file_path), str(self.common_data_folder))

    def get_dataset(self, dataset_type):
        """Return a shard dataset by type."""
        return TinyImageNetDataset(
            data_folder=self.data_folder,
            data_type=dataset_type,
            rank=self.rank,
            worldsize=self.worldsize
        )

    @property
    def sample_shape(self):
        """Return the sample shape info."""
        return ['64', '64', '3']

    @property
    def target_shape(self):
        """Return the target shape info."""
        return ['64', '64']

    @property
    def dataset_description(self) -> str:
        """Return the shard dataset description."""
        return (f'TinyImageNetDataset dataset, shard number {self.rank}'
                f' out of {self.worldsize}')
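
To make the `'val'` branch of `fill_labels` easier to follow, here is a standalone sketch of the same mapping logic run on two made-up annotation lines. The WordNet IDs and file names are illustrative only, and `build_val_labels` is a hypothetical helper that mirrors the loop above rather than code from this commit.

```python
def build_val_labels(annotation_lines, label_text_to_number):
    """Map each validation file name to its numeric label."""
    labels = {}
    for line in annotation_lines:
        # Tab-separated columns: file name, label text, then other fields.
        terms = line.split("\t")
        file_name, label_text = terms[0], terms[1]
        labels[file_name] = label_text_to_number[label_text]
    return labels


# Label numbers follow the sorted order of the label texts, as in __init__.
wnids = ["n02124075", "n01443537"]  # illustrative subset of wnids.txt
label_text_to_number = {text: i for i, text in enumerate(sorted(wnids))}

lines = [
    "val_0.JPEG\tn02124075\t0\t32\t44\t62\n",   # made-up annotation line
    "val_1.JPEG\tn01443537\t52\t55\t57\t59\n",  # made-up annotation line
]
labels = build_val_labels(lines, label_text_to_number)
print(labels)  # {'val_0.JPEG': 1, 'val_1.JPEG': 0}
```

Only the first two columns are used, so trailing fields and the newline at the end of each line do not affect the result.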