# Examples for using XPU with IPEX (#904)

Merged · 7 commits · Feb 16, 2024
735 changes: 735 additions & 0 deletions openfl-tutorials/experimental/Workflow_Interface_104_MNIST_XPU.ipynb
_(Large diff not rendered.)_

---

**Tutorial README** (new file):
# PyTorch_TinyImageNet

## **How to run this tutorial (without TLS and locally as a simulation):**
<br/>

Before we dive in, let's clarify some terms. XPU is a term coined by Intel to describe their line of computing devices, which includes CPUs, GPUs, FPGAs, and other accelerators. In this tutorial, we will be focusing on the Intel® Data Center GPU Max Series model, a GPU that is part of Intel's XPU lineup.

### 0a. If you haven't done so already, create a virtual environment, install OpenFL, and upgrade pip:
- For help with this step, visit the "Install the Package" section of the [OpenFL installation instructions](https://openfl.readthedocs.io/en/latest/install.html#install-the-package).

<br/>

### 0b. Quick XPU Setup
In this tutorial, when we refer to XPU, we specifically mean the Intel® Data Center GPU Max Series. When using the Intel® Extension for PyTorch* package, selecting 'xpu' as the device targets this GPU.

For a successful setup, please follow the steps outlined in the [Installation Guide](https://intel.github.io/intel-extension-for-pytorch/xpu/2.1.10+xpu/tutorials/installation.html). This guide provides detailed information on system requirements and the installation process for the Intel® Extension for PyTorch. For a deeper understanding of features, APIs, and technical details, refer to the [Intel® Extension for PyTorch* Documentation](https://intel.github.io/intel-extension-for-pytorch/xpu/2.1.10+xpu/index.html).
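After installation, a quick sanity check can confirm whether the 'xpu' device is visible. This is a minimal sketch, not code from the notebook; the helper name `pick_device` is ours, and it deliberately falls back to CPU when IPEX or the GPU is absent:

```python
def pick_device() -> str:
    """Return 'xpu' when Intel Extension for PyTorch exposes a usable GPU, else 'cpu'."""
    try:
        import torch
        import intel_extension_for_pytorch  # noqa: F401  (importing it registers the 'xpu' device)
        if hasattr(torch, "xpu") and torch.xpu.is_available():
            return "xpu"
    except ImportError:
        pass
    return "cpu"

print(pick_device())
```

On a correctly configured Intel® Data Center GPU Max Series machine this should print `xpu`; anywhere else it prints `cpu`.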

Hardware Prerequisite: Intel® Data Center GPU Max Series.

This Jupyter Notebook has been tested and confirmed to work with the following versions:

- intel-extension-for-pytorch==2.0.120 (xpu)
- pytorch==2.0.1
- torchvision==0.15.2

These versions were obtained from official Intel® channels.

Additionally, the XPU driver version used in testing was:

- [XPU_Driver==803](https://dgpu-docs.intel.com/driver/installation.html)
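With the driver and packages in place, moving a model to the XPU follows the usual PyTorch device pattern, with `ipex.optimize` applied afterwards. The sketch below is an assumption-laden illustration under the versions listed above (the helper name `prepare_for_xpu` is ours, not from the tutorial):

```python
def prepare_for_xpu(model, optimizer):
    """Move model/optimizer to the Intel GPU and apply IPEX optimizations when available.

    Returns the inputs unchanged when IPEX (or an XPU device) is absent,
    so the same training code runs on CPU-only machines.
    """
    try:
        import torch
        import intel_extension_for_pytorch as ipex
        if hasattr(torch, "xpu") and torch.xpu.is_available():
            model = model.to("xpu")
            # ipex.optimize applies XPU-specific fusions and memory-layout optimizations
            model, optimizer = ipex.optimize(model, optimizer=optimizer)
    except ImportError:
        pass
    return model, optimizer
```

The training loop itself is unchanged apart from moving input batches to the same device as the model.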


<br/>

### 1. Split terminal into 3 (1 terminal for the director, 1 for the envoy, and 1 for the experiment)

<br/>

### 2. Do the following in each terminal:
- Activate the virtual environment from step 0:

```sh
source venv/bin/activate
```
- If you are in a network environment with a proxy, ensure proxy environment variables are set in each of your terminals.
- Navigate to the tutorial:

```sh
cd openfl/openfl-tutorials/interactive_api/PyTorch_TinyImageNet
```

<br/>

### 3. In the first terminal, run the director:

```sh
cd director
./start_director.sh
```

<br/>

### 4. In the second terminal, install requirements and run the envoy:

```sh
cd envoy
pip install -r requirements.txt
./start_envoy.sh env_one envoy_config.yaml
```

Optional: Run a second envoy in an additional terminal:
- Ensure step 2 is complete for this terminal as well.
- Run the second envoy:
```sh
cd envoy
./start_envoy.sh env_two envoy_config.yaml
```

<br/>

### 5. Now that your director and envoy terminals are set up, run the Jupyter Notebook in your experiment terminal:

```sh
cd workspace
jupyter lab pytorch_tinyimagenet_XPU.ipynb
```
- A Jupyter Server URL will appear in your terminal. Open that link in your browser; once JupyterLab loads, the pytorch_tinyimagenet_XPU.ipynb notebook opens (if it does not, click it in the file browser).
- To run the experiment, select the "Restart Kernel and Run All Cells" button (the icon that looks like two triangles).
- You will see activity in your terminals as the experiment runs; when it finishes, the director terminal reports that the experiment completed successfully.

---

**Director configuration** (`director_config.yaml`):

```yaml
settings:
  listen_host: localhost
  listen_port: 50051
  sample_shape: ['64', '64', '3']
  target_shape: ['64', '64']
```
**Director startup script, no-TLS** (`start_director.sh`):

```sh
#!/bin/bash
set -e

fx director start --disable-tls -c director_config.yaml
```
**Director startup script, TLS variant**:

```sh
#!/bin/bash
set -e
FQDN=$1
fx director start -c director_config.yaml -rc cert/root_ca.crt -pk cert/"${FQDN}".key -oc cert/"${FQDN}".crt
```
**Envoy configuration** (`envoy_config.yaml`):

```yaml
params:
  cuda_devices: []

optional_plugin_components: {}

shard_descriptor:
  template: tinyimagenet_shard_descriptor.TinyImageNetShardDescriptor
  params:
    data_folder: tinyimagenet_data
    rank_worldsize: 1,1
```
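The `rank_worldsize` pair controls how the shard descriptor splits the sorted image list across envoys: rank r of w keeps every w-th path starting at index r − 1. A small standalone illustration (the file names are made up):

```python
# Hypothetical sorted file list; the real descriptor globs *.JPEG paths.
paths = sorted(f"img_{i}.JPEG" for i in range(6))

rank, worldsize = 1, 2          # first envoy of two
shard = paths[rank - 1::worldsize]
print(shard)  # ['img_0.JPEG', 'img_2.JPEG', 'img_4.JPEG']
```

With `rank_worldsize: 2,2` the second envoy would get the complementary half, so the two shards partition the dataset without overlap.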
**Envoy requirements** (`requirements.txt`):

```
Pillow==10.0.1
```
**Envoy startup script, no-TLS** (`start_envoy.sh`):

```sh
#!/bin/bash
set -e

fx envoy start -n env_one --disable-tls --envoy-config-path envoy_config.yaml -dh localhost -dp 50051
```
**Envoy startup script, TLS variant**:

```sh
#!/bin/bash
set -e
ENVOY_NAME=$1
DIRECTOR_FQDN=$2

fx envoy start -n "$ENVOY_NAME" --envoy-config-path envoy_config.yaml -dh "$DIRECTOR_FQDN" -dp 50051 -rc cert/root_ca.crt -pk cert/"$ENVOY_NAME".key -oc cert/"$ENVOY_NAME".crt
```
**Shard descriptor** (`tinyimagenet_shard_descriptor.py`):

```python
# Copyright (C) 2020-2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

"""TinyImageNet Shard Descriptor."""

import glob
import logging
import os
import shutil
from pathlib import Path
from typing import Tuple

from PIL import Image

from openfl.interface.interactive_api.shard_descriptor import ShardDataset
from openfl.interface.interactive_api.shard_descriptor import ShardDescriptor

logger = logging.getLogger(__name__)


class TinyImageNetDataset(ShardDataset):
    """TinyImageNet shard dataset class."""

    NUM_IMAGES_PER_CLASS = 500

    def __init__(self, data_folder: Path, data_type='train', rank=1, worldsize=1):
        """Initialize TinyImageNetDataset."""
        self.data_type = data_type
        self._common_data_folder = data_folder
        self._data_folder = os.path.join(data_folder, data_type)
        self.labels = {}  # fname - label number mapping
        # Shard the sorted file list: rank r of w keeps every w-th image
        # starting at index r - 1.
        self.image_paths = sorted(
            glob.iglob(
                os.path.join(self._data_folder, '**', '*.JPEG'),
                recursive=True
            )
        )[rank - 1::worldsize]
        wnids_path = os.path.join(self._common_data_folder, 'wnids.txt')
        with open(wnids_path, 'r', encoding='utf-8') as fp:
            self.label_texts = sorted([text.strip() for text in fp.readlines()])
        self.label_text_to_number = {text: i for i, text in enumerate(self.label_texts)}
        self.fill_labels()

    def __len__(self) -> int:
        """Return the length of the shard dataset."""
        return len(self.image_paths)

    def __getitem__(self, index: int) -> Tuple[Image.Image, int]:
        """Return an item by the index."""
        file_path = self.image_paths[index]
        label = self.labels[os.path.basename(file_path)]
        return self.read_image(file_path), label

    def read_image(self, path: Path) -> Image.Image:
        """Read the image."""
        img = Image.open(path)
        return img

    def fill_labels(self) -> None:
        """Fill the fname -> label mapping for the train or val split."""
        if self.data_type == 'train':
            for label_text, i in self.label_text_to_number.items():
                for cnt in range(self.NUM_IMAGES_PER_CLASS):
                    self.labels[f'{label_text}_{cnt}.JPEG'] = i
        elif self.data_type == 'val':
            val_annotations_path = os.path.join(self._data_folder, 'val_annotations.txt')
            with open(val_annotations_path, 'r', encoding='utf-8') as fp:
                for line in fp.readlines():
                    terms = line.split('\t')
                    file_name, label_text = terms[0], terms[1]
                    self.labels[file_name] = self.label_text_to_number[label_text]


class TinyImageNetShardDescriptor(ShardDescriptor):
    """Shard descriptor class."""

    def __init__(
            self,
            data_folder: str = 'data',
            rank_worldsize: str = '1,1',
            **kwargs
    ):
        """Initialize TinyImageNetShardDescriptor."""
        self.common_data_folder = Path.cwd() / data_folder
        self.data_folder = Path.cwd() / data_folder / 'tiny-imagenet-200'
        self.download_data()
        self.rank, self.worldsize = tuple(int(num) for num in rank_worldsize.split(','))

    def download_data(self):
        """Download the prepared shard dataset (skips the download if the archive exists)."""
        zip_file_path = self.common_data_folder / 'tiny-imagenet-200.zip'
        os.makedirs(self.common_data_folder, exist_ok=True)
        os.system(f'wget --no-clobber http://cs231n.stanford.edu/tiny-imagenet-200.zip'
                  f' -O {zip_file_path}')
        shutil.unpack_archive(str(zip_file_path), str(self.common_data_folder))

    def get_dataset(self, dataset_type):
        """Return a shard dataset by type."""
        return TinyImageNetDataset(
            data_folder=self.data_folder,
            data_type=dataset_type,
            rank=self.rank,
            worldsize=self.worldsize
        )

    @property
    def sample_shape(self):
        """Return the sample shape info."""
        return ['64', '64', '3']

    @property
    def target_shape(self):
        """Return the target shape info."""
        return ['64', '64']

    @property
    def dataset_description(self) -> str:
        """Return the shard dataset description."""
        return (f'TinyImageNetDataset dataset, shard number {self.rank}'
                f' out of {self.worldsize}')
```