
Merge branch 'master' into patch-11
mrmhodak authored Dec 18, 2024
2 parents dbf1483 + 647f9f8 commit 233c1eb
Showing 29 changed files with 467 additions and 942 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -2,3 +2,4 @@ loadgen/build/
libmlperf_loadgen.a
__pycache__/
generated/
*.swp
39 changes: 39 additions & 0 deletions docs/benchmarks/graph/get-rgat-data.md
@@ -0,0 +1,39 @@
---
hide:
- toc
---

# Graph Neural Network using R-GAT

## Dataset

The benchmark implementation run command will automatically download the validation and calibration datasets and do the necessary preprocessing. If you want to download only the datasets, you can use the commands below.

=== "Full Dataset"
R-GAT validation run uses the IGBH dataset consisting of 547,306,935 nodes and 5,812,005,639 edges.

### Get Full Dataset
```
cm run script --tags=get,dataset,igbh,_full -j
```

=== "Debug Dataset"
R-GAT debug run uses the IGBH debug dataset (tiny).

### Get Debug Dataset
```
cm run script --tags=get,dataset,igbh,_debug -j
```

## Model
The benchmark implementation run command will automatically download the required model and do the necessary conversions. If you want to download only the official model, you can use the command below.

Get the Official MLPerf R-GAT Model

=== "PyTorch"

### PyTorch
```
cm run script --tags=get,ml-model,rgat -j
```

13 changes: 13 additions & 0 deletions docs/benchmarks/graph/rgat.md
@@ -0,0 +1,13 @@
---
hide:
- toc
---


# Graph Neural Network using R-GAT


=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

{{ mlperf_inference_implementation_readme (4, "rgat", "reference", devices = ["CPU", "CUDA"]) }}
19 changes: 15 additions & 4 deletions docs/index.md
@@ -1,7 +1,7 @@
# MLPerf Inference Benchmarks

## Overview
The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of MLPerf inference v4.0 round are listed below, categorized by tasks. Under each model you can find its details like the dataset used, reference accuracy, server latency constraints etc.
The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of MLPerf inference v5.0 round are listed below, categorized by tasks. Under each model you can find its details like the dataset used, reference accuracy, server latency constraints etc.

---

@@ -80,7 +80,7 @@ The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of MLPerf infe
- **Server Scenario Latency Constraint**: 130ms
- **Equal Issue mode**: False
- **High accuracy variant**: yes
- **Submission Category**: Datacenter, Edge
- **Submission Category**: Edge

#### [LLAMA2-70B](benchmarks/language/llama2-70b.md)
- **Dataset**: OpenORCA (GPT-4 split, max_seq_len=1024)
@@ -157,11 +157,22 @@ The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of MLPerf infe
- **High accuracy variant**: Yes
- **Submission Category**: Datacenter

## Graph Neural Networks
### [R-GAT](benchmarks/graph/rgat.md)
- **Dataset**: Illinois Graph Benchmark Heterogeneous validation dataset
- **Dataset Size**: 788,379
- **QSL Size**: 788,379
- **Number of Parameters**:
- **Reference Model Accuracy**: ACC = ?
- **Server Scenario Latency Constraint**: N/A
- **Equal Issue mode**: True
- **High accuracy variant**: No
- **Submission Category**: Datacenter
---

## Submission Categories
- **Datacenter Category**: All the current inference benchmarks are applicable to the datacenter category.
- **Edge Category**: All benchmarks except DLRMv2, LLAMA2-70B, and Mixtral-8x7B are applicable to the edge category.
- **Datacenter Category**: All benchmarks except bert are applicable to the datacenter category for inference v5.0.
- **Edge Category**: All benchmarks except DLRMv2, LLAMA2-70B, Mixtral-8x7B and R-GAT are applicable to the edge category for v5.0.

## High Accuracy Variants
- **Benchmarks**: `bert`, `llama2-70b`, `gpt-j`, `dlrm_v2`, and `3d-unet` have a normal accuracy variant as well as a high accuracy variant.
160 changes: 85 additions & 75 deletions docs/submission/index.md
@@ -13,13 +13,15 @@ hide:

Click [here](https://youtu.be/eI1Hoecc3ho) to view the recording of the workshop: Streamlining your MLPerf Inference results using CM.

=== "CM based benchmark"
Click [here](https://docs.google.com/presentation/d/1cmbpZUpVr78EIrhzyMBnnWnjJrD-mZ2vmSb-yETkTA8/edit?usp=sharing) to view the proposal slide for Common Automation for MLPerf Inference Submission Generation through CM.

=== "CM based results"
If you have followed the `cm run` commands under the individual model pages in the [benchmarks](../index.md) directory, all the valid results will be aggregated in the `cm cache` folder. The following command can be used to browse the structure of the inference results folder generated by CM.
### Get results folder structure
```bash
cm find cache --tags=get,mlperf,inference,results,dir | xargs tree
```
=== "Non CM based benchmark"
=== "Non CM based results"
If you have not followed the `cm run` commands under the individual model pages in the [benchmarks](../index.md) directory, please make sure that the result directory is structured in the following way.
```
└── System description ID(SUT Name)
@@ -35,18 +37,20 @@ Click [here](https://youtu.be/eI1Hoecc3ho) to view the recording of the workshop
| ├── mlperf_log_detail.txt
| ├── mlperf_log_accuracy.json
| └── accuracy.txt
└── Compliance_Test_ID
├── Performance
| └── run_x/#1 run for all scenarios
| ├── mlperf_log_summary.txt
| └── mlperf_log_detail.txt
├── Accuracy
| ├── baseline_accuracy.txt
| ├── compliance_accuracy.txt
| ├── mlperf_log_accuracy.json
| └── accuracy.txt
├── verify_performance.txt
└── verify_accuracy.txt #for TEST01 only
|── Compliance_Test_ID
| ├── Performance
| | └── run_x/#1 run for all scenarios
| | ├── mlperf_log_summary.txt
| | └── mlperf_log_detail.txt
| ├── Accuracy
| | ├── baseline_accuracy.txt
| | ├── compliance_accuracy.txt
| | ├── mlperf_log_accuracy.json
| | └── accuracy.txt
| ├── verify_performance.txt
| └── verify_accuracy.txt #for TEST01 only
|── user.conf
└── measurements.json
```

<details>
@@ -67,67 +71,69 @@ Once all the results across all the models are ready you can use the following c

## Generate actual submission tree

=== "Closed Edge"
### Closed Edge Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

=== "Closed Datacenter"
### Closed Datacenter Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```
=== "Open Edge"
### Open Edge Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```
=== "Open Datacenter"
### Open Datacenter Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```
=== "Docker run"
### Docker run
=== "Closed"
### Closed Submission
```bash
cm docker script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

=== "Open"
### Open Submission
```bash
cm docker script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

=== "Native run"
### Native run
=== "Closed"
### Closed Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

=== "Open"
### Open Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

* Use `--hw_name="My system name"` to give a meaningful system name. Examples can be seen [here](https://github.com/mlcommons/inference_results_v3.0/tree/main/open/cTuning/systems)

@@ -137,6 +143,10 @@ Once all the results across all the models are ready you can use the following c

* Use `--results_dir` option to specify the results folder for Non CM based benchmarks

* Use `--category` option to specify the category for which the submission is generated (datacenter/edge). By default, the category is taken from the `system_meta.json` file located in the SUT root directory.

* Use `--submission_base_dir` to specify the directory to which the outputs of the preprocess submission script and the final submission are written. There is no need to provide `--submission_dir` along with this option. For `docker run`, use `--submission_base_dir` instead of `--submission_dir`; see the combined example after this list.
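
For illustration, the options above can be combined into a single submission-generation command. The system name and directory paths below are hypothetical placeholders, not values from this repository:

```bash
# Hypothetical example: adjust system name, division, category, and paths for your SUT
cm run script --tags=generate,inference,submission \
   --clean \
   --preprocess_submission=yes \
   --run-checker \
   --submitter=MLCommons \
   --division=closed \
   --category=datacenter \
   --hw_name="My system name" \
   --results_dir=$HOME/mlperf_results \
   --submission_base_dir=$HOME/mlperf_submission \
   --tar=yes \
   --env.CM_TAR_OUTFILE=submission.tar.gz \
   --env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
   --quiet
```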

The above command should generate `submission.tar.gz` if there are no submission checker issues, and you can upload it to the [MLCommons Submission UI](https://submissions-ui.mlcommons.org/submission).

## Aggregate Results in GitHub
50 changes: 50 additions & 0 deletions docs/system_requirements.yml
@@ -0,0 +1,50 @@
# All memory requirements in GB
resnet:
reference:
fp32:
system_memory: 8
accelerator_memory: 4
disk_storage: 25
nvidia:
int8:
system_memory: 8
accelerator_memory: 4
disk_storage: 100
intel:
int8:
system_memory: 8
accelerator_memory: 0
disk_storage: 50
qualcomm:
int8:
system_memory: 8
accelerator_memory: 8
disk_storage: 50
retinanet:
reference:
fp32:
system_memory: 8
accelerator_memory: 8
disk_storage: 200
nvidia:
int8:
system_memory: 8
accelerator_memory: 8
disk_storage: 200
intel:
int8:
system_memory: 8
accelerator_memory: 0
disk_storage: 200
qualcomm:
int8:
system_memory: 8
accelerator_memory: 8
disk_storage: 200
rgat:
reference:
fp32:
system_memory: 768
accelerator_memory: 8
disk_storage: 2300

7 changes: 5 additions & 2 deletions graph/R-GAT/README.md
@@ -232,9 +232,12 @@ docker build . -f dockerfile.gpu -t rgat-gpu
```
Run docker container:
```bash
docker run --rm -it -v $(pwd):/root --gpus all rgat-gpu
docker run --rm -it -v $(pwd):/workspace/root --gpus all rgat-gpu
```
Run benchmark inside the docker container:
Go inside the `root` folder and run the benchmark inside the docker container:
```bash
cd root
python3 main.py --dataset igbh-dgl --dataset-path igbh/ --profile rgat-dgl-full --device gpu [--model-path <path_to_ckpt>] [--in-memory] [--dtype <fp16 or fp32>] [--scenario <SingleStream, MultiStream, Server or Offline>]
```

**NOTE:** For official submissions, this benchmark is required to run in equal issue mode. Please make sure that the flag `rgat.*.sample_concatenate_permutation` is set to one in the [mlperf.conf](../../loadgen/mlperf.conf) file when loadgen is built.
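
For reference, the corresponding entry in `mlperf.conf` would be expected to look like the line below (a minimal sketch; the actual file contains many other benchmark settings):

```
# enable equal issue mode for official R-GAT submissions
rgat.*.sample_concatenate_permutation = 1
```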
6 changes: 2 additions & 4 deletions graph/R-GAT/dockerfile.gpu
@@ -26,6 +26,8 @@ RUN apt install -y --no-install-recommends rsync
# Upgrade pip
RUN python3 -m pip install --upgrade pip

RUN pip install torch-geometric torch-scatter torch-sparse -f https://pytorch-geometric.com/whl/torch-2.1.0+cu121.html
RUN pip install dgl -f https://data.dgl.ai/wheels/torch-2.1/cu121/repo.html

COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt
@@ -35,10 +37,6 @@ RUN cd /tmp && \
pip install pybind11 && \
CFLAGS="-std=c++14" python3 setup.py install

RUN export TORCH_VERSION=$(python -c "import torch; print(torch.__version__)")
RUN pip install torch-geometric torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-${TORCH_VERSION}.html
RUN pip install dgl -f https://data.dgl.ai/wheels/torch-2.1/cu121/repo.html

# Clean up
RUN rm -rf mlperf \
rm requirements.txt
