Updates to the PAPI GitHub CI to add individual component tests, tests
for the counter analysis toolkit, and tests for the PAPI framework. This
adds a new script, run_tests_shlib.sh, which runs only if
--with-shlib-tools was passed to configure.
Treece Burgess committed Nov 26, 2024
1 parent 09fee01 commit addea10
Showing 24 changed files with 759 additions and 91 deletions.
43 changes: 43 additions & 0 deletions .github/workflows/README.md
@@ -0,0 +1,43 @@
Currently, the GitHub CI runs in three cases:
1. On a per-component basis: if a component's codebase is updated, CI tests run only for that component. For example, updating `cupti_profiler.c` in `src/components/cuda` runs CI tests only for the `cuda` component. This includes updates to subdirectories of a component's directory (e.g. `src/components/cuda/tests`).
2. On a change to the Counter Analysis Toolkit, i.e. in the `src/counter_analysis_toolkit` directory or any of its subdirectories.
3. On a change to the PAPI framework, i.e. in the `src/` directory (excluding individual components and the Counter Analysis Toolkit). In this case the full test suite runs.
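
The three cases above can be sketched as a small shell function that maps a changed path to the CI that would run. This is only an illustration of the path rules, not part of the PAPI CI: `classify_path` is a hypothetical helper, and every example path other than `cupti_profiler.c` is an assumption made for the demonstration.

```shell
#!/bin/bash
# Hypothetical helper (not part of PAPI): report which CI case a changed
# path falls under, following the three rules listed above.
classify_path() {
    case "$1" in
        src/components/*/*)
            # the component name is the path segment after src/components/
            local rest=${1#src/components/}
            echo "component:${rest%%/*}"
            ;;
        src/counter_analysis_toolkit/*)
            echo "cat"
            ;;
        src/*)
            # anything else under src/ counts as the PAPI framework
            echo "framework"
            ;;
        *)
            echo "none"
            ;;
    esac
}

classify_path src/components/cuda/cupti_profiler.c
classify_path src/components/cuda/tests/Makefile    # illustrative path
classify_path src/counter_analysis_toolkit/main.c   # illustrative path
classify_path src/papi.c                            # illustrative path
```

The order of the `case` patterns matters: the component and Counter Analysis Toolkit patterns must come before the general `src/*` pattern, which mirrors the exclusion described in case 3.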


# Per Component Basis
Each per-component test has a workflow file named `componentName_component.yml`. For
example, the `cuda` component uses `cuda_component.yml`. Therefore,
if a new component is added to PAPI, you will need to create a `.yml` file that follows this naming scheme.

The `.yml` file must also define the associated workflow. Below is a skeleton that can
be used as a starting point. Remember to replace the component-specific fields with those for your component.

```yml
name: cuda # replace cuda with your component name

on:
  pull_request:
    paths:
      - 'src/components/cuda/**' # replace the cuda path with your component

jobs:
  component_tests:
    strategy:
      matrix:
        component: [cuda] # replace cuda with your component name
        debug: [yes, no]
        shlib: [with, without]
      fail-fast: false
    runs-on: [self-hosted, gpu_nvidia] # replace gpu_nvidia with the runner label for your hardware
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - name: cuda component tests # replace cuda with your component name
        run: .github/workflows/ci_per_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}}
```
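
As a quick check of the naming scheme, the following sketch derives the workflow file name and the `paths` filter from a component name. The two function names are hypothetical and exist only for this illustration.

```shell
#!/bin/bash
# Hypothetical helpers (not part of PAPI): derive the per-component
# workflow file and the pull_request path filter from a component name.
workflow_file_for() {
    echo ".github/workflows/${1}_component.yml"
}

path_filter_for() {
    echo "src/components/${1}/**"
}

workflow_file_for rocm
path_filter_for rocm
```

The runner label in `runs-on` does not follow from the component name, so it still has to be chosen by hand for the hardware the component needs.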

# Counter Analysis Toolkit
The Counter Analysis Toolkit (CAT) CI uses the `cat.yml` and `ci_cat.sh` files. Any updates to the CI for CAT need to be done in these two files.
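
For orientation, the `debug` and `shlib` matrix in `cat.yml` expands to four invocations of `ci_cat.sh`. The sketch below only prints those command lines as a dry run; the function name `cat_ci_invocations` is hypothetical.

```shell
#!/bin/bash
# Hypothetical dry run (not part of PAPI): print one ci_cat.sh command
# line per combination of the debug and shlib matrix values in cat.yml.
cat_ci_invocations() {
    local debug shlib
    for debug in yes no; do
        for shlib in with without; do
            echo ".github/workflows/ci_cat.sh $debug $shlib"
        done
    done
}

cat_ci_invocations
```

The third positional argument of `ci_cat.sh` (the compiler module) is not passed by the workflow, so the script falls back to its default of `gcc@11`.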

# PAPI Framework
The PAPI framework CI uses the `papi_framework.yml` and `ci_papi_framework.sh` files. Any updates to the CI for the PAPI framework need to be done in these two files.
24 changes: 24 additions & 0 deletions .github/workflows/appio_component.yml
@@ -0,0 +1,24 @@
name: appio

on:
  pull_request:
    # run CI only if appio directory or appio sub-directories receive updates
    paths:
      - 'src/components/appio/**'
  # allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  component_tests:
    strategy:
      matrix:
        component: [appio]
        debug: [yes, no]
        shlib: [with, without]
      fail-fast: false
    runs-on: [self-hosted, cpu_intel]
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - name: appio component tests
        run: .github/workflows/ci_per_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}}
20 changes: 20 additions & 0 deletions .github/workflows/cat.yml
@@ -0,0 +1,20 @@
name: counter analysis toolkit

on:
  pull_request:
    # run CI for updates to counter analysis toolkit
    paths:
      - 'src/counter_analysis_toolkit/**'
jobs:
  component_tests:
    strategy:
      matrix:
        debug: [yes, no]
        shlib: [with, without]
      fail-fast: false
    runs-on: [self-hosted, cpu_intel]
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - name: counter analysis toolkit tests
        run: .github/workflows/ci_cat.sh ${{matrix.debug}} ${{matrix.shlib}}
53 changes: 53 additions & 0 deletions .github/workflows/ci_cat.sh
@@ -0,0 +1,53 @@
#!/bin/bash -e

DEBUG=$1
SHLIB=$2
COMPILER=$3

[ -z "$COMPILER" ] && COMPILER=gcc@11

source /etc/profile
set +x
set -e
trap 'echo "# $BASH_COMMAND"' DEBUG
shopt -s expand_aliases

module load $COMPILER

cd src

# configuring and installing PAPI
if [ "$SHLIB" = "without" ]; then
    ./configure --prefix=$PWD/cat-ci --with-debug=$DEBUG --enable-warnings
else
    ./configure --prefix=$PWD/cat-ci --with-debug=$DEBUG --enable-warnings --with-shlib-tools
fi
make -j4 && make install

# set environment variables for CAT
export PAPI_DIR=$PWD/cat-ci
export LD_LIBRARY_PATH=${PAPI_DIR}/lib:$LD_LIBRARY_PATH
cd counter_analysis_toolkit

# check detected architecture was correct
# note that the make here will finish
DETECTED_ARCH=$(make | grep -o 'ARCH.*' | head -n 1)
if [ "$DETECTED_ARCH" != "ARCH=X86" ]; then
    echo "Failed to detect appropriate architecture."
    exit 1
fi

# create output directory
mkdir OUT_DIR
# create real and fake events to monitor
echo "BR_INST_RETIRED 0" > event_list.txt
echo "PAPI_CI_FAKE_EVENT 0" >> event_list.txt
./cat_collect -in event_list.txt -out OUT_DIR -branch

cd OUT_DIR
# we expect this file to exist and have values
[ -f BR_INST_RETIRED.branch ]
[ -s BR_INST_RETIRED.branch ]
# we expect this file to exist but be empty
[ -f PAPI_CI_FAKE_EVENT.branch ]
[ ! -s PAPI_CI_FAKE_EVENT.branch ]
71 changes: 71 additions & 0 deletions .github/workflows/ci_papi_framework.sh
@@ -0,0 +1,71 @@
#!/bin/bash -e

COMPONENTS=$1
DEBUG=$2
SHLIB=$3
COMPILER=$4

[ -z "$COMPILER" ] && COMPILER=gcc@11

source /etc/profile
set +x
set -e
trap 'echo "# $BASH_COMMAND"' DEBUG
shopt -s expand_aliases

module load $COMPILER

cd src

# set necessary environment variables for lmsensors
case "$COMPONENTS" in
    *"lmsensors"*)
        wget https://github.com/groeck/lm-sensors/archive/V3-4-0.tar.gz
        tar -zxf V3-4-0.tar.gz
        cd lm-sensors-3-4-0
        make install PREFIX=../lm ETCDIR=../lm/etc
        cd ..
        export PAPI_LMSENSORS_ROOT=lm
        export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PAPI_LMSENSORS_ROOT/lib
        ;;
esac

# set necessary environment variables for rocm and rocm_smi
case "$COMPONENTS" in
    *"rocm rocm_smi"*)
        export PAPI_ROCM_ROOT=`ls -d /opt/rocm-*`
        export PAPI_ROCMSMI_ROOT=$PAPI_ROCM_ROOT/rocm_smi
        ;;
esac

# set necessary environment variables for cuda and nvml
case "$COMPONENTS" in
    *"cuda nvml"*)
        module load cuda
        export PAPI_CUDA_ROOT=$ICL_CUDA_ROOT
        export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PAPI_CUDA_ROOT/extras/CUPTI/lib64
        ;;
esac

# test linking with or without --with-shlib-tools
if [ "$SHLIB" = "without" ]; then
    ./configure --with-debug=$DEBUG --enable-warnings --with-components="$COMPONENTS"
else
    ./configure --with-debug=$DEBUG --enable-warnings --with-components="$COMPONENTS" --with-shlib-tools
fi

make -j4

# run PAPI utilities
utils/papi_component_avail

# without '--with-shlib-tools' in ./configure
if [ "$SHLIB" = "without" ]; then
    echo "Running full test suite for active components"
    ./run_tests.sh TESTS_QUIET --disable-cuda-events=yes
# with '--with-shlib-tools' in ./configure
else
    echo "Running single component test for active components"
    ./run_tests_shlib.sh TESTS_QUIET
fi
42 changes: 22 additions & 20 deletions .github/workflows/ci.sh → .github/workflows/ci_per_component.sh
@@ -2,7 +2,8 @@
 
 COMPONENT=$1
 DEBUG=$2
-COMPILER=$3
+SHLIB=$3
+COMPILER=$4
 
 [ -z "$COMPILER" ] && COMPILER=gcc@11
 
@@ -16,47 +17,48 @@ module load $COMPILER
 
 cd src
 
+# lmsensors environment variables
 if [ "$COMPONENT" = "lmsensors" ]; then
     wget https://github.com/groeck/lm-sensors/archive/V3-4-0.tar.gz
     tar -zxf V3-4-0.tar.gz
     cd lm-sensors-3-4-0
     make install PREFIX=../lm ETCDIR=../lm/etc
     cd ..
     export PAPI_LMSENSORS_ROOT=lm
-    export PAPI_LMSENSORS_INC=$PAPI_LMSENSORS_ROOT/include/sensors
-    export PAPI_LMSENSORS_LIB=$PAPI_LMSENSORS_ROOT/lib64
     export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PAPI_LMSENSORS_ROOT/lib
 fi
 
-if [ "$COMPONENT" = "cuda" ] || [ "$COMPONENT" = "nvml" ]; then
-    module load cuda
-    export PAPI_CUDA_ROOT=$ICL_CUDA_ROOT
-    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PAPI_CUDA_ROOT/extras/CUPTI/lib64
-fi
-
+# rocm and rocm_smi environment variables
 if [ "$COMPONENT" = "rocm" ] || [ "$COMPONENT" = "rocm_smi" ]; then
     export PAPI_ROCM_ROOT=`ls -d /opt/rocm-*`
     export PAPI_ROCMSMI_ROOT=$PAPI_ROCM_ROOT/rocm_smi
 fi
 
-if [ "$COMPONENT" = "infiniband_umad" ]; then
-    export PAPI_INFINIBAND_UMAD_ROOT=/usr
+# set necessary environment variables for cuda and nvml
+if [ "$COMPONENT" = "cuda" ] || [ "$COMPONENT" = "nvml" ]; then
+    module load cuda
+    export PAPI_CUDA_ROOT=$ICL_CUDA_ROOT
+    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PAPI_CUDA_ROOT/extras/CUPTI/lib64
 fi
 
-if [ "$COMPONENT" = "perf_event" ]; then
-    ./configure --with-debug=$DEBUG --enable-warnings
+# test linking with or without --with-shlib-tools
+if [ "$SHLIB" = "without" ]; then
+    ./configure --with-debug=$DEBUG --enable-warnings --with-components="$COMPONENT"
 else
-    ./configure --with-debug=$DEBUG --enable-warnings --with-components=$COMPONENT
+    ./configure --with-debug=$DEBUG --enable-warnings --with-components="$COMPONENT" --with-shlib-tools
 fi
 
 make -j4
 
+# run PAPI utilities
 utils/papi_component_avail
 
-# Make sure the $COMPONENT is active
-utils/papi_component_avail | grep -A1000 'Active components' | grep -q "Name: $COMPONENT "
-
-if [ "$COMPONENT" != "cuda" ]; then
-    echo Testing
-    ./run_tests.sh
+# without '--with-shlib-tools' in ./configure
+if [ "$SHLIB" = "without" ]; then
+    echo "Running full test suite for active components"
+    ./run_tests.sh TESTS_QUIET --disable-cuda-events=yes
+# with '--with-shlib-tools' in ./configure
+else
+    echo "Running single component test for active components"
+    ./run_tests_shlib.sh TESTS_QUIET
 fi
24 changes: 24 additions & 0 deletions .github/workflows/coretemp_component.yml
@@ -0,0 +1,24 @@
name: coretemp

on:
  pull_request:
    # run CI only if coretemp directory or coretemp sub-directories receive updates
    paths:
      - 'src/components/coretemp/**'
  # allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  component_tests:
    strategy:
      matrix:
        component: [coretemp]
        debug: [yes, no]
        shlib: [with, without]
      fail-fast: false
    runs-on: [self-hosted, cpu_intel]
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - name: coretemp component tests
        run: .github/workflows/ci_per_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}}
24 changes: 24 additions & 0 deletions .github/workflows/cuda_component.yml
@@ -0,0 +1,24 @@
name: cuda

on:
  pull_request:
    # run CI only if cuda directory or cuda sub-directories receive updates
    paths:
      - 'src/components/cuda/**'
  # allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  component_tests:
    strategy:
      matrix:
        component: [cuda]
        debug: [yes, no]
        shlib: [with, without]
      fail-fast: false
    runs-on: [self-hosted, gpu_nvidia]
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - name: cuda component tests
        run: .github/workflows/ci_per_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}}
24 changes: 24 additions & 0 deletions .github/workflows/example_component.yml
@@ -0,0 +1,24 @@
name: example

on:
  pull_request:
    # run CI only if example directory or example sub-directories receive updates
    paths:
      - 'src/components/example/**'
  # allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  component_tests:
    strategy:
      matrix:
        component: [example]
        debug: [yes, no]
        shlib: [with, without]
      fail-fast: false
    runs-on: [self-hosted, cpu_intel]
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - name: example component tests
        run: .github/workflows/ci_per_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}}
24 changes: 24 additions & 0 deletions .github/workflows/intel_gpu_component.yml
@@ -0,0 +1,24 @@
name: intel_gpu

on:
  pull_request:
    # run CI only if intel_gpu directory or intel_gpu sub-directories receive updates
    paths:
      - 'src/components/intel_gpu/**'
  # allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

jobs:
  component_tests:
    strategy:
      matrix:
        component: [intel_gpu]
        debug: [yes, no]
        shlib: [with, without]
      fail-fast: false
    runs-on: [self-hosted, gpu_intel]
    timeout-minutes: 60
    steps:
      - uses: actions/checkout@v4
      - name: intel_gpu component tests
        run: .github/workflows/ci_per_component.sh ${{matrix.component}} ${{matrix.debug}} ${{matrix.shlib}}