[BLAS] Add BLAS ARM performance libraries backend. #629

Merged 7 commits on Jan 31, 2025

8 changes: 7 additions & 1 deletion CMakeLists.txt
@@ -41,6 +41,11 @@ if(ENABLE_MKLCPU_BACKEND)
option(ENABLE_MKLCPU_THREAD_TBB "Enable the use of Intel TBB with the oneMath CPU backend" ON)
endif()

option(ENABLE_ARMPL_BACKEND "Enable the ArmPl backend for BLAS/LAPACK interface" OFF)
if(ENABLE_ARMPL_BACKEND)
option(ENABLE_ARMPL_OMP "Enable OpenMP for the ArmPl backend" ON)
endif()

# blas
option(ENABLE_CUBLAS_BACKEND "Enable the cuBLAS backend for the BLAS interface" OFF)
option(ENABLE_ROCBLAS_BACKEND "Enable the rocBLAS backend for the BLAS interface" OFF)
@@ -88,7 +93,8 @@ if(ENABLE_MKLCPU_BACKEND
OR ENABLE_CUBLAS_BACKEND
OR ENABLE_ROCBLAS_BACKEND
OR ENABLE_NETLIB_BACKEND
- OR ENABLE_GENERIC_BLAS_BACKEND)
+ OR ENABLE_GENERIC_BLAS_BACKEND
+ OR ENABLE_ARMPL_BACKEND)
list(APPEND DOMAINS_LIST "blas")
endif()
if(ENABLE_MKLCPU_BACKEND
28 changes: 23 additions & 5 deletions README.md
@@ -18,8 +18,8 @@ oneMath is part of the [UXL Foundation](http://www.uxlfoundation.org).
</thead>
<tbody>
<tr>
- <td rowspan=14 align="center">oneMath</td>
- <td rowspan=14 align="center">oneMath selector</td>
+ <td rowspan=15 align="center">oneMath</td>
+ <td rowspan=15 align="center">oneMath selector</td>
<td align="center"><a href="https://software.intel.com/en-us/oneapi/onemkl">Intel(R) oneAPI Math Kernel Library (oneMKL)</a></td>
<td align="center">x86 CPU, Intel GPU</td>
</tr>
@@ -45,8 +45,12 @@ oneMath is part of the [UXL Foundation](http://www.uxlfoundation.org).
<td align="center">NVIDIA GPU</td>
</tr>
<tr>
- <td align="center"><a href="https://ww.netlib.org"> NETLIB LAPACK</a> </td>
- <td align="center">x86 CPU</td>
+ <td align="center"><a href="https://www.netlib.org"> NETLIB LAPACK</a> </td>
+ <td align="center">x86 and aarch64 CPU</td>
+ </tr>
+ <tr>
+ <td align="center"><a href="https://www.arm.com/products/development-tools/server-and-hpc/allinea-studio/performance-libraries">Arm Performance Libraries</a></td>
+ <td align="center">aarch64 CPU</td>
</tr>
<tr>
<td align="center"><a href="https://rocblas.readthedocs.io/en/rocm-4.5.2/"> AMD rocBLAS</a></td>
@@ -180,7 +184,7 @@ Supported compilers include:
</thead>
<tbody>
<tr>
- <td rowspan=10 align="center">BLAS</td>
+ <td rowspan=12 align="center">BLAS</td>
<td rowspan=3 align="center">x86 CPU</td>
<td align="center">Intel(R) oneMKL</td>
<td align="center">Intel DPC++</br>AdaptiveCpp</td>
@@ -196,6 +200,17 @@ Supported compilers include:
<td align="center">Intel DPC++</br>Open DPC++</td>
<td align="center">Dynamic, Static</td>
</tr>
<tr>
<td rowspan=2 align="center">aarch64 CPU</td>
<td align="center">Arm Performance Libraries</td>
<td align="center">Open DPC++</br>AdaptiveCpp</td>
<td align="center">Dynamic, Static</td>
</tr>
<tr>
<td align="center">NETLIB LAPACK</td>
<td align="center">Open DPC++</br>AdaptiveCpp</td>
<td align="center">Dynamic, Static</td>
</tr>
<tr>
<td rowspan=2 align="center">Intel GPU</td>
<td align="center">Intel(R) oneMKL</td>
@@ -432,6 +447,7 @@ Supported compilers include:
- Intel Atom(R) Processors
- Intel(R) Core(TM) Processor Family
- Intel(R) Xeon(R) Processor Family
- Arm Neoverse Processor Family (tested on N1, V1, V2)
- Accelerators
- Intel(R) Arc(TM) A-Series Graphics
- Intel(R) Data Center GPU Max Series
@@ -447,6 +463,7 @@ Supported compilers include:
Backend | Supported Operating System
:--- | :---
x86 CPU | Red Hat Enterprise Linux* 9 (RHEL* 9)
aarch64 CPU| Red Hat Enterprise Linux* 9 (RHEL* 9)
Intel GPU | Ubuntu 24.04 LTS
NVIDIA GPU | Ubuntu 22.04 LTS

@@ -551,6 +568,7 @@ Product | Supported Version | License
[NETLIB LAPACK](https://www.netlib.org/) | [5d4180c](https://github.com/Reference-LAPACK/lapack/commit/5d4180cf8288ae6ad9a771d18793d15bd0c5643c) | [BSD like license](http://www.netlib.org/lapack/LICENSE.txt)
[Generic SYCL BLAS](https://github.com/uxlfoundation/generic-sycl-components/tree/main/onemath/sycl/blas) | 0.1 | [Apache License v2.0](https://github.com/uxlfoundation/generic-sycl-components/blob/main/LICENSE)
[portFFT](https://github.com/codeplaysoftware/portFFT) | 0.1 | [Apache License v2.0](https://github.com/codeplaysoftware/portFFT/blob/main/LICENSE)
[Arm Performance Libraries](https://developer.arm.com/downloads/-/arm-performance-libraries) | 22.0.1 or higher | [EULA](https://developer.arm.com/downloads/-/arm-performance-libraries/eula)

---

81 changes: 81 additions & 0 deletions cmake/FindARMPL.cmake
@@ -0,0 +1,81 @@
#===============================================================================
# Copyright 2025 SiPearl
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions
# and limitations under the License.
#
#
# SPDX-License-Identifier: Apache-2.0
#===============================================================================

include_guard()
set(ARMPL_SEQ armpl_int64)
set(ARMPL_OMP armpl_int64_mp)

include(FindPackageHandleStandardArgs)
if(ENABLE_ARMPL_OMP)
message(STATUS "Use OpenMP version of ArmPL")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fopenmp")
find_library(ARMPL_LIBRARY NAMES ${ARMPL_OMP} HINTS ${ARMPL_ROOT} $ENV{ARMPLROOT} PATH_SUFFIXES lib lib64)
else()
message(STATUS "Use Sequential version of ArmPL")
find_library(ARMPL_LIBRARY NAMES ${ARMPL_SEQ} HINTS ${ARMPL_ROOT} $ENV{ARMPLROOT} PATH_SUFFIXES lib lib64)
endif()
find_package_handle_standard_args(ARMPL REQUIRED_VARS ARMPL_LIBRARY)

get_filename_component(ARMPL_LIB_DIR ${ARMPL_LIBRARY} DIRECTORY)
find_path(ARMPL_INCLUDE armpl.h HINTS ${ARMPL_ROOT} $ENV{ARMPLROOT} PATH_SUFFIXES include)
# CMake replaces the full path to libarmpl with -larmpl (because the library has no SONAME), and -Wl,-rpath alone is not a sufficient hint for some compilers,
# so we also pass -L to the compiler; otherwise LIBRARY_PATH would have to be set manually when building.
if(UNIX)
list(APPEND ARMPL_LINK "-Wl,-rpath,${ARMPL_LIB_DIR} -L${ARMPL_LIB_DIR}")
endif()
list(APPEND ARMPL_LINK ${ARMPL_LIBRARY})
message(${ARMPL_LINK})
find_package_handle_standard_args(ARMPL REQUIRED_VARS ARMPL_INCLUDE ARMPL_LINK)

# Check ARMPL version (only versions higher or equal to 22.0.1 are supported)
set(ARMPL_MAJOR 22)
set(ARMPL_MINOR 0)
set(ARMPL_BUILD 1)
file(WRITE ${CMAKE_BINARY_DIR}/armplversion.cpp
"#include <stdio.h>\n"
"\n"
"#include \"armpl.h\"\n"
"\n"
"int main(void) {\n"
" int major, minor, build;\n"
" char *tag;\n"
" armplversion(&major, &minor, &build, (const char **)&tag);\n"
" if (major > MAJOR) {\n"
" return 0;\n"
" }\n"
" else if (major == MAJOR && minor > MINOR) {\n"
" return 0;\n"
" }\n"
" else if (major == MAJOR && minor == MINOR && build >= BUILD) {\n"
" return 0;\n"
" }\n"
" printf(\"You are using version %d.%d.%d\\n\", major, minor, build);\n"
" return 1;\n"
"}\n")
execute_process(COMMAND ${CMAKE_CXX_COMPILER} armplversion.cpp -O0 -I${ARMPL_INCLUDE} -Wl,-rpath,${ARMPL_LIB_DIR} -larmpl -DMAJOR=${ARMPL_MAJOR} -DMINOR=${ARMPL_MINOR} -DBUILD=${ARMPL_BUILD} WORKING_DIRECTORY ${CMAKE_BINARY_DIR})
execute_process(COMMAND ./a.out WORKING_DIRECTORY ${CMAKE_BINARY_DIR} RESULT_VARIABLE ARMPL_CHECK_VERSION)
execute_process(COMMAND rm ./a.out WORKING_DIRECTORY ${CMAKE_BINARY_DIR})
execute_process(COMMAND rm armplversion.cpp WORKING_DIRECTORY ${CMAKE_BINARY_DIR})
if(ARMPL_CHECK_VERSION)
message(FATAL_ERROR "ARMPL backend does not support ARMPL version prior to version ${ARMPL_MAJOR}.${ARMPL_MINOR}.${ARMPL_BUILD}")
endif()

add_library(ONEMKL::ARMPL::ARMPL UNKNOWN IMPORTED)
set_target_properties(ONEMKL::ARMPL::ARMPL PROPERTIES IMPORTED_LOCATION ${ARMPL_LIBRARY})
3 changes: 3 additions & 0 deletions docs/building_the_project_with_adaptivecpp.rst
@@ -87,6 +87,9 @@ The most important supported build options are:
* - ENABLE_NETLIB_BACKEND
- True, False
- False
* - ENABLE_ARMPL_BACKEND
- True, False
- False
* - ENABLE_ROCBLAS_BACKEND
- True, False
- False
20 changes: 20 additions & 0 deletions docs/building_the_project_with_dpcpp.rst
@@ -109,6 +109,12 @@ The most important supported build options are:
* - ENABLE_NETLIB_BACKEND
- True, False
- False
* - ENABLE_ARMPL_BACKEND
- True, False
- False
* - ENABLE_ARMPL_OMP
- True, False
- True
* - ENABLE_ROCBLAS_BACKEND
- True, False
- False
@@ -314,6 +320,20 @@ specified. See `DPC++ User Manual
<https://intel.github.io/llvm-docs/UsersManual.html>`_ for more information on
``-fsycl-targets``.

.. _build_for_armpl_dpcpp:

Building for Arm Performance Libraries
--------------------------------------

The `Arm Performance Libraries <https://developer.arm.com/Tools%20and%20Software/Arm%20Performance%20Libraries>`_
backend is enabled on aarch64 platforms by setting ``-DENABLE_ARMPL_BACKEND=True``.

By default, the build looks for ArmPL under the path given by the ``ARMPLROOT``
environment variable. To use a different ArmPL installation, pass
``-DARMPL_ROOT=<armpl_install_prefix>``.

The OpenMP flavor of the ArmPL libraries is used by default; set
``-DENABLE_ARMPL_OMP=False`` to use the sequential flavor instead.

.. _build_additional_options_dpcpp:

Additional Build Options
3 changes: 3 additions & 0 deletions include/oneapi/math/blas.hpp
@@ -49,6 +49,9 @@
#ifdef ONEMATH_ENABLE_NETLIB_BACKEND
#include "oneapi/math/blas/detail/netlib/blas_ct.hpp"
#endif
#ifdef ONEMATH_ENABLE_ARMPL_BACKEND
#include "oneapi/math/blas/detail/armpl/blas_ct.hpp"
#endif
#ifdef ONEMATH_ENABLE_GENERIC_BLAS_BACKEND
#include "oneapi/math/blas/detail/generic/blas_ct.hpp"
#endif
59 changes: 59 additions & 0 deletions include/oneapi/math/blas/detail/armpl/blas_ct.hpp
@@ -0,0 +1,59 @@
/*******************************************************************************
* Copyright 2025 SiPearl
* Copyright 2020-2021 Intel Corporation
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions
* and limitations under the License.
*
*
* SPDX-License-Identifier: Apache-2.0
*******************************************************************************/

#ifndef _DETAIL_ARMPL_BLAS_CT_HPP__
#define _DETAIL_ARMPL_BLAS_CT_HPP__

#if __has_include(<sycl/sycl.hpp>)
#include <sycl/sycl.hpp>
#else
#include <CL/sycl.hpp>
#endif
#include <complex>
#include <cstdint>

#include "oneapi/math/types.hpp"
#include "oneapi/math/detail/backend_selector.hpp"

#include "oneapi/math/blas/detail/blas_ct_backends.hpp"
#include "oneapi/math/blas/detail/armpl/onemath_blas_armpl.hpp"

namespace oneapi {
namespace math {
namespace blas {
namespace column_major {

#define MAJOR column_major
#include "blas_ct.hxx"
#undef MAJOR

} //namespace column_major
namespace row_major {

#define MAJOR row_major
#include "blas_ct.hxx"
#undef MAJOR

} //namespace row_major
} //namespace blas
} //namespace math
} //namespace oneapi

#endif //_DETAIL_ARMPL_BLAS_CT_HPP__