From 48caf2303d4b953d74b3caba0f8fc4ad94c9cdd8 Mon Sep 17 00:00:00 2001
From: Ralf Gommers
Date: Wed, 18 Dec 2024 08:53:29 +0100
Subject: [PATCH 1/5] Fix build warning about discarding volatile qualifier in memory.c

The warning was:
```
[4339/5327] Building C object driver/others/CMakeFiles/driver_others.dir/memory.c.o
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c: In function 'blas_shutdown':
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c:3257:10: warning: passing argument 1 of 'free' discards 'volatile' qualifier from pointer target type [-Wdiscarded-qualifiers]
 3257 |     free(newmemory);
      |          ^~~~~~~~~
In file included from /home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/common.h:83,
                 from /home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c:74:
/home/rgommers/code/pixi-dev-scipystack/openblas/.pixi/envs/default/x86_64-conda-linux-gnu/sysroot/usr/include/stdlib.h:482:25: note: expected 'void *' but argument is of type 'volatile struct newmemstruct *'
  482 | extern void free (void *__ptr) __THROW;
      |                   ~~~~~~^~~~~
```

The use of `volatile` for `newmemstruct` seems on purpose, and there are more
such constructs in this file. The warning appeared after gh-4451 and is correct.
The `free` prototype doesn't expect a volatile pointer, hence this change
adds a cast to silence the warning.
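
For readers unfamiliar with this warning, a minimal sketch of the same pattern
follows. The names `demo_slot`, `demo_table`, `demo_init`, `demo_shutdown` and
`demo_is_allocated` are hypothetical and made up for this illustration; they
are not taken from `memory.c`. Only the cast on the `free` call mirrors what
this patch does.

```c
#include <stdlib.h>

/* Hypothetical stand-in for `struct newmemstruct`; not the OpenBLAS type. */
struct demo_slot {
  void *addr;
  int lock;
};

/* Pointer to volatile-qualified slots, matching the type in the warning. */
static volatile struct demo_slot *demo_table = NULL;

/* Allocate `n` zeroed slots; returns 0 on success, -1 on failure. */
int demo_init(size_t n) {
  demo_table = (volatile struct demo_slot *)calloc(n, sizeof(struct demo_slot));
  return demo_table ? 0 : -1;
}

/* Release the table. Writing `free(demo_table)` here makes GCC emit
   -Wdiscarded-qualifiers; the explicit cast drops `volatile` on purpose,
   which is harmless because the memory is not accessed afterwards. */
void demo_shutdown(void) {
  free((void *)demo_table);
  demo_table = NULL;
}

/* Small helper for callers: 1 while the table is allocated, else 0. */
int demo_is_allocated(void) {
  return demo_table != NULL;
}
```

The cast only changes the type seen at the call site; the pointer value that
reaches `free` is the same one `calloc` returned.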
---
 driver/others/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/driver/others/memory.c b/driver/others/memory.c
index 6343a3785e..276e39ece0 100644
--- a/driver/others/memory.c
+++ b/driver/others/memory.c
@@ -3254,7 +3254,7 @@ void blas_shutdown(void){
 #endif
       newmemory[pos].lock = 0;
     }
-    free(newmemory);
+    free((void*)newmemory);
     newmemory = NULL;
     memory_overflowed = 0;
   }

From 765ad8bcd2bee89d8393a2200a6777989a8d4db0 Mon Sep 17 00:00:00 2001
From: Ralf Gommers
Date: Wed, 18 Dec 2024 09:39:07 +0100
Subject: [PATCH 2/5] Fix guard around `alloc_hugetlb`, fixes compile warning

The warning was:
```
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c: At top level:
/home/rgommers/code/pixi-dev-scipystack/openblas/OpenBLAS/driver/others/memory.c:2565:14: warning: 'alloc_hugetlb' defined but not used [-Wunused-function]
 2565 | static void *alloc_hugetlb(void *address){
      |              ^~~~~~~~~~~~~
```

The added define is the same as is already present in the TLS part of
`memory.c`. This follows up on gh-4681.
---
 driver/others/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/driver/others/memory.c b/driver/others/memory.c
index 276e39ece0..c53e798bc1 100644
--- a/driver/others/memory.c
+++ b/driver/others/memory.c
@@ -2538,7 +2538,7 @@ static void *alloc_shm(void *address){
 }
 #endif
 
-#if defined OS_LINUX || defined OS_AIX || defined __sun__ || defined OS_WINDOWS
+#if ((defined ALLOC_HUGETLB) && (defined OS_LINUX || defined OS_AIX || defined __sun__ || defined OS_WINDOWS))
 
 static void alloc_hugetlb_free(struct release_t *release){
 

From e460512685b3004c3796b4620c1454150cf61ef0 Mon Sep 17 00:00:00 2001
From: Martin Kroeker
Date: Thu, 19 Dec 2024 00:50:37 +0100
Subject: [PATCH 3/5] Update WoA build instructions from rewording in issue #5001

---
 docs/install.md | 66 +++++++++++++++++++++++++++++++------------------
 1 file changed, 42 insertions(+), 24 deletions(-)

diff --git a/docs/install.md b/docs/install.md
index b842d3355b..7155263056 100644
--- a/docs/install.md
+++ b/docs/install.md
@@ -437,36 +437,54 @@ To then use the built OpenBLAS shared library in Visual Studio:
 [Qt Creator](http://qt.nokia.com/products/developer-tools/).
 
 
-#### Windows on Arm
-
-While OpenBLAS can be built with Microsoft VisualStudio (Community Edition or commercial), you would only be able to build for the GENERIC target
-that does not use optimized assembly kernels, also the stock VisualStudio lacks the Fortran compiler necessary for building the LAPACK component.
-It is therefore highly recommended to download the free LLVM compiler suite and use it to compile OpenBLAS outside of VisualStudio.
-
-The following tools needs to be installed to build for Windows on Arm (WoA):
-
-- LLVM for Windows on Arm.
- Find the latest LLVM build for WoA from [LLVM release page](https://releases.llvm.org/) - you want the package whose name ends in "woa64.exe".
- (This may not always be present in the very latest point release, as building and uploading the binaries takes time.)
- E.g: a LLVM 19 build for WoA64 can be found [here](https://github.com/llvm/llvm-project/releases/download/llvmorg-19.1.2/LLVM-19.1.2-woa64.exe).
- Run the LLVM installer and ensure that LLVM is added to the environment variable PATH. (If you do not want to add it to the PATH, you will need to specify
- both C and Fortran compiler to Make or CMake with their full path later on)
+## Windows on Arm
+
+A fully functional native OpenBLAS for WoA that can be built as both a static and dynamic library using LLVM toolchain and Visual Studio 2022. Before starting to build, make sure that you have installed Visual Studio 2022 on your ARM device, including the "Desktop Development with C++" component (that contains the cmake tool).
+(Note that you can use the free "Visual Studio 2022 Community Edition" for this task. In principle it would be possible to build with VisualStudio alone, but using
+the LLVM toolchain enables native compilation of the Fortran sources of LAPACK and of all the optimized assembly files, which VisualStudio cannot handle on its own)
+
+ 1. Clone OpenBLAS to your local machine and checkout to latest release of OpenBLAS (unless you want to build the latest development snapshot - here we are using the 0.3.28 release as the example, of course this exact version may be outdated by the time you read this)
+
+ ```cmd
+ git clone https://github.com/OpenMathLib/OpenBLAS.git
+ cd OpenBLAS
+ git checkout v0.3.28
+ ```
+
+ 2. Install Latest LLVM toolchain for WoA:
+
+ Download the Latest LLVM toolchain for WoA from [the Release page](https://github.com/llvm/llvm-project/releases/tag/llvmorg-19.1.5). At the time of writing, this is version 19.1.5 - be sure to select the latest release for which you can find a precompiled package whose name ends in "-woa64.exe" (precompiled packages
+ usually lag a week or two behind their corresponding source release).
+ Make sure to enable the option “Add LLVM to the system PATH for all the users”
+ Note: Make sure that the path of LLVM toolchain is at the top of Environment Variables section to avoid conflicts between the set of compilers available in the system path
+
+ 3. Launch the Native Command Prompt for Windows ARM64:
+
+ From the start menu search for “ARM64 Native Tools Command Prompt for Visual Studio 2022”
+ Alternatively open command prompt, run the following command to activate the environment:
+ "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvarsarm64.bat"
+
+ Navigate to the OpenBLAS source code directory and start building OpenBLAS by invoking Ninja:
+
+ ```cmd
+ cd OpenBLAS
+ mkdir build
+ cd build
+
+ cmake .. -G Ninja -DCMAKE_BUILD_TYPE=Release -DTARGET=ARMV8 -DBINARY=64 -DCMAKE_C_COMPILER=clang-cl -DCMAKE_C_COMPILER=arm64-pc-windows-msvc -DCMAKE_ASM_COMPILER=arm64-pc-windows-msvc -DCMAKE_Fortran_COMPILER=flang-new
 
-The following steps describe how to build the static library for OpenBLAS with either Make or CMake:
+ ninja -j16
+ ```
+
+Note: You might want to include additional options in the cmake command here. For example, the default configuration only generates a static.lib version of the library. If you prefer a DLL, you can add -DBUILD_SHARED_LIBS=ON.
 
-1. Build OpenBLAS with Make:
+Note that it is also possible to use the same setup to build OpenBLAS with Make, if you prepare Makefiles over the CMake build for some reason:
 
- ```bash
+ ```cmd
  $ make CC=clang-cl FC=flang-new AR="llvm-ar" TARGET=ARMV8 ARCH=arm64 RANLIB="llvm-ranlib" MAKE=make
  ```
 
-2. Build OpenBLAS with CMake
- ```bash
- $ mkdir build
- $ cd build
- $ cmake .. -G Ninja -DCMAKE_C_COMPILER=clang-cl -DCMAKE_Fortran_COMPILER=flang-new -DTARGET=ARMV8 -DCMAKE_BUILD_TYPE=Release
- $ cmake --build .
- ```
+
 
 
 #### Generating an import library

From a93d3db34a7e2fe70bbeb3a43c20323d85802a74 Mon Sep 17 00:00:00 2001
From: Martin Kroeker
Date: Thu, 19 Dec 2024 00:53:10 +0100
Subject: [PATCH 4/5] fix formatting of WoA section

---
 docs/install.md | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/docs/install.md b/docs/install.md
index 7155263056..5bb88cccd8 100644
--- a/docs/install.md
+++ b/docs/install.md
@@ -437,13 +437,13 @@ To then use the built OpenBLAS shared library in Visual Studio:
 [Qt Creator](http://qt.nokia.com/products/developer-tools/).
 
 
-## Windows on Arm
+### Windows on Arm
 
 A fully functional native OpenBLAS for WoA that can be built as both a static and dynamic library using LLVM toolchain and Visual Studio 2022. Before starting to build, make sure that you have installed Visual Studio 2022 on your ARM device, including the "Desktop Development with C++" component (that contains the cmake tool).
 (Note that you can use the free "Visual Studio 2022 Community Edition" for this task. In principle it would be possible to build with VisualStudio alone, but using
 the LLVM toolchain enables native compilation of the Fortran sources of LAPACK and of all the optimized assembly files, which VisualStudio cannot handle on its own)
 
- 1. Clone OpenBLAS to your local machine and checkout to latest release of OpenBLAS (unless you want to build the latest development snapshot - here we are using the 0.3.28 release as the example, of course this exact version may be outdated by the time you read this)
+1. Clone OpenBLAS to your local machine and checkout to latest release of OpenBLAS (unless you want to build the latest development snapshot - here we are using the 0.3.28 release as the example, of course this exact version may be outdated by the time you read this)
 
  ```cmd
  git clone https://github.com/OpenMathLib/OpenBLAS.git
@@ -451,20 +451,20 @@ the LLVM toolchain enables native compilation of the Fortran sources of LAPACK a
  git checkout v0.3.28
  ```
 
- 2. Install Latest LLVM toolchain for WoA:
+2. Install Latest LLVM toolchain for WoA:
 
- Download the Latest LLVM toolchain for WoA from [the Release page](https://github.com/llvm/llvm-project/releases/tag/llvmorg-19.1.5). At the time of writing, this is version 19.1.5 - be sure to select the latest release for which you can find a precompiled package whose name ends in "-woa64.exe" (precompiled packages
- usually lag a week or two behind their corresponding source release).
- Make sure to enable the option “Add LLVM to the system PATH for all the users”
- Note: Make sure that the path of LLVM toolchain is at the top of Environment Variables section to avoid conflicts between the set of compilers available in the system path
+Download the Latest LLVM toolchain for WoA from [the Release page](https://github.com/llvm/llvm-project/releases/tag/llvmorg-19.1.5). At the time of writing, this is version 19.1.5 - be sure to select the latest release for which you can find a precompiled package whose name ends in "-woa64.exe" (precompiled packages
+usually lag a week or two behind their corresponding source release).
+Make sure to enable the option “Add LLVM to the system PATH for all the users”
+Note: Make sure that the path of LLVM toolchain is at the top of Environment Variables section to avoid conflicts between the set of compilers available in the system path
 
- 3. Launch the Native Command Prompt for Windows ARM64:
+3. Launch the Native Command Prompt for Windows ARM64:
 
- From the start menu search for “ARM64 Native Tools Command Prompt for Visual Studio 2022”
- Alternatively open command prompt, run the following command to activate the environment:
- "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvarsarm64.bat"
+From the start menu search for “ARM64 Native Tools Command Prompt for Visual Studio 2022”
+Alternatively open command prompt, run the following command to activate the environment:
+"C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvarsarm64.bat"
 
- Navigate to the OpenBLAS source code directory and start building OpenBLAS by invoking Ninja:
+Navigate to the OpenBLAS source code directory and start building OpenBLAS by invoking Ninja:
 
  ```cmd
  cd OpenBLAS

From 1c4401ebf16dd4ff3c0de8a7517bea9724a63a45 Mon Sep 17 00:00:00 2001
From: Martin Kroeker
Date: Thu, 19 Dec 2024 14:32:24 -0800
Subject: [PATCH 5/5] Add target-specific options to enable SVE with the NVIDIA compiler

---
 Makefile.arm64 | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/Makefile.arm64 b/Makefile.arm64
index fccc0d0d0f..2909a83e0e 100644
--- a/Makefile.arm64
+++ b/Makefile.arm64
@@ -351,4 +351,31 @@ endif
 endif
 
 
+else
+# NVIDIA HPC options necessary to enable SVE in the compiler
+ifeq ($(CORE), THUNDERX2T99)
+CCOMMON_OPT += -tp=thunderx2t99
+FCOMMON_OPT += -tp=thunderx2t99
+endif
+ifeq ($(CORE), NEOVERSEN1)
+CCOMMON_OPT += -tp=neoverse-n1
+FCOMMON_OPT += -tp=neoverse-n1
+endif
+ifeq ($(CORE), NEOVERSEV1)
+CCOMMON_OPT += -tp=neoverse-v1
+FCOMMON_OPT += -tp=neoverse-v1
+endif
+ifeq ($(CORE), NEOVERSEV2)
+CCOMMON_OPT += -tp=neoverse-v2
+FCOMMON_OPT += -tp=neoverse-v2
+endif
+ifeq ($(CORE), ARMV8SVE)
+CCOMMON_OPT += -tp=neoverse-v2
+FCOMMON_OPT += -tp=neoverse-v2
+endif
+ifeq ($(CORE), ARMV9SVE)
+CCOMMON_OPT += -tp=neoverse-v2
+FCOMMON_OPT += -tp=neoverse-v2
+endif
+
 endif