BLIS #97

Merged
merged 5 commits into from
Jan 5, 2025

Conversation

mgates3
Collaborator

@mgates3 mgates3 commented Dec 22, 2024

  • Remove AMD's ACML, which has been discontinued for some years (since 2018 or earlier).
  • Add BLIS and libFLAME, the BLAS and LAPACK libraries that AMD AOCL uses.
  • Tested on ICL's cluster.

See also icl-utk-edu/lapackpp#73, though LAPACK++ inherits most of its settings (LIBS) from BLAS++.
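
As a rough illustration of the change described above, the sketch below maps a `blas=` choice to the link line to try. It is not the actual BLAS++ `configure.py`; the function name `blas_candidates` and the table layout are hypothetical. The only link line taken from this thread is `-lflame -lblis`, which the test logs show configure selecting for `blas=blis`.

```python
# Hypothetical sketch; this is NOT the real BLAS++ configure logic.
def blas_candidates(blas):
    """Return the candidate link lines to try for a given `blas=` choice."""
    table = {
        # Link line reported in the logs in this thread for blas=blis.
        "blis": [["-lflame", "-lblis"]],
    }
    removed = {"acml"}  # ACML support is dropped by this PR.

    key = blas.lower()
    if key in removed:
        raise ValueError(f"blas={blas} is no longer supported")
    if key not in table:
        raise ValueError(f"unknown blas choice: {blas}")
    return table[key]


if __name__ == "__main__":
    for libs in blas_candidates("blis"):
        print(" ".join(libs))
```

Listing libFLAME before BLIS matches the order in the logs; with static linking, a library generally has to appear before the libraries it depends on.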

@mgates3
Collaborator Author

mgates3 commented Dec 25, 2024

Tested on the ICL cluster with make. Compile and test output is abbreviated (...).

sh methane blaspp> module purge
sh methane blaspp> module load amd-aocl
Loading amd-aocl/4.2/gcc-11.4.1-5b4uh7
  Loading requirement: glibc/2.34/gcc-11.4.1-dtaqd4 gcc-runtime/11.4.1/gcc-11.4.1-no4eyi bzip2/1.0.8/gcc-11.4.1-n7kvt3 libmd/1.0.4/gcc-11.4.1-crgsfc
    libbsd/0.12.2/gcc-11.4.1-njr4pw expat/2.6.2/gcc-11.4.1-65qru7 ncurses/6.5/gcc-11.4.1-soxhau readline/8.2/gcc-11.4.1-uebdnx gdbm/1.23/gcc-11.4.1-6v2jvd
    libiconv/1.17/gcc-11.4.1-ynxtpm xz/5.4.6/gcc-11.4.1-vuvxyo zlib-ng/2.2.1/gcc-11.4.1-wwmm37 libxml2/2.10.3/gcc-11.4.1-ewigrh pigz/2.8/gcc-11.4.1-ywpabn
    zstd/1.5.6/gcc-11.4.1-qnrc4s tar/1.34/gcc-11.4.1-bvbsgm gettext/0.22.5/gcc-11.4.1-x7sso7 libffi/3.4.6/gcc-11.4.1-jtnh3b libxcrypt/4.4.35/gcc-11.4.1-wvymff
    openssl/3.3.1/gcc-11.4.1-4iiodh sqlite/3.46.0/gcc-11.4.1-kefle4 util-linux-uuid/2.40.2/gcc-11.4.1-csmedo python/3.11.9/gcc-11.4.1-hxxchl amdblis/4.2/gcc-11.4.1-rii3hi
    cuda/11.8.0/gcc-11.4.1-q6jpi3 libpciaccess/0.17/gcc-11.4.1-k5nwmf hwloc/2.9.3/gcc-11.4.1-w2wap2 libevent/2.1.12/gcc-11.4.1-eitmnr numactl/2.0.14/gcc-11.4.1-3tmyb5
    slurm/22.05.9/gcc-11.4.1-5mlc5h check/0.15.2/gcc-11.4.1-ctf5dj gdrcopy/2.4.1/gcc-11.4.1-lx4vdv libnl/3.3.0/gcc-11.4.1-a5eaep rdma-core/52.0/gcc-11.4.1-5kuqln
    ucx/1.15.0/gcc-11.4.1-sr34kx openmpi/5.0.2/gcc-11.4.1-tzydwy berkeley-db/18.1.40/gcc-11.4.1-fxuu75 perl/5.38.2/gcc-11.4.1-iucw2s texinfo/7.1/gcc-11.4.1-6hpk6i
    amdfftw/4.2/gcc-11.4.1-irus4b aocl-utils/4.2/gcc-11.4.1-dakldb openblas/0.3.27/gcc-11.4.1-jfkp5p amdlibflame/4.2/gcc-11.4.1-ad6kxv gmp/6.3.0/gcc-11.4.1-vhhzgh
    mpfr/4.2.1/gcc-11.4.1-55ywxq amdlibm/4.2/gcc-11.4.1-i7u5td amdscalapack/4.2/gcc-11.4.1-sq6j7d aocl-compression/4.2/gcc-11.4.1-wpqg2b aocl-crypto/4.2/gcc-11.4.1-h33cx7
    aocl-libmem/4.2/gcc-11.4.1-wykjyz aocl-sparse/4.2/gcc-11.4.1-ss7elx

sh methane blaspp> module load gcc/12.1
Loading gcc/12.1.0/gcc-11.4.1-bv7hah
  Loading requirement: mpc/1.3.1/gcc-11.4.1-6rsmso

sh methane blaspp> export CPATH=$ICL_AMDBLIS_ROOT/include/blis
sh methane blaspp> export CPATH+=:$ICL_AMDLIBFLAME_ROOT/include

sh methane blaspp> export LIBRARY_PATH=$ICL_AMDBLIS_ROOT/lib
sh methane blaspp> export LIBRARY_PATH+=:$ICL_AMDLIBFLAME_ROOT/lib

sh methane blaspp> export LD_LIBRARY_PATH=$ICL_AMDBLIS_ROOT/lib
sh methane blaspp> export LD_LIBRARY_PATH+=:$ICL_AMDLIBFLAME_ROOT/lib

sh methane blaspp> echo $CPATH | perl -pe 's/:/:\n/g' 
/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/amdblis-4.2-rii3hiwjdsvilh4mnnsgecncmivbycd4/include/blis:
/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/amdlibflame-4.2-ad6kxvpbqj3tpe3kibsdvpul6okoscg3/include

sh methane blaspp> echo $LIBRARY_PATH | perl -pe 's/:/:\n/g' 
/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/amdblis-4.2-rii3hiwjdsvilh4mnnsgecncmivbycd4/lib:
/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/amdlibflame-4.2-ad6kxvpbqj3tpe3kibsdvpul6okoscg3/lib

sh methane blaspp> echo $LD_LIBRARY_PATH | perl -pe 's/:/:\n/g' 
/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/amdblis-4.2-rii3hiwjdsvilh4mnnsgecncmivbycd4/lib:
/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/amdlibflame-4.2-ad6kxvpbqj3tpe3kibsdvpul6okoscg3/lib

sh methane blaspp> make blas=blis
python3 configure.py
--------------------------------------------------------------------------------
                              Welcome to BLAS++.

By default, configure will automatically choose the first valid value it finds
for each option. You can set it to interactive to find all possible values and
give you a choice:
    make config interactive=1

If you have multiple compilers, we suggest specifying your desired compiler by
setting CXX, as the automated search may prefer a different compiler.

For options, see the `INSTALL.md` file.

Configure assumes environment variables CPATH, LIBRARY_PATH, and LD_LIBRARY_PATH
are set so your compiler can find libraries. See INSTALL.md for more details.
--------------------------------------------------------------------------------
opening log file config/log.txt


C++ compiler
Trying $CXX = g++
g++ yes (g++)

C++ compiler flags
-std=c++17                                                               yes 
-O2                                                                      yes 
-MMD                                                                     yes 
-Wall                                                                    yes 
-Wno-unused-local-typedefs                                               yes 
-Wno-unused-function                                                     yes 

OpenMP support
-fopenmp                                                                 yes 

BLAS library
Also detects Fortran name mangling and BLAS integer size.
BLAS (ddot) in:
BLIS
    -lflame -lblis
    -DBLAS_FORTRAN_ADD_                                                  yes 

BLAS (sdot) returns float as float (standard)                            yes 
BLAS (zdotc) returns complex (GNU gfortran convention)                   yes 
BLIS version                                                             yes (AOCL-BLIS 4.2.0 Build 20241025)

CBLAS library
CBLAS (cblas_ddot) in BLAS library                                       yes 

LAPACK library
LAPACK (dpstrf) in BLAS library                                          yes 

GPU BLAS libraries: gpu_backend = auto
CUDA and cuBLAS libraries
    -DBLAS_HAVE_CUBLAS -I/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/include -L/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -Wl,-rpath,/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -lcublas -lcudart yes 
skipping HIP/ROCm search
skipping SYCL search

TestSweeper
../testsweeper                                                           yes 

Output files
creating make.inc
creating include/blas/defines.h
log in config/log.txt
--------------------------------------------------------------------------------
g++ -std=c++17 -O2 -MMD -Wall -Wno-unused-local-typedefs -Wno-unused-function -fopenmp  -I/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/include -fPIC -I./include -c src/asum.cc -o src/asum.o
...
g++ -std=c++17 -O2 -MMD -Wall -Wno-unused-local-typedefs -Wno-unused-function -fopenmp  -I/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/include -fPIC -I./include -I../testsweeper -c test/test_util.cc -o test/test_util.o
mkdir -p lib
g++ -fopenmp -L/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -Wl,-rpath,/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -fPIC -shared -Wl,-soname,libblaspp.so.1 -lcublas -lcudart -lflame -lblis src/asum.o src/axpy.o src/batch_gemm.o src/batch_hemm.o src/batch_her2k.o src/batch_herk.o src/batch_symm.o src/batch_syr2k.o src/batch_syrk.o src/batch_trmm.o src/batch_trsm.o src/copy.o src/cublas_wrappers.o src/device_axpy.o src/device_batch_gemm.o src/device_batch_gemm_group.o src/device_batch_hemm.o src/device_batch_her2k.o src/device_batch_herk.o src/device_batch_symm.o src/device_batch_syr2k.o src/device_batch_syrk.o src/device_batch_trmm.o src/device_batch_trsm.o src/device_copy.o src/device_dot.o src/device_error.o src/device_gemm.o src/device_hemm.o src/device_her2k.o src/device_herk.o src/device_nrm2.o src/device_queue.o src/device_scal.o src/device_swap.o src/device_symm.o src/device_syr2k.o src/device_syrk.o src/device_trmm.o src/device_trsm.o src/device_utils.o src/dot.o src/gemm.o src/gemv.o src/ger.o src/hemm.o src/hemv.o src/her.o src/her2.o src/her2k.o src/herk.o src/iamax.o src/nrm2.o src/onemkl_wrappers.o src/rocblas_wrappers.o src/rot.o src/rotg.o src/rotm.o src/rotmg.o src/scal.o src/swap.o src/symm.o src/symv.o src/syr.o src/syr2.o src/syr2k.o src/syrk.o src/trmm.o src/trmv.o src/trsm.o src/trsv.o src/util.o src/version.o -o lib/libblaspp.so.1.0.0
ln -fs libblaspp.so.1.0.0 lib/libblaspp.so.1
ln -fs libblaspp.so.1 lib/libblaspp.so
g++ -L./lib -Wl,-rpath,/home/mgates/repos/blaspp/lib -L../testsweeper -Wl,-rpath,/home/mgates/repos/testsweeper -fopenmp -L/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -Wl,-rpath,/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -fPIC test/cblas_wrappers.o test/lapack_wrappers.o test/test.o test/test_asum.o test/test_axpy.o test/test_axpy_device.o test/test_batch_gemm.o test/test_batch_gemm_device.o test/test_batch_hemm.o test/test_batch_hemm_device.o test/test_batch_her2k.o test/test_batch_her2k_device.o test/test_batch_herk.o test/test_batch_herk_device.o test/test_batch_symm.o test/test_batch_symm_device.o test/test_batch_syr2k.o test/test_batch_syr2k_device.o test/test_batch_syrk.o test/test_batch_syrk_device.o test/test_batch_trmm.o test/test_batch_trmm_device.o test/test_batch_trsm.o test/test_batch_trsm_device.o test/test_copy.o test/test_copy_device.o test/test_dot.o test/test_dot_device.o test/test_dotu.o test/test_dotu_device.o test/test_error.o test/test_gemm.o test/test_gemm_device.o test/test_gemv.o test/test_ger.o test/test_geru.o test/test_hemm.o test/test_hemm_device.o test/test_hemv.o test/test_her.o test/test_her2.o test/test_her2k.o test/test_her2k_device.o test/test_herk.o test/test_herk_device.o test/test_iamax.o test/test_max.o test/test_memcpy.o test/test_memcpy_2d.o test/test_nrm2.o test/test_nrm2_device.o test/test_rot.o test/test_rotg.o test/test_rotm.o test/test_rotmg.o test/test_scal.o test/test_scal_device.o test/test_schur_gemm.o test/test_swap.o test/test_swap_device.o test/test_symm.o test/test_symm_device.o test/test_symv.o test/test_syr.o test/test_syr2.o test/test_syr2k.o test/test_syr2k_device.o test/test_syrk.o test/test_syrk_device.o test/test_trmm.o test/test_trmm_device.o test/test_trmv.o test/test_trsm.o test/test_trsm_device.o test/test_trsv.o test/test_util.o \
	-lblaspp -ltestsweeper -lcublas -lcudart -lflame -lblis -o test/tester

sh methane blaspp> cd test
sh methane test> ./run_tests.py --quick > q
Wed Dec 25 16:01:28 2024
./tester  --type s,d,c,z --dim 100 --incx 1 asum
pass
...
./tester  --type s,d,c,z --dim 100 --dim 100x50 --dim 50x100 --align 32 set_matrix
pass

All routines passed.
Elapsed 213.83 sec
Wed Dec 25 16:05:02 2024

sh methane blaspp> make install prefix=install-make
perl -pe "s'#VERSION'2024.10.26'; \
          s'#PREFIX'/home/mgates/repos/blaspp/install-make'; \
          s'#CXX\b'g++'; \
          s'#CXXFLAGS'-std=c++17 -I/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/include'; \
          s'#CPPFLAGS''; \
          s'#LDFLAGS'-fopenmp -L/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -Wl,-rpath,/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64'; \
          s'#LIBS'-lcublas -lcudart -lflame -lblis';" \
          lib/pkgconfig/blaspp.pc.in > lib/pkgconfig/blaspp.pc
mkdir -p /home/mgates/repos/blaspp/install-make/include/blas
mkdir -p /home/mgates/repos/blaspp/install-make/lib/pkgconfig
cp include/*.hh      /home/mgates/repos/blaspp/install-make/include/
cp include/blas/*.h  /home/mgates/repos/blaspp/install-make/include/blas/
cp include/blas/*.hh /home/mgates/repos/blaspp/install-make/include/blas/
cp -av lib/libblaspp*  /home/mgates/repos/blaspp/install-make/lib/
'lib/libblaspp.so' -> '/home/mgates/repos/blaspp/install-make/lib/libblaspp.so'
'lib/libblaspp.so.1' -> '/home/mgates/repos/blaspp/install-make/lib/libblaspp.so.1'
'lib/libblaspp.so.1.0.0' -> '/home/mgates/repos/blaspp/install-make/lib/libblaspp.so.1.0.0'
cp lib/pkgconfig/blaspp.pc            /home/mgates/repos/blaspp/install-make/lib/pkgconfig/

sh methane blaspp> PKG_CONFIG_PATH+=:$PWD/install-make/lib/pkgconfig
sh methane blaspp> cd examples/
sh methane examples> make
g++ -std=c++17 -I/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/include -I/home/mgates/repos/blaspp/install-make/include  -c -o example_gemm.o example_gemm.cc
g++ -o example_gemm example_gemm.o -L/home/mgates/repos/blaspp/install-make/lib -Wl,-rpath,/home/mgates/repos/blaspp/install-make/lib -lblaspp -fopenmp -L/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -Wl,-rpath,/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -lcublas -lcudart -lflame -lblis 
g++ -std=c++17 -I/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/include -I/home/mgates/repos/blaspp/install-make/include  -c -o example_util.o example_util.cc
g++ -o example_util example_util.o -L/home/mgates/repos/blaspp/install-make/lib -Wl,-rpath,/home/mgates/repos/blaspp/install-make/lib -lblaspp -fopenmp -L/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -Wl,-rpath,/apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/lib64 -lcublas -lcudart -lflame -lblis 

sh methane examples> ./example_gemm 
m 100, n 200, k 50

void test_gemm(int, int, int) [with T = float]

void test_gemm(int, int, int) [with T = double]

void test_gemm(int, int, int) [with T = std::complex<float>]

void test_gemm(int, int, int) [with T = std::complex<double>]

void test_device_gemm(int, int, int) [with T = float]

void test_device_gemm(int, int, int) [with T = double]

void test_device_gemm(int, int, int) [with T = std::complex<float>]

void test_device_gemm(int, int, int) [with T = std::complex<double>]

sh methane examples> ./example_util 

void test_util(scalar_type) [with scalar_type = float]
norm  10.0000
alpha  1.2340
beta   1.2340

void test_util(scalar_type) [with scalar_type = double]
norm  10.0000
alpha  2.4680
beta   2.4680

void test_util(scalar_type) [with scalar_type = std::complex<float>]
norm  10.0000
alpha  3.1415 +  0.5678i
beta   3.1415 + -0.5678i

void test_util(scalar_type) [with scalar_type = std::complex<double>]
norm  10.0000
alpha  6.2830 +  1.1356i
beta   6.2830 + -1.1356i
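
The path setup at the start of the log above can be sketched as a small helper. This is an illustration only: the function name `aocl_paths` and the fallback roots are hypothetical, while the directory layout (`include/blis`, `include`, `lib`) and the variable names `ICL_AMDBLIS_ROOT` and `ICL_AMDLIBFLAME_ROOT` are taken from the log. Like the log, it sets the variables from scratch rather than appending to existing values.

```python
import os
import posixpath


def aocl_paths(blis_root, flame_root):
    """Compose CPATH, LIBRARY_PATH, and LD_LIBRARY_PATH for BLIS + libFLAME."""
    inc_dirs = [
        posixpath.join(blis_root, "include", "blis"),
        posixpath.join(flame_root, "include"),
    ]
    lib_dirs = [
        posixpath.join(blis_root, "lib"),
        posixpath.join(flame_root, "lib"),
    ]
    return {
        "CPATH": ":".join(inc_dirs),
        "LIBRARY_PATH": ":".join(lib_dirs),
        "LD_LIBRARY_PATH": ":".join(lib_dirs),
    }


if __name__ == "__main__":
    # The ICL modules set these roots; the fallbacks are placeholders only.
    blis = os.environ.get("ICL_AMDBLIS_ROOT", "/opt/amdblis")
    flame = os.environ.get("ICL_AMDLIBFLAME_ROOT", "/opt/amdlibflame")
    for name, value in aocl_paths(blis, flame).items():
        print(f"export {name}={value}")
```

The printed `export` lines could be pasted into a shell before running `make blas=blis`.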

@mgates3
Collaborator Author

mgates3 commented Dec 25, 2024

Tested on the ICL cluster with CMake. Compile and test output is abbreviated (...). See the make test above for the module and path setup.

sh methane build-blis> module load cmake
Loading cmake/3.29.6/gcc-11.4.1-ia2365
  Loading requirement: mbedtls/2.28.2/gcc-11.4.1-yv3pqs libssh2/1.11.0/gcc-11.4.1-yclasd nghttp2/1.52.0/gcc-11.4.1-ju2tmz curl/8.7.1/gcc-11.4.1-viga6n
    gmake/4.4.1/gcc-11.4.1-a7wm73

sh methane build-blis> cmake -Dblas=blis -DCMAKE_INSTALL_PREFIX=../install-cmake ..
-- The CXX compiler identification is GNU 12.1.0
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/gcc-12.1.0-bv7hah6wty4yixejbcwlp5zpmwwefbi2/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Using CMAKE_INSTALL_PREFIX = /home/mgates/repos/blaspp/install-cmake

-- Looking for CUDA (gpu_backend = auto)
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Building CUDA support

-- No HIP/ROCm support: gpu_backend = cuda

-- No oneMKL-SYCL device support: gpu_backend = cuda

-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- blaspp_id = 194fe68

-- Looking for BLAS libraries and options (blas = blis)
BLIS
   libs:  -lflame -lblis
   -DBLAS_FORTRAN_ADD_                              yes
   Found BLAS library: -lflame;-lblis

-- Checking BLAS library version
   BLIS_VERSION=AOCL-BLIS 4.2.0 Build 20241025

-- Checking BLAS complex return type
   BLAS (zdotc) returns complex (GNU gfortran convention)
-- Checking BLAS float return type
   BLAS (sdot) returns float as float (standard)
-- Checking for CBLAS library
   Found CBLAS library

-- Looking for LAPACK libraries and options (lapack = auto)
   In BLAS library                                  yes
   Found LAPACK library in BLAS library

-- Checking for TestSweeper library

---------- TestSweeper
-- Fetching TestSweeper v2024.05.31 from https://github.com/icl-utk-edu/testsweeper
-- Using CMAKE_INSTALL_PREFIX = /opt/slate
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- testsweeper_id = c16ca65
-- Performing Test fp_model
-- Performing Test fp_model - Failed
-- Performing Test warn_unused
-- Performing Test warn_unused - Success
---------- TestSweeper done

-- Configuring done (5.6s)
-- Generating done (0.0s)
-- Build files have been written to: /home/mgates/repos/blaspp/build-blis
sh methane build-blis>
sh methane build-blis>
sh methane build-blis>
sh methane build-blis> make
[  0%] Building CXX object _deps/testsweeper-build/CMakeFiles/testsweeper.dir/testsweeper.cc.o
...
...
[ 51%] Linking CXX shared library libblaspp.so
[ 51%] Built target blaspp
...
[100%] Linking CXX executable tester
[100%] Built target tester

sh methane build-blis> cd test
sh methane test> ./run_tests.py --quick --host > q2
Wed Dec 25 16:15:37 2024
./tester  --type s,d,c,z --dim 100 --incx 1 asum
pass
...
./tester  --type s,d,c,z --dim 100 --dim 100x50 --dim 50x100 --align 32 set_matrix
pass

All routines passed.
Elapsed 218.87 sec
Wed Dec 25 16:19:16 2024


sh methane test> cd ..
sh methane build-blis> make install
[  2%] Built target testsweeper
[  4%] Built target testsweeper_tester
[ 51%] Built target blaspp
[100%] Built target tester
Install the project...
-- Install configuration: ""
-- Installing: /home/mgates/repos/blaspp/install-cmake/lib64/libblaspp.so.1.0.0
-- Installing: /home/mgates/repos/blaspp/install-cmake/lib64/libblaspp.so.1
-- Set non-toolchain portion of runtime path of "/home/mgates/repos/blaspp/install-cmake/lib64/libblaspp.so.1.0.0" to ""
-- Installing: /home/mgates/repos/blaspp/install-cmake/lib64/libblaspp.so
-- Installing: /home/mgates/repos/blaspp/install-cmake/include
-- Installing: /home/mgates/repos/blaspp/install-cmake/include/blas.hh
...
-- Installing: /home/mgates/repos/blaspp/install-cmake/include/blas/defines.h
-- Installing: /home/mgates/repos/blaspp/install-cmake/lib64/cmake/blaspp/blasppTargets.cmake
-- Installing: /home/mgates/repos/blaspp/install-cmake/lib64/cmake/blaspp/blasppTargets-noconfig.cmake
-- Installing: /home/mgates/repos/blaspp/install-cmake/lib64/cmake/blaspp/blasppConfig.cmake
-- Installing: /home/mgates/repos/blaspp/install-cmake/lib64/cmake/blaspp/blasppConfigVersion.cmake

sh methane build> cmake -DCMAKE_INSTALL_PREFIX=../../install-cmake ..
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Found CUDAToolkit: /apps/spacks/2024-07-19/opt/spack/linux-rocky9-x86_64/gcc-11.4.1/cuda-11.8.0-q6jpi3of5v7cawdtdr6rldslzmqn7jgb/targets/x86_64-linux/include (found version "11.8.89")
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Configuring done (1.1s)
-- Generating done (0.0s)
-- Build files have been written to: /home/mgates/repos/blaspp/examples/build

sh methane build> make
[ 25%] Building CXX object CMakeFiles/example_gemm.dir/example_gemm.cc.o
[ 50%] Building CXX object CMakeFiles/example_util.dir/example_util.cc.o
[ 75%] Linking CXX executable example_util
[100%] Linking CXX executable example_gemm
[100%] Built target example_util
[100%] Built target example_gemm

sh methane build> ./example_gemm
m 100, n 200, k 50

void test_gemm(int, int, int) [with T = float]

void test_gemm(int, int, int) [with T = double]

void test_gemm(int, int, int) [with T = std::complex<float>]

void test_gemm(int, int, int) [with T = std::complex<double>]

void test_device_gemm(int, int, int) [with T = float]

void test_device_gemm(int, int, int) [with T = double]

void test_device_gemm(int, int, int) [with T = std::complex<float>]

void test_device_gemm(int, int, int) [with T = std::complex<double>]

sh methane build> ./example_util

void test_util(scalar_type) [with scalar_type = float]
norm  10.0000
alpha  1.2340
beta   1.2340

void test_util(scalar_type) [with scalar_type = double]
norm  10.0000
alpha  2.4680
beta   2.4680

void test_util(scalar_type) [with scalar_type = std::complex<float>]
norm  10.0000
alpha  3.1415 +  0.5678i
beta   3.1415 + -0.5678i

void test_util(scalar_type) [with scalar_type = std::complex<double>]
norm  10.0000
alpha  6.2830 +  1.1356i
beta   6.2830 + -1.1356i
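
One difference between the two logs is the install layout: `make install prefix=install-make` copies the library into `install-make/lib`, while the CMake install writes it to `install-cmake/lib64`. A script that needs to locate the installed library could check both, as in this minimal sketch. The helper name `find_blaspp_libdir` is hypothetical, and checking only `lib` and `lib64` is an assumption based on these two logs.

```python
from pathlib import Path


def find_blaspp_libdir(prefix):
    """Return the first of prefix/lib or prefix/lib64 that contains
    libblaspp.so, or None if neither does."""
    for sub in ("lib", "lib64"):
        candidate = Path(prefix) / sub
        if (candidate / "libblaspp.so").exists():
            return candidate
    return None


if __name__ == "__main__":
    # Prefix names as used in the logs, relative to the blaspp checkout.
    for prefix in ("install-make", "install-cmake"):
        print(prefix, "->", find_blaspp_libdir(prefix))
```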

@G-Ragghianti
Contributor

The AMD GPU CI job failed because the compute node running it crashed. I'm not sure why exactly, but it appeared to be due to a problem with the GPU. I've triggered the job to run again.

@mgates3 mgates3 merged commit 23dab30 into icl-utk-edu:master Jan 5, 2025
8 checks passed