gemm with beta = 0 #99

Open: wants to merge 3 commits into base: master

mgates3 (Collaborator) commented Jan 16, 2025

Fix the templated gemm so that it does not read C when beta = 0, in case C contains NaN or Inf values (which would otherwise propagate into the result, since 0 * NaN = NaN). See #39. Other BLAS routines should get similar fixes.
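
To illustrate the idea behind the fix, here is a minimal sketch of a column-major, no-transpose gemm that overwrites C instead of scaling it when beta = 0. It is not the BLAS++ implementation: the names `gemm_sketch` and `beta_zero_ignores_nan` are hypothetical, and the alpha = 0 shortcut is an assumption modeled on the reference BLAS convention.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical reference kernel, NOT the BLAS++ source.
// Computes C = alpha*A*B + beta*C, column-major, no transposes.
// A is m-by-k (lda >= m), B is k-by-n (ldb >= k), C is m-by-n (ldc >= m).
template <typename T>
void gemm_sketch(
    int64_t m, int64_t n, int64_t k,
    T alpha, const T* A, int64_t lda,
             const T* B, int64_t ldb,
    T beta,        T* C, int64_t ldc)
{
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(lda >= m && ldb >= k && ldc >= m);

    const T zero = T(0);
    for (int64_t j = 0; j < n; ++j) {
        // Scale column j of C. When beta == 0, overwrite without reading,
        // so NaN or Inf already stored in C cannot leak into the result.
        for (int64_t i = 0; i < m; ++i) {
            if (beta == zero)
                C[i + j*ldc] = zero;
            else
                C[i + j*ldc] *= beta;
        }
        // Assumption: when alpha == 0, A and B are not referenced either.
        if (alpha == zero)
            continue;
        for (int64_t l = 0; l < k; ++l) {
            const T alpha_Blj = alpha * B[l + j*ldb];
            for (int64_t i = 0; i < m; ++i)
                C[i + j*ldc] += A[i + l*lda] * alpha_Blj;
        }
    }
}

// Fills C with NaN, runs a beta = 0 update with B = identity,
// and returns true if the result equals A (so no NaN survived).
bool beta_zero_ignores_nan()
{
    const int64_t m = 2, n = 2, k = 2;
    std::vector<double> A = {1, 2, 3, 4};  // column-major [[1, 3], [2, 4]]
    std::vector<double> B = {1, 0, 0, 1};  // 2-by-2 identity
    std::vector<double> C(m*n, std::numeric_limits<double>::quiet_NaN());
    gemm_sketch(m, n, k, 1.0, A.data(), m, B.data(), k, 0.0, C.data(), m);
    for (double x : C)
        if (std::isnan(x))
            return false;
    return C == A;
}
```

A kernel that instead computed `beta * C[i + j*ldc]` unconditionally would return NaN here, because multiplying NaN (or Inf) by zero yields NaN under IEEE 754 arithmetic.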

Tests with CXXFLAGS += -DBLAS_USE_TEMPLATE:

pangolin blaspp/test> ./tester --type s,z --align 32 --transA n,t,c --transB n,t,c --dim 100 --dim 100x50 --dim 50x100 --dim 25x50x75 --alpha 0.0 --beta 0.0 gemm
BLAS++ version 2024.10.26, id a140844e,  OpenBLAS 0.3.28 
input: ./tester --type 's,z' --align 32 --transA 'n,t,c' --transB 'n,t,c' --dim 100 --dim 100x50 --dim 50x100 --dim 25x50x75 --alpha '0.0' --beta '0.0' gemm
                                                                                                                                                            
type  layout   transA   transB       m       n       k      alpha       beta  align     error   time (s)       gflop/s  ref time (s)   ref gflop/s  status  
   s     col  notrans  notrans     100     100     100   0          0            32  0.00e+00   8.00e-06       249.940      5.50e-05        36.364  pass    
   s     col  notrans  notrans     100      50      50   0          0            32  0.00e+00   2.00e-06       249.940      9.00e-06        55.554  pass    
   s     col  notrans  notrans      50     100     100   0          0            32  0.00e+00   9.98e-07      1001.625      4.00e-06       249.940  pass    
   s     col  notrans  notrans      25      50      75   0          0            32  0.00e+00   9.98e-07       187.805      5.00e-06        37.505  pass    
   s     col  notrans    trans     100     100     100   0          0            32  0.00e+00   9.98e-07      2003.250      0.000102        19.608  pass    
   s     col  notrans    trans     100      50      50   0          0            32  0.00e+00   1.00e-06       498.951      2.00e-06       249.940  pass    
   s     col  notrans    trans      50     100     100   0          0            32  0.00e+00   1.00e-06       997.901      1.00e-06       997.901  pass    
   s     col  notrans    trans      25      50      75   0          0            32  0.00e+00   9.98e-07       187.805      3.00e-06        62.524  pass    
   s     col  notrans     conj     100     100     100   0          0            32  0.00e+00   1.00e-06      1995.803      3.20e-05        62.500  pass    
   s     col  notrans     conj     100      50      50   0          0            32  0.00e+00   9.98e-07       500.812      3.00e-06       166.730  pass    
   s     col  notrans     conj      50     100     100   0          0            32  0.00e+00   9.98e-07      1001.625      3.00e-06       333.460  pass    
   s     col  notrans     conj      25      50      75   0          0            32  0.00e+00   9.98e-07       187.805          0.00           inf  pass    
   s     col    trans  notrans     100     100     100   0          0            32  0.00e+00   2.00e-06       999.760      4.40e-05        45.455  pass    
   s     col    trans  notrans     100      50      50   0          0            32  0.00e+00   1.00e-06       498.951      3.00e-06       166.730  pass    
   s     col    trans  notrans      50     100     100   0          0            32  0.00e+00       0.00           inf      2.00e-06       499.880  pass    
   s     col    trans  notrans      25      50      75   0          0            32  0.00e+00       0.00           inf      3.00e-06        62.524  pass    
   s     col    trans    trans     100     100     100   0          0            32  0.00e+00   1.00e-06      1995.803      4.90e-05        40.814  pass    
   s     col    trans    trans     100      50      50   0          0            32  0.00e+00   9.98e-07       500.812      3.00e-06       166.730  pass    
   s     col    trans    trans      50     100     100   0          0            32  0.00e+00   1.00e-06       997.901      3.00e-06       333.460  pass    
   s     col    trans    trans      25      50      75   0          0            32  0.00e+00   3.00e-06        62.446      2.00e-06        93.727  pass    
   s     col    trans     conj     100     100     100   0          0            32  0.00e+00   1.00e-06      1995.803      4.00e-05        49.997  pass    
   s     col    trans     conj     100      50      50   0          0            32  0.00e+00   9.98e-07       500.812      2.00e-06       249.940  pass    
   s     col    trans     conj      50     100     100   0          0            32  0.00e+00   1.00e-06       997.901      3.00e-06       333.460  pass    
   s     col    trans     conj      25      50      75   0          0            32  0.00e+00       0.00           inf      9.98e-07       187.805  pass    
   s     col     conj  notrans     100     100     100   0          0            32  0.00e+00   2.00e-06       999.760      3.80e-05        52.634  pass    
   s     col     conj  notrans     100      50      50   0          0            32  0.00e+00   3.00e-06       166.730      2.00e-06       249.940  pass    
   s     col     conj  notrans      50     100     100   0          0            32  0.00e+00   2.00e-06       499.880      2.00e-06       499.880  pass    
   s     col     conj  notrans      25      50      75   0          0            32  0.00e+00       0.00           inf      9.98e-07       187.805  pass    
   s     col     conj    trans     100     100     100   0          0            32  0.00e+00   2.00e-06      1001.625      4.90e-05        40.814  pass    
   s     col     conj    trans     100      50      50   0          0            32  0.00e+00   2.00e-06       249.940      2.00e-06       249.940  pass    
   s     col     conj    trans      50     100     100   0          0            32  0.00e+00   2.00e-06       499.880      2.00e-06       499.880  pass    
   s     col     conj    trans      25      50      75   0          0            32  0.00e+00   1.00e-06       187.106      1.00e-06       187.106  pass    
   s     col     conj     conj     100     100     100   0          0            32  0.00e+00   2.00e-06       999.760      4.70e-05        42.555  pass    
   s     col     conj     conj     100      50      50   0          0            32  0.00e+00   1.00e-06       498.951      9.98e-07       500.812  pass    
   s     col     conj     conj      50     100     100   0          0            32  0.00e+00   2.00e-06       499.880      2.00e-06       499.880  pass    
   s     col     conj     conj      25      50      75   0          0            32  0.00e+00       0.00           inf      1.00e-06       187.106  pass    

   z     col  notrans  notrans     100     100     100   0          0            32  0.00e+00   5.00e-06      1599.020      0.000972         8.230  pass    
   z     col  notrans  notrans     100      50      50   0          0            32  0.00e+00   4.00e-06       499.880      4.00e-05        50.002  pass    
   z     col  notrans  notrans      50     100     100   0          0            32  0.00e+00   3.00e-06      1333.841      4.30e-05        93.021  pass    
   z     col  notrans  notrans      25      50      75   0          0            32  0.00e+00   1.00e-06       748.426      3.90e-05        19.231  pass    
   z     col  notrans    trans     100     100     100   0          0            32  0.00e+00   7.00e-06      1142.886      8.80e-05        90.906  pass    
   z     col  notrans    trans     100      50      50   0          0            32  0.00e+00   2.00e-06       999.760      6.60e-05        30.303  pass    
   z     col  notrans    trans      50     100     100   0          0            32  0.00e+00   2.00e-06      1999.519      3.90e-05       102.564  pass    
   z     col  notrans    trans      25      50      75   0          0            32  0.00e+00   9.98e-07       751.219      4.90e-05        15.305  pass    
   z     col  notrans     conj     100     100     100   0          0            32  0.00e+00   8.00e-06      1000.225      4.90e-05       163.257  pass    
   z     col  notrans     conj     100      50      50   0          0            32  0.00e+00   2.00e-06       999.760      3.50e-05        57.144  pass    
   z     col  notrans     conj      50     100     100   0          0            32  0.00e+00   3.00e-06      1333.841      4.40e-05        90.903  pass    
   z     col  notrans     conj      25      50      75   0          0            32  0.00e+00   1.00e-06       748.426      3.50e-05        21.429  pass    
   z     col    trans  notrans     100     100     100   0          0            32  0.00e+00   6.00e-06      1333.841      4.60e-05       173.913  pass    
   z     col    trans  notrans     100      50      50   0          0            32  0.00e+00   2.00e-06       999.760      4.80e-05        41.666  pass    
   z     col    trans  notrans      50     100     100   0          0            32  0.00e+00   2.00e-06      1999.519      3.20e-05       124.999  pass    
   z     col    trans  notrans      25      50      75   0          0            32  0.00e+00   1.00e-06       748.426      3.00e-05        25.000  pass    
   z     col    trans    trans     100     100     100   0          0            32  0.00e+00   6.00e-06      1333.841      5.30e-05       150.944  pass    
   z     col    trans    trans     100      50      50   0          0            32  0.00e+00   3.00e-06       666.920      3.70e-05        54.055  pass    
   z     col    trans    trans      50     100     100   0          0            32  0.00e+00   3.00e-06      1333.841      4.60e-05        86.957  pass    
   z     col    trans    trans      25      50      75   0          0            32  0.00e+00   1.00e-06       748.426      4.70e-05        15.958  pass    
   z     col    trans     conj     100     100     100   0          0            32  0.00e+00   8.00e-06       999.760      7.10e-05       112.676  pass    
   z     col    trans     conj     100      50      50   0          0            32  0.00e+00   5.00e-06       400.053      4.80e-05        41.666  pass    
   z     col    trans     conj      50     100     100   0          0            32  0.00e+00   3.00e-06      1332.186      4.10e-05        97.560  pass    
   z     col    trans     conj      25      50      75   0          0            32  0.00e+00   1.00e-06       748.426      3.90e-05        19.231  pass    
   z     col     conj  notrans     100     100     100   0          0            32  0.00e+00   6.00e-06      1333.841      6.10e-05       131.152  pass    
   z     col     conj  notrans     100      50      50   0          0            32  0.00e+00   3.00e-06       666.093      3.10e-05        64.512  pass    
   z     col     conj  notrans      50     100     100   0          0            32  0.00e+00   2.00e-06      2003.250      5.40e-05        74.077  pass    
   z     col     conj  notrans      25      50      75   0          0            32  0.00e+00   1.00e-06       748.426      4.00e-05        18.749  pass    
   z     col     conj    trans     100     100     100   0          0            32  0.00e+00   7.00e-06      1142.886      5.20e-05       153.853  pass    
   z     col     conj    trans     100      50      50   0          0            32  0.00e+00   4.00e-06       499.880      4.30e-05        46.515  pass    
   z     col     conj    trans      50     100     100   0          0            32  0.00e+00   4.00e-06       999.760      3.10e-05       129.024  pass    
   z     col     conj    trans      25      50      75   0          0            32  0.00e+00   1.00e-06       748.426      2.80e-05        26.786  pass    
   z     col     conj     conj     100     100     100   0          0            32  0.00e+00   9.00e-06       888.859      5.60e-05       142.861  pass    
   z     col     conj     conj     100      50      50   0          0            32  0.00e+00   3.00e-06       666.920      7.50e-05        26.668  pass    
   z     col     conj     conj      50     100     100   0          0            32  0.00e+00   5.00e-06       800.106      4.10e-05        97.560  pass    
   z     col     conj     conj      25      50      75   0          0            32  0.00e+00   9.98e-07       751.219      2.90e-05        25.864  pass    
All tests passed for gemm.
pangolin blaspp/test> ./tester --type s,z --align 32 --transA n,t,c --transB n,t,c --dim 100 --dim 100x50 --dim 50x100 --dim 25x50x75 --alpha 0.0 gemm
BLAS++ version 2024.10.26, id a140844e,  OpenBLAS 0.3.28 
input: ./tester --type 's,z' --align 32 --transA 'n,t,c' --transB 'n,t,c' --dim 100 --dim 100x50 --dim 50x100 --dim 25x50x75 --alpha '0.0' gemm
                                                                                                                                                            
type  layout   transA   transB       m       n       k      alpha       beta  align     error   time (s)       gflop/s  ref time (s)   ref gflop/s  status  
   s     col  notrans  notrans     100     100     100   0          2.7+1.7i     32  0.00e+00   7.00e-06       285.722      0.000112        17.858  pass    
   s     col  notrans  notrans     100      50      50   0          2.7+1.7i     32  0.00e+00   4.00e-06       125.086      8.00e-06        62.514  pass    
   s     col  notrans  notrans      50     100     100   0          2.7+1.7i     32  0.00e+00   2.00e-06       499.880      3.00e-06       333.460  pass    
   s     col  notrans  notrans      25      50      75   0          2.7+1.7i     32  0.00e+00   9.98e-07       187.805      1.00e-06       187.106  pass    
   s     col  notrans    trans     100     100     100   0          2.7+1.7i     32  0.00e+00   4.00e-06       500.346      4.60e-05        43.478  pass    
   s     col  notrans    trans     100      50      50   0          2.7+1.7i     32  0.00e+00   2.00e-06       250.406      2.00e-06       250.406  pass    
   s     col  notrans    trans      50     100     100   0          2.7+1.7i     32  0.00e+00   3.00e-06       333.460      4.00e-06       250.173  pass    
   s     col  notrans    trans      25      50      75   0          2.7+1.7i     32  0.00e+00   1.00e-06       187.106      9.98e-07       187.805  pass    
   s     col  notrans     conj     100     100     100   0          2.7+1.7i     32  0.00e+00   4.00e-06       500.346      2.80e-05        71.430  pass    
   s     col  notrans     conj     100      50      50   0          2.7+1.7i     32  0.00e+00   3.00e-06       166.523      3.00e-06       166.730  pass    
   s     col  notrans     conj      50     100     100   0          2.7+1.7i     32  0.00e+00   9.00e-06       111.107      2.00e-06       499.880  pass    
   s     col  notrans     conj      25      50      75   0          2.7+1.7i     32  0.00e+00   1.00e-06       187.106      2.00e-06        93.727  pass    
   s     col    trans  notrans     100     100     100   0          2.7+1.7i     32  0.00e+00   4.00e-06       499.880      4.30e-05        46.511  pass    
   s     col    trans  notrans     100      50      50   0          2.7+1.7i     32  0.00e+00   2.00e-06       250.406      3.00e-06       166.523  pass    
   s     col    trans  notrans      50     100     100   0          2.7+1.7i     32  0.00e+00   2.00e-06       499.880      2.00e-06       500.812  pass    
   s     col    trans  notrans      25      50      75   0          2.7+1.7i     32  0.00e+00   1.00e-06       187.106      1.00e-06       187.106  pass    
   s     col    trans    trans     100     100     100   0          2.7+1.7i     32  0.00e+00   3.00e-06       666.093      3.70e-05        54.055  pass    
   s     col    trans    trans     100      50      50   0          2.7+1.7i     32  0.00e+00   2.00e-06       249.940      3.00e-06       166.730  pass    
   s     col    trans    trans      50     100     100   0          2.7+1.7i     32  0.00e+00   3.00e-06       333.460      2.00e-06       499.880  pass    
   s     col    trans    trans      25      50      75   0          2.7+1.7i     32  0.00e+00   1.00e-06       187.106      2.00e-06        93.727  pass    
   s     col    trans     conj     100     100     100   0          2.7+1.7i     32  0.00e+00   3.00e-06       666.093      4.60e-05        43.478  pass    
   s     col    trans     conj     100      50      50   0          2.7+1.7i     32  0.00e+00   2.00e-06       249.940      3.00e-06       166.730  pass    
   s     col    trans     conj      50     100     100   0          2.7+1.7i     32  0.00e+00   3.00e-06       333.046      2.00e-06       499.880  pass    
   s     col    trans     conj      25      50      75   0          2.7+1.7i     32  0.00e+00   1.00e-06       187.106          0.00           inf  pass    
   s     col     conj  notrans     100     100     100   0          2.7+1.7i     32  0.00e+00   9.00e-06       222.215      3.60e-05        55.554  pass    
   s     col     conj  notrans     100      50      50   0          2.7+1.7i     32  0.00e+00   3.00e-06       166.730      3.00e-06       166.730  pass    
   s     col     conj  notrans      50     100     100   0          2.7+1.7i     32  0.00e+00   2.00e-06       499.880      3.00e-06       333.460  pass    
   s     col     conj  notrans      25      50      75   0          2.7+1.7i     32  0.00e+00   9.98e-07       187.805      1.00e-06       187.106  pass    
   s     col     conj    trans     100     100     100   0          2.7+1.7i     32  0.00e+00   4.00e-06       499.880      4.40e-05        45.455  pass    
   s     col     conj    trans     100      50      50   0          2.7+1.7i     32  0.00e+00   2.00e-06       249.940      2.00e-06       250.406  pass    
   s     col     conj    trans      50     100     100   0          2.7+1.7i     32  0.00e+00   3.00e-06       333.460      2.00e-06       499.880  pass    
   s     col     conj    trans      25      50      75   0          2.7+1.7i     32  0.00e+00   2.00e-06        93.727      3.00e-06        62.524  pass    
   s     col     conj     conj     100     100     100   0          2.7+1.7i     32  0.00e+00   4.00e-06       499.880      4.90e-05        40.817  pass    
   s     col     conj     conj     100      50      50   0          2.7+1.7i     32  0.00e+00   2.00e-06       249.940      3.00e-06       166.730  pass    
   s     col     conj     conj      50     100     100   0          2.7+1.7i     32  0.00e+00   2.00e-06       499.880      5.00e-06       199.877  pass    
   s     col     conj     conj      25      50      75   0          2.7+1.7i     32  0.00e+00   9.98e-07       187.805      2.00e-06        93.727  pass    

   z     col  notrans  notrans     100     100     100   0          2.7+1.7i     32  9.11e-20   1.10e-05       727.467      6.40e-05       124.999  pass    
   z     col  notrans  notrans     100      50      50   0          2.7+1.7i     32  2.47e-19   5.00e-06       400.053      4.50e-05        44.443  pass    
   z     col  notrans  notrans      50     100     100   0          2.7+1.7i     32  8.77e-20   3.00e-06      1333.841      5.20e-05        76.921  pass    
   z     col  notrans  notrans      25      50      75   0          2.7+1.7i     32  1.34e-19   2.00e-06       374.910      3.30e-05        22.728  pass    
   z     col  notrans    trans     100     100     100   0          2.7+1.7i     32  9.04e-20   8.00e-06       999.760      5.20e-05       153.842  pass    
   z     col  notrans    trans     100      50      50   0          2.7+1.7i     32  2.44e-19   4.00e-06       499.880      4.80e-05        41.666  pass    
   z     col  notrans    trans      50     100     100   0          2.7+1.7i     32  8.93e-20   3.00e-06      1332.186      4.30e-05        93.021  pass    
   z     col  notrans    trans      25      50      75   0          2.7+1.7i     32  1.33e-19   1.00e-06       748.426      4.10e-05        18.292  pass    
   z     col  notrans     conj     100     100     100   0          2.7+1.7i     32  9.00e-20   7.00e-06      1142.886      4.70e-05       170.206  pass    
   z     col  notrans     conj     100      50      50   0          2.7+1.7i     32  2.50e-19   4.00e-06       499.880      4.30e-05        46.511  pass    
   z     col  notrans     conj      50     100     100   0          2.7+1.7i     32  8.80e-20   3.00e-06      1332.186      3.50e-05       114.289  pass    
   z     col  notrans     conj      25      50      75   0          2.7+1.7i     32  1.28e-19   9.98e-07       751.219      4.60e-05        16.304  pass    
   z     col    trans  notrans     100     100     100   0          2.7+1.7i     32  9.22e-20   8.00e-06       999.760      4.60e-05       173.913  pass    
   z     col    trans  notrans     100      50      50   0          2.7+1.7i     32  2.43e-19   4.00e-06       499.880      4.60e-05        43.478  pass    
   z     col    trans  notrans      50     100     100   0          2.7+1.7i     32  9.30e-20   3.00e-06      1333.841      4.00e-05       100.004  pass    
   z     col    trans  notrans      25      50      75   0          2.7+1.7i     32  1.32e-19   9.98e-07       751.219      3.40e-05        22.058  pass    
   z     col    trans    trans     100     100     100   0          2.7+1.7i     32  8.77e-20   7.00e-06      1142.886      4.50e-05       177.772  pass    
   z     col    trans    trans     100      50      50   0          2.7+1.7i     32  2.52e-19   4.00e-06       499.880      5.10e-05        39.213  pass    
   z     col    trans    trans      50     100     100   0          2.7+1.7i     32  9.05e-20   7.00e-06       571.443      4.30e-05        93.021  pass    
   z     col    trans    trans      25      50      75   0          2.7+1.7i     32  1.32e-19   9.98e-07       751.219      3.70e-05        20.270  pass    
   z     col    trans     conj     100     100     100   0          2.7+1.7i     32  8.97e-20   7.00e-06      1142.886      4.40e-05       181.821  pass    
   z     col    trans     conj     100      50      50   0          2.7+1.7i     32  2.46e-19   5.00e-06       400.053      3.40e-05        58.822  pass    
   z     col    trans     conj      50     100     100   0          2.7+1.7i     32  8.98e-20   4.00e-06       999.760      4.60e-05        86.957  pass    
   z     col    trans     conj      25      50      75   0          2.7+1.7i     32  1.36e-19   2.00e-06       374.910      2.60e-05        28.843  pass    
   z     col     conj  notrans     100     100     100   0          2.7+1.7i     32  8.85e-20   8.00e-06       999.760      4.50e-05       177.772  pass    
   z     col     conj  notrans     100      50      50   0          2.7+1.7i     32  2.44e-19   4.00e-06       499.880      5.10e-05        39.216  pass    
   z     col     conj  notrans      50     100     100   0          2.7+1.7i     32  9.13e-20   3.00e-06      1333.841      3.60e-05       111.107  pass    
   z     col     conj  notrans      25      50      75   0          2.7+1.7i     32  1.31e-19   2.00e-06       374.910      4.70e-05        15.958  pass    
   z     col     conj    trans     100     100     100   0          2.7+1.7i     32  9.03e-20   7.00e-06      1142.886      5.00e-05       160.009  pass    
   z     col     conj    trans     100      50      50   0          2.7+1.7i     32  2.48e-19   3.00e-06       666.093      4.40e-05        45.455  pass    
   z     col     conj    trans      50     100     100   0          2.7+1.7i     32  8.71e-20   4.00e-06       999.760      3.80e-05       105.269  pass    
   z     col     conj    trans      25      50      75   0          2.7+1.7i     32  1.29e-19   9.98e-07       751.219      4.10e-05        18.292  pass    
   z     col     conj     conj     100     100     100   0          2.7+1.7i     32  8.96e-20   7.00e-06      1142.886      4.80e-05       166.665  pass    
   z     col     conj     conj     100      50      50   0          2.7+1.7i     32  2.48e-19   4.00e-06       499.880      4.10e-05        48.784  pass    
   z     col     conj     conj      50     100     100   0          2.7+1.7i     32  9.01e-20   3.00e-06      1333.841      4.50e-05        88.893  pass    
   z     col     conj     conj      25      50      75   0          2.7+1.7i     32  1.32e-19   9.98e-07       751.219      4.00e-05        18.749  pass    
All tests passed for gemm.
pangolin blaspp/test> ./tester --type s,z --align 32 --transA n,t,c --transB n,t,c --dim 100 --dim 100x50 --dim 50x100 --dim 25x50x75 --beta 0.0 gemm
BLAS++ version 2024.10.26, id a140844e,  OpenBLAS 0.3.28 
input: ./tester --type 's,z' --align 32 --transA 'n,t,c' --transB 'n,t,c' --dim 100 --dim 100x50 --dim 50x100 --dim 25x50x75 --beta '0.0' gemm
                                                                                                                                                            
type  layout   transA   transB       m       n       k      alpha       beta  align     error   time (s)       gflop/s  ref time (s)   ref gflop/s  status  
   s     col  notrans  notrans     100     100     100   3.1+1.4i   0            32  1.59e-08   0.000324         6.173      0.000106        18.867  pass    
   s     col  notrans  notrans     100      50      50   3.1+1.4i   0            32  1.64e-08   0.000129         3.876      2.90e-05        17.243  pass    
   s     col  notrans  notrans      50     100     100   3.1+1.4i   0            32  1.60e-08   0.000215         4.651      2.30e-05        43.478  pass    
   s     col  notrans  notrans      25      50      75   3.1+1.4i   0            32  1.54e-08   4.00e-05         4.688      9.00e-06        20.841  pass    
   s     col  notrans    trans     100     100     100   3.1+1.4i   0            32  1.60e-08   0.000413         4.843      7.50e-05        26.668  pass    
   s     col  notrans    trans     100      50      50   3.1+1.4i   0            32  1.62e-08   9.80e-05         5.102      1.00e-05        50.007  pass    
   s     col  notrans    trans      50     100     100   3.1+1.4i   0            32  1.62e-08   0.000166         6.024      2.50e-05        39.999  pass    
   s     col  notrans    trans      25      50      75   3.1+1.4i   0            32  1.54e-08   3.30e-05         5.682      9.00e-06        20.833  pass    
   s     col  notrans     conj     100     100     100   3.1+1.4i   0            32  1.61e-08   0.000318         6.289      0.000109        18.348  pass    
   s     col  notrans     conj     100      50      50   3.1+1.4i   0            32  1.63e-08   9.00e-05         5.556      1.00e-05        49.988  pass    
   s     col  notrans     conj      50     100     100   3.1+1.4i   0            32  1.61e-08   0.000178         5.618      1.80e-05        55.554  pass    
   s     col  notrans     conj      25      50      75   3.1+1.4i   0            32  1.57e-08   3.50e-05         5.357      7.00e-06        26.786  pass    
   s     col    trans  notrans     100     100     100   3.1+1.4i   0            32  0.00e+00   0.000736         2.717      7.00e-05        28.572  pass    
   s     col    trans  notrans     100      50      50   3.1+1.4i   0            32  0.00e+00   0.000128         3.906      1.10e-05        45.451  pass    
   s     col    trans  notrans      50     100     100   3.1+1.4i   0            32  0.00e+00   0.000359         2.785      2.20e-05        45.451  pass    
   s     col    trans  notrans      25      50      75   3.1+1.4i   0            32  0.00e+00   5.80e-05         3.233      8.00e-06        23.443  pass    
   s     col    trans    trans     100     100     100   3.1+1.4i   0            32  0.00e+00   0.000740         2.703      5.50e-05        36.364  pass    
   s     col    trans    trans     100      50      50   3.1+1.4i   0            32  0.00e+00   0.000130         3.846      1.00e-05        49.988  pass    
   s     col    trans    trans      50     100     100   3.1+1.4i   0            32  0.00e+00   0.000401         2.494      1.90e-05        52.634  pass    
   s     col    trans    trans      25      50      75   3.1+1.4i   0            32  0.00e+00   5.70e-05         3.289      8.00e-06        23.443  pass    
   s     col    trans     conj     100     100     100   3.1+1.4i   0            32  0.00e+00   0.000736         2.717      7.60e-05        26.316  pass    
   s     col    trans     conj     100      50      50   3.1+1.4i   0            32  0.00e+00   0.000130         3.846      1.70e-05        29.414  pass    
   s     col    trans     conj      50     100     100   3.1+1.4i   0            32  0.00e+00   0.000369         2.710      2.00e-05        49.997  pass    
   s     col    trans     conj      25      50      75   3.1+1.4i   0            32  0.00e+00   6.30e-05         2.976      8.00e-06        23.432  pass    
   s     col     conj  notrans     100     100     100   3.1+1.4i   0            32  0.00e+00   0.000718         2.786      6.00e-05        33.332  pass    
   s     col     conj  notrans     100      50      50   3.1+1.4i   0            32  0.00e+00   0.000136         3.676      1.00e-05        50.007  pass    
   s     col     conj  notrans      50     100     100   3.1+1.4i   0            32  0.00e+00   0.000416         2.404      1.80e-05        55.565  pass    
   s     col     conj  notrans      25      50      75   3.1+1.4i   0            32  0.00e+00   6.50e-05         2.885      1.00e-05        18.752  pass    
   s     col     conj    trans     100     100     100   3.1+1.4i   0            32  0.00e+00   0.000714         2.801      7.00e-05        28.572  pass    
   s     col     conj    trans     100      50      50   3.1+1.4i   0            32  0.00e+00   0.000131         3.817      1.20e-05        41.670  pass    
   s     col     conj    trans      50     100     100   3.1+1.4i   0            32  0.00e+00   0.000423         2.364      2.00e-05        50.007  pass    
   s     col     conj    trans      25      50      75   3.1+1.4i   0            32  0.00e+00   7.10e-05         2.641      7.00e-06        26.786  pass    
   s     col     conj     conj     100     100     100   3.1+1.4i   0            32  0.00e+00   0.000713         2.805      6.90e-05        28.986  pass    
   s     col     conj     conj     100      50      50   3.1+1.4i   0            32  0.00e+00   0.000135         3.704      1.10e-05        45.451  pass    
   s     col     conj     conj      50     100     100   3.1+1.4i   0            32  0.00e+00   0.000359         2.786      2.10e-05        47.620  pass    
   s     col     conj     conj      25      50      75   3.1+1.4i   0            32  0.00e+00   5.90e-05         3.178      1.40e-05        13.393  pass    

   z     col  notrans  notrans     100     100     100   3.1+1.4i   0            32  1.27e-17   0.000724        11.050      0.000201        39.801  pass    
   z     col  notrans  notrans     100      50      50   3.1+1.4i   0            32  1.27e-17   0.000184        10.870      8.20e-05        24.390  pass    
   z     col  notrans  notrans      50     100     100   3.1+1.4i   0            32  1.28e-17   0.000372        10.753      0.000108        37.037  pass    
   z     col  notrans  notrans      25      50      75   3.1+1.4i   0            32  1.27e-17   7.10e-05        10.563      6.00e-05        12.500  pass    
   z     col  notrans    trans     100     100     100   3.1+1.4i   0            32  1.28e-17   0.000754        10.610      0.000127        62.993  pass    
   z     col  notrans    trans     100      50      50   3.1+1.4i   0            32  1.27e-17   0.000190        10.526      0.000114        17.544  pass    
   z     col  notrans    trans      50     100     100   3.1+1.4i   0            32  1.26e-17   0.000362        11.050      0.000100        40.001  pass    
   z     col  notrans    trans      25      50      75   3.1+1.4i   0            32  1.26e-17   7.10e-05        10.563      5.80e-05        12.931  pass    
   z     col  notrans     conj     100     100     100   3.1+1.4i   0            32  1.31e-17   0.000740        10.811      0.000139        57.555  pass    
   z     col  notrans     conj     100      50      50   3.1+1.4i   0            32  1.28e-17   0.000183        10.929      8.30e-05        24.097  pass    
   z     col  notrans     conj      50     100     100   3.1+1.4i   0            32  1.29e-17   0.000381        10.499      0.000102        39.215  pass    
   z     col  notrans     conj      25      50      75   3.1+1.4i   0            32  1.26e-17   7.20e-05        10.417      6.10e-05        12.295  pass    
   z     col    trans  notrans     100     100     100   3.1+1.4i   0            32  1.29e-17   0.000773        10.349      0.000126        63.492  pass    
   z     col    trans  notrans     100      50      50   3.1+1.4i   0            32  1.29e-17   0.000184        10.870      0.000101        19.802  pass    
   z     col    trans  notrans      50     100     100   3.1+1.4i   0            32  1.28e-17   0.000394        10.152      0.000105        38.095  pass    
   z     col    trans  notrans      25      50      75   3.1+1.4i   0            32  1.38e-17   7.30e-05        10.274      5.60e-05        13.393  pass    
   z     col    trans    trans     100     100     100   3.1+1.4i   0            32  1.28e-17   0.000890         8.989      0.000134        59.700  pass    
   z     col    trans    trans     100      50      50   3.1+1.4i   0            32  1.28e-17   0.000185        10.811      9.30e-05        21.506  pass    
   z     col    trans    trans      50     100     100   3.1+1.4i   0            32  1.27e-17   0.000417         9.592      0.000101        39.604  pass    
   z     col    trans    trans      25      50      75   3.1+1.4i   0            32  1.36e-17   7.90e-05         9.494      5.80e-05        12.931  pass    
   z     col    trans     conj     100     100     100   3.1+1.4i   0            32  1.27e-17   0.000898         8.909      0.000136        58.824  pass    
   z     col    trans     conj     100      50      50   3.1+1.4i   0            32  1.30e-17   0.000186        10.753      0.000105        19.048  pass    
   z     col    trans     conj      50     100     100   3.1+1.4i   0            32  1.28e-17   0.000450         8.889      9.60e-05        41.666  pass    
   z     col    trans     conj      25      50      75   3.1+1.4i   0            32  1.31e-17   7.40e-05        10.135      5.50e-05        13.636  pass    
   z     col     conj  notrans     100     100     100   3.1+1.4i   0            32  1.27e-17   0.000831         9.627      0.000150        53.334  pass    
   z     col     conj  notrans     100      50      50   3.1+1.4i   0            32  1.32e-17   0.000186        10.753      8.50e-05        23.529  pass    
   z     col     conj  notrans      50     100     100   3.1+1.4i   0            32  1.28e-17   0.000405         9.876      9.30e-05        43.012  pass    
   z     col     conj  notrans      25      50      75   3.1+1.4i   0            32  1.31e-17   7.20e-05        10.417      6.40e-05        11.719  pass    
   z     col     conj    trans     100     100     100   3.1+1.4i   0            32  1.29e-17   0.000851         9.401      0.000124        64.516  pass    
   z     col     conj    trans     100      50      50   3.1+1.4i   0            32  1.29e-17   0.000191        10.471      0.000124        16.129  pass    
   z     col     conj    trans      50     100     100   3.1+1.4i   0            32  1.29e-17   0.000428         9.346      9.70e-05        41.237  pass    
   z     col     conj    trans      25      50      75   3.1+1.4i   0            32  1.34e-17   7.70e-05         9.740      6.30e-05        11.905  pass    
   z     col     conj     conj     100     100     100   3.1+1.4i   0            32  1.28e-17   0.000839         9.535      0.000133        60.150  pass    
   z     col     conj     conj     100      50      50   3.1+1.4i   0            32  1.30e-17   0.000178        11.236      0.000131        15.267  pass    
   z     col     conj     conj      50     100     100   3.1+1.4i   0            32  1.29e-17   0.000413         9.685      0.000113        35.398  pass    
   z     col     conj     conj      25      50      75   3.1+1.4i   0            32  1.31e-17   7.30e-05        10.274      6.60e-05        11.363  pass    
All tests passed for gemm.
