Merge pull request #179 from nmnobre/tweaks
Improvements to the readme file
xiaoyeli authored Dec 27, 2024
2 parents 9f4c5bc + a01f331 commit 45e2432
Showing 2 changed files with 67 additions and 67 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test.yml
@@ -1,4 +1,4 @@
name: Run Github CI tests.
name: GitHub CI tests

on: [push, pull_request]

132 changes: 66 additions & 66 deletions README.md
@@ -1,13 +1,13 @@
# SuperLU_DIST (version 9.0.0) <img align=center width="55" alt="superlu" src="https://user-images.githubusercontent.com/11741943/103982988-5a9a9d00-5139-11eb-9ac4-a55e80a79f8d.png">
# SuperLU_DIST (version 9.1.0) <img align=center width="55" alt="superlu" src="https://user-images.githubusercontent.com/11741943/103982988-5a9a9d00-5139-11eb-9ac4-a55e80a79f8d.png">

[![Build Status](https://travis-ci.org/xiaoyeli/superlu_dist.svg?branch=master)](https://travis-ci.org/xiaoyeli/superlu_dist)
[![Build Status](https://github.com/xiaoyeli/superlu_dist/actions/workflows/test.yml/badge.svg)](https://github.com/xiaoyeli/superlu_dist/actions/workflows/test.yml)
[Nightly tests](http://my.cdash.org/index.php?project=superlu_dist)

SuperLU_DIST contains a set of subroutines to solve a sparse linear system
A*X=B. It uses Gaussian elimination with static pivoting (GESP).
SuperLU_DIST contains a set of subroutines to solve a sparse linear system
A*X=B. It uses Gaussian elimination with static pivoting (GESP).
Static pivoting is a technique that combines the numerical stability of
partial pivoting with the scalability of Cholesky (no pivoting),
to run accurately and efficiently on large numbers of processors.
to run accurately and efficiently on large numbers of processors.

SuperLU_DIST is a parallel extension to the serial SuperLU library.
It targets distributed-memory parallel machines.
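
As an illustration of the static-pivoting idea described above, here is a minimal dense sketch in Python. It is not SuperLU_DIST code. It assumes that GESP factors the matrix without row exchanges, bumps any pivot smaller than `sqrt(eps) * ||A||_1` up to that threshold, and recovers accuracy with a few steps of iterative refinement. The function names (`factor_static`, `lu_solve`, `solve_refined`) and the list-of-lists storage are hypothetical, and the pre-permutation and scaling that a real solver applies before factorization are omitted.

```python
import math


def factor_static(a, eps=2.0 ** -52):
    """Dense LU without row exchanges (illustrative sketch only).

    Returns (lu, replaced): L (unit diagonal, below the diagonal) and U
    packed into one matrix, plus the number of tiny pivots that were bumped.
    """
    n = len(a)
    lu = [row[:] for row in a]
    # Assumed threshold: sqrt(eps) times the 1-norm (max absolute column sum).
    norm1 = max(sum(abs(lu[i][j]) for i in range(n)) for j in range(n))
    thresh = math.sqrt(eps) * norm1
    replaced = 0
    for k in range(n):
        if abs(lu[k][k]) < thresh:
            # Static pivoting: perturb the pivot instead of swapping rows.
            lu[k][k] = thresh if lu[k][k] >= 0 else -thresh
            replaced += 1
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, replaced


def lu_solve(lu, b):
    """Forward substitution with L, then back substitution with U."""
    n = len(lu)
    x = b[:]
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def solve_refined(a, b, steps=3):
    """Solve A x = b, correcting the pivot perturbation by refinement."""
    n = len(a)
    lu, _ = factor_static(a)
    x = lu_solve(lu, b)
    for _ in range(steps):
        # Residual is computed with the original, unperturbed matrix.
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        dx = lu_solve(lu, r)
        x = [x[i] + dx[i] for i in range(n)]
    return x


x = solve_refined([[0.0, 1.0], [1.0, 0.0]], [1.0, 2.0])
print(x)
```

Because the pivot order is fixed before the numerical factorization starts, the data structures and communication pattern can be set up in advance, which is where the scalability benefit comes from.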
@@ -24,7 +24,7 @@ acceleration capabilities.
Table of Contents
=================

* [SuperLU_DIST (version 9.1.0) <a href="https://user-images.githubusercontent.com/11741943/103982988-5a9a9d00-5139-11eb-9ac4-a55e80a79f8d.png" target="_blank" rel="nofollow"><img align="center" width="55" alt="superlu" src="https://user-images.githubusercontent.com/11741943/103982988-5a9a9d00-5139-11eb-9ac4-a55e80a79f8d.png" style="max-width:100%;"></a>](#superlu_dist-version-81---)
* [SuperLU_DIST (version 9.1.0) <a href="https://user-images.githubusercontent.com/11741943/103982988-5a9a9d00-5139-11eb-9ac4-a55e80a79f8d.png" target="_blank" rel="nofollow"><img align="center" width="55" alt="superlu" src="https://user-images.githubusercontent.com/11741943/103982988-5a9a9d00-5139-11eb-9ac4-a55e80a79f8d.png" style="max-width:100%;"></a>](#superlu_dist-version-910--)
* [Directory structure of the source code](#directory-structure-of-the-source-code)
* [Installation](#installation)
* [Installation option 1: Using CMake build system.](#installation-option-1-using-cmake-build-system)
@@ -49,16 +49,16 @@ Table of Contents

Created by [gh-md-toc](https://github.com/ekalinin/github-markdown-toc)

# SuperLU_DIST (version 8.2) <img align=center width="55" alt="superlu" src="https://user-images.githubusercontent.com/11741943/103982988-5a9a9d00-5139-11eb-9ac4-a55e80a79f8d.png">
# SuperLU_DIST (version 9.1.0) <img align=center width="55" alt="superlu" src="https://user-images.githubusercontent.com/11741943/103982988-5a9a9d00-5139-11eb-9ac4-a55e80a79f8d.png">

[![Build Status](https://travis-ci.org/xiaoyeli/superlu_dist.svg?branch=master)](https://travis-ci.org/xiaoyeli/superlu_dist)
[![Build Status](https://github.com/xiaoyeli/superlu_dist/actions/workflows/test.yml/badge.svg)](https://github.com/xiaoyeli/superlu_dist/actions/workflows/test.yml)
[Nightly tests](http://my.cdash.org/index.php?project=superlu_dist)

SuperLU_DIST contains a set of subroutines to solve a sparse linear system
A*X=B. It uses Gaussian elimination with static pivoting (GESP).
SuperLU_DIST contains a set of subroutines to solve a sparse linear system
A*X=B. It uses Gaussian elimination with static pivoting (GESP).
Static pivoting is a technique that combines the numerical stability of
partial pivoting with the scalability of Cholesky (no pivoting),
to run accurately and efficiently on large numbers of processors.
to run accurately and efficiently on large numbers of processors.

SuperLU_DIST is a parallel extension to the serial SuperLU library.
It targets distributed-memory parallel machines.
@@ -99,7 +99,7 @@ SuperLU_DIST/MAKE_INC/ sample machine-specific make.inc files
# Installation

There are two ways to install the package. The first method is to use
CMake automatic build system. The other method requires users to
CMake automatic build system. The other method requires users to
edit the make.inc file manually. The procedures are described below.

## Installation option 1: Using CMake build system.
@@ -133,7 +133,7 @@ export PARMETIS_BUILD_DIR=${PARMETIS_ROOT}/build/Linux-x86_64
### Optional external libraries: CombBLAS, LAPACK

In order to use parallel weighted matching HWPM (Heavy Weight
Perfect Matching) for numerical pre-pivoting, you need to install
Perfect Matching) for numerical pre-pivoting, you need to install
CombBLAS and define the environment variable:

```
@@ -240,37 +240,37 @@ contains the key CPP definitions used throughout the code.
-DBUILD_SHARED_LIBS= OFF | ON
-DCMAKE_INSTALL_PREFIX=<...>.
-DCMAKE_C_COMPILER=<MPI C compiler>
-DCMAKE_C_FLAGS="..."
-DCMAKE_C_FLAGS="..."
-DCMAKE_CXX_COMPILER=<MPI C++ compiler>
-DCMAKE_CXX_FLAGS="..."
-DCMAKE_CUDA_FLAGS="..."
-DHIP_HIPCC_FLAGS="..."
-DCMAKE_CUDA_FLAGS="..."
-DHIP_HIPCC_FLAGS="..."
-DXSDK_ENABLE_Fortran=OFF | ON
-DCMAKE_Fortran_COMPILER=<MPI F90 compiler>
```
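
The options above can be combined into a single configure command. The following Python sketch assembles one from a dictionary, which can help keep the settings in a build script. The helper name `cmake_configure_command`, the install prefix, the compiler wrapper names `mpicc` and `mpicxx`, and the flag values are assumptions chosen for illustration; only the `-D` variable names come from the list above.

```python
import shlex


def cmake_configure_command(source_dir, options):
    """Return the argument list for a CMake configure step.

    `options` maps CMake cache variable names to values. Booleans are
    rendered as ON/OFF; everything else is passed through as a string.
    """
    argv = ["cmake", source_dir]
    for name, value in options.items():
        if isinstance(value, bool):
            value = "ON" if value else "OFF"
        argv.append(f"-D{name}={value}")
    return argv


options = {
    "BUILD_SHARED_LIBS": False,
    "CMAKE_INSTALL_PREFIX": "/opt/superlu_dist",  # assumed install location
    "CMAKE_C_COMPILER": "mpicc",                  # assumed MPI C wrapper
    "CMAKE_CXX_COMPILER": "mpicxx",               # assumed MPI C++ wrapper
    "CMAKE_C_FLAGS": "-O3 -DDEBUGlevel=1",        # assumed flag values
    "XSDK_ENABLE_Fortran": False,
}

command = cmake_configure_command("..", options)
# shlex.join quotes arguments that contain spaces, so the printed line
# can be pasted into a POSIX shell from a build/ subdirectory.
print(shlex.join(command))
```

Keeping the arguments as a list rather than a single string avoids quoting mistakes if the command is later passed to `subprocess.run`.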

## Installation option 2: Manual installation with makefile.
Before installing the package, please examine the three things dependent
Before installing the package, please examine the three things dependent
on your system setup:

### 2.1 Edit the make.inc include file.

This make include file is referenced inside each of the Makefiles
in the various subdirectories. As a result, there is no need to
in the various subdirectories. As a result, there is no need to
edit the Makefiles in the subdirectories. All information that is
machine specific has been defined in this include file.
machine specific has been defined in this include file.

Sample machine-specific make.inc are provided in the MAKE_INC/
directory for several platforms, such as Cray XT5, Linux, Mac-OS, and CUDA.
When you have selected the machine on which you wish to install
SuperLU_DIST, copy the appropriate sample include file
SuperLU_DIST, copy the appropriate sample include file
(if one is present) into make.inc.

For example, if you wish to run SuperLU_DIST on a Cray XT5, you can do
`cp MAKE_INC/make.xt5 make.inc`

For systems other than those listed above, some porting effort is needed
for parallel factorization routines. Please refer to the Users' Guide
for parallel factorization routines. Please refer to the Users' Guide
for detailed instructions on porting.

The following CPP definitions can be set in CFLAGS.
@@ -283,7 +283,7 @@ printing level to show solver's execution details. (default 0)
-DDEBUGlevel=[0,1,2,...]
diagnostic printing level for debugging purpose. (default 0)
```
```

### 2.2. The BLAS library.

@@ -299,7 +299,7 @@ the file make.inc:
BLASDEF = -DUSE_VENDOR_BLAS
BLASLIB = <BLAS library you wish to link with>
```
The CBLAS/ subdirectory contains the part of the C BLAS (single threaded)
The CBLAS/ subdirectory contains the part of the C BLAS (single threaded)
needed by SuperLU_DIST package. However, these codes are intended for use
only if there is no faster implementation of the BLAS already
available on your machine. In this case, you should go to the
@@ -312,7 +312,7 @@ top-level SuperLU_DIST/ directory and do the following:
to make the BLAS library from the routines in the
` CBLAS/ subdirectory.`

### 2.3. External libraries.
### 2.3. External libraries.

#### 2.3.1 Metis and ParMetis.

@@ -370,8 +370,8 @@ You can disable CombBLAS with the following line in SRC/superlu_dist_config.h:

In the header file SRC/superlu_FCnames.h, we use macros to determine how
C routines should be named so that they are callable by Fortran.
(Some vendor-supplied BLAS libraries do not have C interfaces. So the
re-naming is needed in order for the SuperLU BLAS calls (in C) to
(Some vendor-supplied BLAS libraries do not have C interfaces. So the
re-naming is needed in order for the SuperLU BLAS calls (in C) to
interface with the Fortran-style BLAS.)
The possible options for CDEFS are:
```
@@ -395,7 +395,7 @@ Add the CUDA library location in make.inc:
```
HAVE_CUDA=TRUE
INCS += -I<CUDA directory>/include
LIBS += -L<CUDA directory>/lib64 -lcublas -lcudart
LIBS += -L<CUDA directory>/lib64 -lcublas -lcudart
endif
```
A Makefile is provided in each subdirectory. The installation can be done
@@ -423,7 +423,7 @@ Please consult that file for detailed description of the meanings.
# Windows Usage
Prerequisites: CMake, Visual Studio, Microsoft HPC Pack
This has been tested with Visual Studio 2017, without Parmetis,
without Fortran, and with OpenMP disabled.
without Fortran, and with OpenMP disabled.

The cmake configuration line used was
```
@@ -456,7 +456,7 @@ If you wish to test:

# Reading sparse matrix files

The SRC/ directory contains the following routines to read different file
The SRC/ directory contains the following routines to read different file
formats; they all have a similar calling sequence.
```
$ ls -l dread*.c
@@ -471,73 +471,73 @@ dreadtriple_noheader.c : triplet, no header, which is also readable in Matlab
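
To make the headerless triplet layout concrete, the following Python sketch parses such a file and converts it to compressed-column arrays. It is an illustration, not a port of `dreadtriple_noheader.c`. It assumes one whitespace-separated `row col value` entry per line with 1-based indices, and it infers the matrix dimensions from the largest index seen. The function names `read_triplets` and `triplets_to_csc` are hypothetical, and duplicate entries are kept rather than summed.

```python
def read_triplets(lines, index_base=1):
    """Parse 'row col value' lines into 0-based (row, col, value) tuples.

    Returns (n_rows, n_cols, entries). Dimensions are inferred from the
    largest indices because the format carries no header.
    """
    entries = []
    n_rows = n_cols = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        i = int(fields[0]) - index_base
        j = int(fields[1]) - index_base
        entries.append((i, j, float(fields[2])))
        n_rows = max(n_rows, i + 1)
        n_cols = max(n_cols, j + 1)
    return n_rows, n_cols, entries


def triplets_to_csc(n_cols, entries):
    """Convert 0-based triplets to compressed-column arrays."""
    # Count the entries in each column, then prefix-sum into column pointers.
    colptr = [0] * (n_cols + 1)
    for _, j, _ in entries:
        colptr[j + 1] += 1
    for j in range(n_cols):
        colptr[j + 1] += colptr[j]
    rowind = [0] * len(entries)
    nzval = [0.0] * len(entries)
    next_slot = colptr[:-1]  # slicing copies, so colptr is not modified
    for i, j, v in entries:
        k = next_slot[j]
        rowind[k] = i
        nzval[k] = v
        next_slot[j] += 1
    return colptr, rowind, nzval


text = """1 1 4.0
2 1 2.0
1 2 1.0
2 2 3.0
"""
n_rows, n_cols, entries = read_triplets(text.splitlines())
colptr, rowind, nzval = triplets_to_csc(n_cols, entries)
print(n_rows, n_cols, colptr, rowind, nzval)
```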

**[1]** X.S. Li and J.W. Demmel, "SuperLU_DIST: A Scalable Distributed-Memory
Sparse Direct Solver for Unsymmetric Linear Systems", ACM Trans. on Math.
Software, Vol. 29, No. 2, June 2003, pp. 110-140.
Software, Vol. 29, No. 2, June 2003, pp. 110-140.
**[2]** L. Grigori, J. Demmel and X.S. Li, "Parallel Symbolic Factorization
for Sparse LU with Static Pivoting", SIAM J. Sci. Comp., Vol. 29, Issue 3,
1289-1314, 2007.
1289-1314, 2007.
**[3]** P. Sao, R. Vuduc and X.S. Li, "A distributed CPU-GPU sparse direct
solver", Proc. of EuroPar-2014 Parallel Processing, August 25-29, 2014.
Porto, Portugal.
Porto, Portugal.
**[4]** P. Sao, X.S. Li, R. Vuduc, “A Communication-Avoiding 3D Factorization
for Sparse Matrices”, Proc. of IPDPS, May 21–25, 2018, Vancouver.
for Sparse Matrices”, Proc. of IPDPS, May 21–25, 2018, Vancouver.
**[5]** P. Sao, R. Vuduc, X. Li, "Communication-avoiding 3D algorithm for
sparse LU factorization on heterogeneous systems", J. Parallel and
Distributed Computing (JPDC), September 2019.
Distributed Computing (JPDC), September 2019.
**[6]** Y. Liu, M. Jacquelin, P. Ghysels and X.S. Li, “Highly scalable
distributed-memory sparse triangular solution algorithms”, Proc. of
SIAM workshop on Combinatorial Scientific Computing, June 6-8, 2018,
Bergen, Norway.
Bergen, Norway.
**[7]** N. Ding, S. Williams, Y. Liu, X.S. Li, "Leveraging One-Sided
Communication for Sparse Triangular Solvers", Proc. of SIAM Conf. on
Parallel Processing for Scientific Computing. Feb. 12-15, 2020.
Parallel Processing for Scientific Computing. Feb. 12-15, 2020.
**[8]** A. Azad, A. Buluc, X.S. Li, X. Wang, and J. Langguth,
"A distributed-memory algorithm for computing a heavy-weight perfect matching
"A distributed-memory algorithm for computing a heavy-weight perfect matching
on bipartite graphs", SIAM J. Sci. Comput., Vol. 42, No. 4, pp. C143-C168, 2020.\
**[9]** N. Ding, Y. Liu, S. Williams, X.S. Li,
"A Message-Driven, Multi-GPU Parallel Sparse Triangular Solver”,
"A Message-Driven, Multi-GPU Parallel Sparse Triangular Solver”,
Proceedings of SIAM Proceedings of ACDA21 conference, 2021.\
**[10]** Y. Liu, N. Ding, P. Sao, S. Williams, X.S. Li,
**[10]** Y. Liu, N. Ding, P. Sao, S. Williams, X.S. Li,
"Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters", Proceedings of SC23, Nov. 2023 \
**[11]** X. Li, P. Lin, Y. Liu, P. Sao, “Newly Released Capabilities in Distributed-memory SuperLU Sparse Direct Solver”,
ACM Trans. Math. Software, Volume 49, No. 1, March 2023.
https://dl.acm.org/doi/10.1145/3577197 \
**[12]** W. Boukaram, Y. Hong Y, Y. Liu, T. Shi, X.S. Li.
"Batched sparse direct solver design and evaluation in SuperLU\_DIST".
International Journal of High Performance Computing Applications. 2024;38(6):585-598.
doi:10.1177/10943420241268200
doi:10.1177/10943420241268200


**Xiaoye S. Li**, Lawrence Berkeley National Lab
**Gustavo Chavez**, Lawrence Berkeley National Lab
**Jim Demmel**, UC Berkeley
**Nan Ding**, Lawrence Berkeley National Lab
**Xiaoye S. Li**, Lawrence Berkeley National Lab
**Gustavo Chavez**, Lawrence Berkeley National Lab
**Jim Demmel**, UC Berkeley
**Nan Ding**, Lawrence Berkeley National Lab
**John Gilbert**, UC Santa Barbara
**Laura Grigori**, INRIA, France
**Paul Lin**, Lawrence Berkeley National Lab
**Yang Liu**, Lawrence Berkeley National Lab
**Piyush Sao**, Georgia Institute of Technology
**Meiyue Shao**, Lawrence Berkeley National Lab
**Ichitaro Yamazaki**, Univ. of Tennessee
**Laura Grigori**, INRIA, France
**Paul Lin**, Lawrence Berkeley National Lab
**Yang Liu**, Lawrence Berkeley National Lab
**Piyush Sao**, Georgia Institute of Technology
**Meiyue Shao**, Lawrence Berkeley National Lab
**Ichitaro Yamazaki**, Univ. of Tennessee


# RELEASE VERSIONS
```
October 15, 2003   Version 2.0
October 1, 2007   Version 2.1
February 20, 2008 Version 2.2
October 15, 2008   Version 2.3
June 9, 2010 Version 2.4
November 23, 2010 Version 2.5
March 31, 2013 Version 3.3
October 1, 2014 Version 4.0
July 15, 2014 Version 4.1
September 25, 2015 Version 4.2
December 31, 2015 Version 4.3
April 8, 2016 Version 5.0.0
May 15, 2016 Version 5.1.0
October 4, 2016 Version 5.1.1
December 31, 2016 Version 5.1.3
September 30, 2017 Version 5.2.0
October 15, 2003   Version 2.0
October 1, 2007   Version 2.1
February 20, 2008 Version 2.2
October 15, 2008   Version 2.3
June 9, 2010 Version 2.4
November 23, 2010 Version 2.5
March 31, 2013 Version 3.3
October 1, 2014 Version 4.0
July 15, 2014 Version 4.1
September 25, 2015 Version 4.2
December 31, 2015 Version 4.3
April 8, 2016 Version 5.0.0
May 15, 2016 Version 5.1.0
October 4, 2016 Version 5.1.1
December 31, 2016 Version 5.1.3
September 30, 2017 Version 5.2.0
January 28, 2018 Version 5.3.0
June 1, 2018 Version 5.4.0
September 22, 2018 Version 6.0.0