pdgemm #2

Open
ViralBShah opened this issue Feb 17, 2015 · 4 comments

@ViralBShah
Member

Would it be possible to hook up pdgemm?

It would be nice to compare a Julia SUMMA implementation with the one in scalapack/elemental.

@andreasnoack andreasnoack self-assigned this Feb 19, 2015
@andreasnoack
Member

I can do that. It shouldn't be that hard. I have also just figured out how the redistribute functions in ScaLAPACK work, so it might also be possible to use this from DArrays and still get reasonable performance.

But I don't know what a SUMMA is.

@ViralBShah
Member Author

Thanks. SUMMA (Scalable Universal Matrix Multiplication Algorithm) is an outer-product formulation of matrix multiplication that is efficient in parallel.

http://www.netlib.org/lapack/lawnspdf/lawn96.pdf
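
To make the outer-product formulation concrete, here is a minimal serial sketch, assuming current Julia syntax rather than the version used elsewhere in this thread. The function name `summa_serial!` and the panel width `kb` are hypothetical and not part of ScaLAPACK or the wrapper. It only shows how the product is accumulated from panel updates; a real SUMMA would broadcast each panel along a process row or column instead of slicing local arrays.

```julia
# Serial sketch of the SUMMA idea: C accumulates rank-kb (outer-product) updates.
# Everything lives in one process here, purely for illustration.
# `summa_serial!` and `kb` are hypothetical names, not a library API.
function summa_serial!(C::Matrix{Float64}, A::Matrix{Float64}, B::Matrix{Float64}, kb::Int)
    m, k = size(A)
    k2, n = size(B)
    k == k2 || throw(DimensionMismatch("inner dimensions of A and B differ"))
    size(C) == (m, n) || throw(DimensionMismatch("C has the wrong size"))
    fill!(C, 0.0)
    for k0 in 1:kb:k
        ks = k0:min(k0 + kb - 1, k)   # current panel: columns of A, rows of B
        C .+= A[:, ks] * B[ks, :]     # rank-length(ks) update of C
    end
    return C
end

A = [1.0 2.0 3.0; 4.0 5.0 6.0]
B = [1.0 0.0; 0.0 1.0; 1.0 1.0]
C = summa_serial!(zeros(2, 2), A, B, 2)   # two panels: columns 1:2, then column 3
```

Each pass through the loop touches only one panel of `A` and one panel of `B`, which is what lets the distributed version overlap communication of the next panel with the local update.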

@andreasnoack
Member

I've pushed some wrapper code to the anj/gemm branch so you can try it out if you'd like. To try it with DArrays, you'd have to merge my anj/darray Julia branch first, because the DArrays have to be laid out in a certain way. After that, you don't need to pay much attention to how they are laid out, because the wrapper redistributes back and forth on the fly. Hence you can do:

julia> using MPI

julia> @everywhere using ScaLAPACK

julia> manager = MPIManager(np = 64)
MPI.MPIManager(64,`mpirun -np 64 --output-filename /tmp/user/1021/juliaUhh3oE`,"/tmp/user/1021/juliaUhh3oE",60,Dict{Int64,Int64}(),Dict{Int64,Int64}(),RemoteRef(1,1,7852),false)

julia> addprocs(manager);

julia> @everywhere using ScaLAPACK

julia> A = drandn(5000,5000);

julia> B = drandn(5000,5000);

julia> C = dzeros(5000,5000);

julia> @time ScaLAPACK.A_mul_B!(1.0, A, B, 0.0, C, 100, 100);
elapsed time: 3.871655318 seconds (8 MB allocated)

The last two arguments (100, 100) are the row and column sizes of the blocks in the block-cyclic distributions.
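
As a rough illustration of what those block sizes mean, the sketch below computes which process owns a given matrix entry under the conventional 2D block-cyclic mapping, with the first block placed on process (0, 0). The helper names are hypothetical and not part of the wrapper, and the 8x8 process grid is an assumption: the thread only says 64 processes were used, not how they were arranged. It is written in current Julia syntax.

```julia
# Sketch of the conventional 2D block-cyclic mapping (first block on process 0).
# All names here are hypothetical helpers, not part of the ScaLAPACK wrapper.

# Zero-based process coordinate owning one-based global index i when blocks
# of size nb are dealt out cyclically over nprocs processes.
blockcyclic_owner(i::Int, nb::Int, nprocs::Int) = mod(div(i - 1, nb), nprocs)

# Process-grid coordinates (row, column) that own global entry (i, j).
function owner(i::Int, j::Int, mb::Int, nb::Int, prows::Int, pcols::Int)
    return (blockcyclic_owner(i, mb, prows), blockcyclic_owner(j, nb, pcols))
end

# Assumed 8x8 grid for the 64 processes, with 100x100 blocks as in the call above.
first_block = owner(1, 1, 100, 100, 8, 8)       # very first block
inner_block = owner(101, 250, 100, 100, 8, 8)   # second row block, third column block
wrapped     = owner(801, 5000, 100, 100, 8, 8)  # ninth row block wraps to grid row 0
```

Smaller blocks spread the work more evenly across the grid, while larger blocks give each process bigger local matrix products, so the block size is a tuning parameter rather than a fixed choice.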

@ViralBShah
Member Author

Cc: @amitmurthy
