32nd International Symposium on Lattice Field Theory (Lattice 2014)

Name: 32nd International Symposium on Lattice Field Theory (Lattice 2014)
Start: 2014-06-23T08:00:00-04:00
End: 2014-06-28T18:00:00-04:00
Location: Columbia University

Achieving strong scaling in many-GPU calculations in lattice QCD

Date: 23 Jun 2014, 17:30
Duration: 20m
Room: 415 Schapiro
Type: Talk
Track: Algorithms and Machines
Speaker: Dr Justin Foley (Microway and Nvidia)

We describe recent additions to the QUDA software library that are aimed at extending strong scaling in multi-GPU lattice calculations. These include the addition of CPU-thread support in order to increase concurrency and improve the overlap of computation and communication in Krylov solver routines, as well as the modifications needed to enable the GPUDirect RDMA feature recently introduced by NVIDIA and Mellanox. However, we focus in particular on the implementation and performance of so-called S-step variants of common Krylov solvers on current NVIDIA hardware. The S-step formulations are designed to reduce the number of global synchronizations associated with the calculation of vector inner products. These formulations may, when combined with communication-reducing methods such as additive Schwarz preconditioning, form the basis for a set of optimal Krylov solvers for many-GPU calculations.
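The synchronization saving that motivates s-step formulations can be illustrated with a toy model. The sketch below is not QUDA code and does not implement a full s-step solver; it only compares computing the inner products among s distributed vectors one reduction at a time against packing them into a single global reduction. The `Comm` class, its `allreduce_sum` method, and the data layout (`vectors[i][r]` is the slice of vector `i` held by rank `r`) are hypothetical stand-ins chosen for illustration.

```python
class Comm:
    """Hypothetical stand-in for a communicator; counts global reductions."""

    def __init__(self):
        self.reductions = 0

    def allreduce_sum(self, per_rank_values):
        # per_rank_values holds one equally sized list of partial sums per rank.
        # Each call models one global synchronization point.
        self.reductions += 1
        return [sum(column) for column in zip(*per_rank_values)]


def local_dot(x, y):
    """Inner product of two local slices (no communication)."""
    return sum(a * b for a, b in zip(x, y))


def gram_one_by_one(comm, vectors):
    """Gram matrix with one global reduction per inner product."""
    s, n_ranks = len(vectors), len(vectors[0])
    gram = [[0] * s for _ in range(s)]
    for i in range(s):
        for j in range(i, s):
            partial = [[local_dot(vectors[i][r], vectors[j][r])]
                       for r in range(n_ranks)]
            gram[i][j] = gram[j][i] = comm.allreduce_sum(partial)[0]
    return gram


def gram_batched(comm, vectors):
    """Gram matrix with all inner products packed into one global reduction."""
    s, n_ranks = len(vectors), len(vectors[0])
    pairs = [(i, j) for i in range(s) for j in range(i, s)]
    partial = [[local_dot(vectors[i][r], vectors[j][r]) for i, j in pairs]
               for r in range(n_ranks)]
    totals = comm.allreduce_sum(partial)
    gram = [[0] * s for _ in range(s)]
    for (i, j), value in zip(pairs, totals):
        gram[i][j] = gram[j][i] = value
    return gram


# Three vectors of length 4, split across two ranks (illustrative data only).
vectors = [
    [[1, 2], [3, 4]],
    [[0, 1], [1, 0]],
    [[2, 0], [0, 2]],
]

comm_a, comm_b = Comm(), Comm()
gram_a = gram_one_by_one(comm_a, vectors)
gram_b = gram_batched(comm_b, vectors)

print(gram_a == gram_b)                      # same numerical result
print(comm_a.reductions, comm_b.reductions)  # reduction counts for each variant
```

For s vectors the one-by-one variant issues s(s+1)/2 reductions while the batched variant issues one, at the cost of a larger message. An actual s-step solver would additionally have to generate the s basis vectors with repeated matrix-vector products and manage the numerical stability of that basis, neither of which is modeled here.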

Author: Dr Justin Foley (Microway and Nvidia)
Co-author: Dr Mike Clark (Nvidia)
Slides: jfoley_talk.pdf
