DiamondTorre Algorithm for High-Performance Wave Modeling
Vadim Levchenko, Anastasia Perepelkina, Andrey Zakirov
Efficient algorithms for the numerical modeling of physical media are
discussed. With traditional algorithms, the computation rate for such
problems is limited by memory bandwidth. The
numerical solution of the wave equation is considered. A finite difference
scheme with a cross stencil and a high order of approximation is used. The
DiamondTorre algorithm is constructed with regard for the specifics
of the GPGPU (general-purpose graphics processing unit) memory
hierarchy and parallelism. The advantages of such algorithms are a high
level of data localization and the property of asynchrony,
which make it possible to utilize all levels of GPGPU
parallelism effectively. The computational intensity of the algorithm is greater
than that of the best traditional algorithms with stepwise
synchronization. As a consequence, it becomes possible to overcome the
above-mentioned limitation. The algorithm is implemented with
CUDA. For the scheme with the second order of approximation, a calculation
performance of 50 billion cells per second is achieved, which exceeds
the result of the best traditional algorithm by a factor of 5.
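
For orientation, the traditional stepwise approach referred to above can be sketched as follows. This is a minimal illustration and not the DiamondTorre algorithm or the authors' CUDA code. It assumes a one-dimensional grid, a second-order three-point cross stencil, and zero boundary values; the names wave_step, run, and courant2 are hypothetical. Each time step sweeps the whole array before the next step starts, which is why such a loop tends to be limited by memory bandwidth on large grids.

```python
def wave_step(prev, curr, courant2):
    """Advance the 1D wave equation by one time step (leapfrog scheme).

    prev, curr: lists of field values at time levels n-1 and n.
    courant2:   (c * dt / dx) ** 2, assumed to be at most 1 for stability.
    Boundary cells are held at zero (an assumption made for this sketch).
    """
    n = len(curr)
    nxt = [0.0] * n
    for i in range(1, n - 1):
        # Three-point cross stencil: second-order approximation of u_xx.
        laplacian = curr[i - 1] - 2.0 * curr[i] + curr[i + 1]
        nxt[i] = 2.0 * curr[i] - prev[i] + courant2 * laplacian
    return nxt


def run(n_cells=64, n_steps=32, courant2=0.25):
    """Stepwise time loop: every step reads and writes the entire grid."""
    curr = [0.0] * n_cells
    curr[n_cells // 2] = 1.0   # initial pulse in the middle of the grid
    prev = list(curr)          # zero initial velocity (illustrative choice)
    for _ in range(n_steps):
        prev, curr = curr, wave_step(prev, curr, courant2)
    return curr
```

In this sketch every cell value is loaded from memory once per time step and used for only a few arithmetic operations. Algorithms of the kind discussed in the abstract aim to raise that ratio by advancing a localized portion of the grid through several time steps while the data are still in fast memory.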