"An accelerated topology change algorithm for the contour advection method"
Bogdanov P.B., Efremov A.A., Gorobets A.V., Sukov S.A.

An optimization of communications is proposed for multi-level parallelization based on a combination of MPI, OpenMP and OpenCL. The approach targets a wide range of modern supercomputer architectures, including hybrid systems with GPU and Intel Xeon Phi accelerators. A general-purpose scheduler that simplifies heterogeneous implementations is discussed. The scheduler controls queues of OpenCL computing and communication tasks and takes the interdependency graph of the target computing algorithm as input. Its use is demonstrated on a finite-volume CFD algorithm for unstructured meshes, where it simplifies the implementation of an overlapped communication scheme. The implementation of the CFD algorithm, in which MPI and CPU-GPU communications are overlapped with computations, is described and its parallel efficiency is demonstrated.
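The idea of driving task queues from an interdependency graph can be illustrated with a minimal sketch. It is not the authors' scheduler: the function `schedule`, the task names and the rule "issue communication tasks first" are assumptions made for illustration. The code only computes a serial issue order, whereas a real implementation would enqueue the tasks into OpenCL command queues and synchronize them with events.

```python
from collections import deque


def schedule(tasks, deps):
    """Return an issue order that respects the dependency graph.

    tasks: dict mapping a task name to its queue kind, "compute" or "comm".
    deps:  dict mapping a task name to the set of tasks it must wait for.
    (Both structures are hypothetical, chosen for this sketch.)
    """
    remaining = {t: set(deps.get(t, ())) for t in tasks}
    dependents = {t: [] for t in tasks}
    for t, prerequisites in remaining.items():
        for p in prerequisites:
            dependents[p].append(t)

    # One queue of ready tasks per kind, mimicking separate task queues.
    ready = {"compute": deque(), "comm": deque()}
    for t in tasks:
        if not remaining[t]:
            ready[tasks[t]].append(t)

    order = []
    while ready["compute"] or ready["comm"]:
        # Assumed policy: start transfers as early as possible so that
        # they can overlap with the computations issued afterwards.
        kind = "comm" if ready["comm"] else "compute"
        t = ready[kind].popleft()
        order.append((t, kind))
        for d in dependents[t]:
            remaining[d].discard(t)
            if not remaining[d]:
                ready[tasks[d]].append(d)

    if len(order) != len(tasks):
        raise ValueError("dependency graph contains a cycle")
    return order


# Hypothetical time step of a mesh-based solver with a halo exchange.
tasks = {
    "interface_cells": "compute",
    "inner_cells": "compute",
    "halo_exchange": "comm",
    "update": "compute",
}
deps = {
    "halo_exchange": {"interface_cells"},
    "update": {"halo_exchange", "inner_cells"},
}
order = schedule(tasks, deps)
```

In this example the halo exchange is issued as soon as the interface cells are processed, before the inner cells, which is the ordering an overlapped communication scheme relies on.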

Keywords: gas dynamics, scheduler, parallel computing, GPU, OpenCL, MPI

Bogdanov P.B., e-mail: bogdanov@niisi.msk.ru; Efremov A.A., e-mail: antonyef@mail.ru; Scientific Research Institute for System Studies, Russian Academy of Sciences; prospect Nakhimovskii 36-1, Moscow, 117218, Russia
Gorobets A.V., e-mail: cherepock@gmail.com; Sukov S.A., e-mail: ssoukov@gmail.com; Keldysh Institute for Applied Mathematics, Russian Academy of Sciences; Miusskaya ploshchad 4a, Moscow, 125047, Russia