"Workload balancing in GPU implementation of breadth-first search"
Chernoskutov M.A., Ermakov D.G.

Parallel processing of unstructured data given in a graph-like form can be a severe computational challenge because of the significant overheads caused by the irregular nature of graph algorithms and the hardware latency of intensive data access. This paper considers a GPU implementation of a load balancing method that dramatically improves the performance of the parallel breadth-first search algorithm compared to its sequential counterpart on a CPU. This work was partially supported by the Russian Foundation for Basic Research (project 14–07–00435) and by UB RAS (projects 12–P–1–1029 and RCP–13–P18). Numerical experiments were performed on the "Uran" supercomputer installed at IMM UB RAS. This paper was recommended for publication by the Program Committee of the International Scientific Conference "Scientific Services and Internet: all bounds of parallelism" (http://agora.guru.ru/abrau2013).

Keywords: breadth-first search, parallel algorithm, graphics processing units

Chernoskutov M.A., e-mail: mach@imm.uran.ru; Ermakov D.G., e-mail: ermak@imm.uran.ru – Krasovskii Institute of Mathematics and Mechanics, Ural Branch of the Russian Academy of Sciences, ulitsa Kovalevskoi 16, Ekaterinburg, 620219, Russia