An efficient implementation of the lattice Boltzmann method for hybrid supercomputers

Authors

  • D.A. Bikulov, Lomonosov Moscow State University

DOI:

https://doi.org/10.26089/NumMet.v16r221

Keywords:

high-performance computing, graphics processing unit, lattice Boltzmann method, CUDA, multi-GPU, scalability

Abstract

Several features of an efficient implementation of the lattice Boltzmann method (LBM) for hybrid supercomputers with many graphics processing units (GPUs) are discussed. The main strategies for reducing the memory required by the LBM are described. The dependence of the solver's performance on the number of GPUs in use is analyzed on the Lomonosov supercomputer installed at Moscow State University.

Published

24-04-2015

How to Cite

Bikulov D. An Efficient Implementation of the Lattice Boltzmann Method for Hybrid Supercomputers // Numerical Methods and Programming (Vychislitel’nye Metody i Programmirovanie). 2015. 16. 205-214. doi 10.26089/NumMet.v16r221

Section

Section 1. Numerical methods and applications