Science Cast

cuRAMSES: Scalable AMR Optimizations for Large-Scale Cosmological Simulations

librarianApril 8, 2026 3:13am

Views (11)
Comments (0)

Export Citation

Voice is AI-generated

Connected to paperThis paper is a preprint and has not been certified by peer review

cuRAMSES: Scalable AMR Optimizations for Large-Scale Cosmological Simulations

arXivPDFApril 7, 2026 12:00am

Authors

Juhan Kim

Abstract

We present cuRAMSES, a suite of advanced domain decomposition strategies and algorithmic optimizations for the ramses adaptive mesh refinement (AMR) code, designed to overcome the communication, memory, and solver bottlenecks inherent in massive cosmological simulations. The central innovation is a recursive k-section domain decomposition that replaces the traditional Hilbert curve ordering with a hierarchical spatial partitioning. This approach substitutes global all-to-all communications with neighbour-only point-to-point communications. By maintaining a constant number of communication partners regardless of the total rank count, it significantly improves strong scaling at high concurrency. To address critical memory constraints at scale, we introduce a Morton-key hash table for octree-neighbour lookup alongside on-demand array allocation, drastically reducing the per-rank memory footprint. Furthermore, a novel spatial hash-binning algorithm in box-type local domains accelerates supernova and AGN feedback routines by over two orders of magnitude (an about 260 times speedup). For hybrid architectures, an automatic CPU/GPU dispatch model with GPU-resident mesh data is implemented and benchmarked. The multigrid Poisson solver achieves a 1.7 times GPU speedup on H100 and A100 GPUs, although the Godunov solver is currently PCIe-bandwidth-limited. The net improvement is about 20 per cent on current PCIe-connected hardware, and a performance model predicts about 2 times on tightly coupled architectures such as the NVIDIA GH200. Additionally, a variable-Nrank restart capability enables flexible I/O workflows. Extensive diagnostics verify that all modifications preserve mass, momentum, and energy conservation, matching the reference Hilbert-ordering run to within 0.5 per cent in the total energy diagnostic.

TwitterandLinkedIn

0 comments

Add comment

cuRAMSES: Scalable AMR Optimizations for Large-Scale Cosmological Simulations

cuRAMSES: Scalable AMR Optimizations for Large-Scale Cosmological Simulations

AI-powered Paper ChatBeta

AI-powered Paper ChatBeta

0 comments