Computer Science ETDs

Publication Date

Summer 7-29-2025

Abstract

Boundary exchanges dominate the cost of both stencil codes and codes that rely on sparse matrix operations. The performance of large boundary exchanges is limited by synchronization overheads and injection bandwidth. Irregular boundary exchanges incur additional overheads due to the large number of required messages. This thesis investigates multiple methods for improving the performance and scalability of both Cartesian and irregular boundary exchanges. Since boundary exchanges are typically performed iteratively, persistent communication presents an opportunity for optimization by sharing and amortizing setup costs. Partitioned communication is also explored to increase asynchrony, reducing bottlenecks from synchronization overheads and data congestion. For irregular applications whose performance is limited by large numbers of messages, aggregation can avoid high-latency messages, and neighborhood collectives can provide these optimizations portably. Finally, increasing asynchrony in large irregular boundary exchanges can alleviate synchronization bottlenecks that are amplified by load imbalance. Synchronization is reduced with partitioned communication and an alternate compressed sparse column (CSC) matrix format, enabling increased overlap of communication and computation for better system utilization. For regular halo exchanges, measured timings show that persistent MPI communication can provide a speedup of up to 37% over baseline MPI communication, and partitioned MPI communication can provide a speedup of up to 68%. Additionally, results from hypre BoomerAMG show up to a 38% speedup on sparse matrix-vector multiplication using aggregating neighborhood collectives in linear solvers. Lastly, benchmark tests using partitioned MPI and the CSC matrix format demonstrate improvements of up to 190% in sparse matrix-dense matrix multiplication on SuiteSparse matrices.

Language

English

Keywords

HPC, MPI, Boundary Exchanges, Irregular Communication, Sparse Matrix Operations

Document Type

Dissertation

Degree Name

Computer Science

Level of Degree

Doctoral

Department Name

Department of Computer Science

First Committee Member (Chair)

Amanda Bienz

Second Committee Member

Patrick Bridges

Third Committee Member

Anthony Skjellum

Fourth Committee Member

Rui Peng Li
