Computer Science ETDs
Publication Date
Summer 6-17-2025
Abstract
High Performance Computing (HPC) applications increasingly rely on both process-level and thread-level parallelism to maximize performance across complex, multi-node systems. However, conventional bulk synchronous communication strategies often leave both compute and network resources underutilized due to synchronization delays. This dissertation systematically evaluates the potential of fine-grained, threaded inter-node communication as a strategy for reducing these inefficiencies. To this end, I design and develop two tools: the MiniMod modular application framework and the Configurable Messaging Benchmark (CMB), which together enable empirical, reproducible assessment of communication performance across varying application behaviors, threading models, and communication granularities. Through experiments across multiple systems and workloads, I analyze thread arrival distributions, quantify reclaimable compute time, and assess how early, asynchronous communication can overlap with computation to improve efficiency. My results demonstrate that performance benefits depend heavily on application structure, threading variability, and middleware design. This work establishes concrete criteria under which threaded fine-grained communication is advantageous, guiding future co-design of HPC applications and communication libraries.
Language
English
Document Type
Thesis
Degree Name
Computer Science
Level of Degree
Doctoral
Department Name
Department of Computer Science
First Committee Member (Chair)
Patrick G. Bridges
Second Committee Member
Amanda Bienz
Third Committee Member
Ryan Grant
Fourth Committee Member
Tony Skjellum
Recommended Citation
Marts, William Pepper. "A SYSTEMATIC EVALUATION OF THREADED INTERNODE COMMUNICATION IN HPC." (2025). https://digitalrepository.unm.edu/cs_etds/137