Computer Science ETDs

Publication Date

2021

Abstract

Understanding the performance of parallel and distributed programs remains central to determining how compute systems can be optimized to achieve exascale performance. Lightweight statistical models allow developers to both characterize and predict performance trade-offs, especially as HPC systems become more heterogeneous with many-core CPUs and GPUs. This thesis presents a lightweight statistical approach to modeling performance variation that leverages extreme value theory by focusing on the maximum length of distributed workload intervals. This approach was implemented in MPI and evaluated on several HPC systems and workloads. I then present a performance model of partitioned communication that also uses an expected-maximum-value method. This performance model was validated against benchmark results from HPC systems. Together, these lightweight statistical models provide insight into the behavior of HPC applications and systems and allow developers to predict performance impacts as HPC systems evolve toward exascale.

Language

English

Keywords

Exascale, HPC, Performance Model, Performance Variability, Partitioned Communication, Extreme Value Theory, Statistics, MPI

Document Type

Thesis

Degree Name

Computer Science

Level of Degree

Masters

Department Name

Department of Computer Science

First Committee Member (Chair)

Patrick G. Bridges

Second Committee Member

Trilce Estrada

Third Committee Member

Amanda Bienz
