Computer Science ETDs

Publication Date

Spring 5-1-2017

Abstract

Networks are the backbone of modern HPC systems. They serve as a critical piece of infrastructure, tying together applications, analytics, storage, and visualization. Despite this importance, we have not fully explored how evolving communication paradigms and network design will impact scientific workloads. As networks expand in the race towards Exascale (1×10^18 floating-point operations per second), we need to reexamine this relationship so that the HPC community better understands (1) characteristics and trends in HPC communication; (2) how best to design HPC networks to save power or enhance performance; and (3) how to facilitate scalable, informed, and dynamic decisions within the network. My thesis is that one can improve application performance and system power usage by gaining a detailed understanding of HPC communication on both the network endpoints and the fabric; specifically, I address the problem of network-induced memory contention, quantify the power/performance tradeoffs for dragonfly topologies in HPC networks, and increase the scalability and responsiveness of large-scale network monitoring. This dissertation highlights opportunities for improving network performance and power efficiency, while uncovering pitfalls and mitigation strategies brought about by shifting trends in HPC communication and fabric design. We begin by examining the communication characteristics of the network endpoints. We show how one-sided communication techniques can lead to contention in the memory subsystem (with 3X increases in runtime) and how this can be avoided. Then, we move on to a macro-level study of the network fabric, where we demonstrate the tradeoffs between power and performance when designing an HPC network topology. Lastly, in order to facilitate dynamic and responsive solutions, we provide new methods for scalable network monitoring and improved models of data aggregation.

Language

English

Keywords

HPC, Network Congestion, Exascale, Simulation, NiMC, SAI

Document Type

Dissertation

Degree Name

Computer Science

Level of Degree

Doctoral

Department Name

Department of Computer Science

First Committee Member (Chair)

Dorian Arnold

Second Committee Member

Dave Ackley

Third Committee Member

Patrick Bridges

Fourth Committee Member

Wennie Shu
