This thesis presents a hardware architecture that performs matrix multiplication on a systolic array to reduce time complexity and power consumption. The proposed architecture, the Neural Network Accelerator (NNA), was designed in Verilog HDL and uses 8-bit multiplication to reduce the resources required to implement it on low-power FPGAs. The NNA’s open architecture is designed to support radiation testing of fault-tolerant designs targeting space applications. Because the details of commercial hardware architectures are not public, we built our own matrix multiplication architecture so that we could later study its feasibility for space applications.
The NNA was compared against two matrix multiplication architectures developed for FPGAs (SARS and DAE) and one developed for an ASIC flow (EES). The SARS architecture, which also uses a systolic array design, achieved a maximum operating frequency of 210.2 MHz on a Spartan-3E FPGA, whereas the NNA achieved 225 MHz on a Kintex-7 FPGA. The DAE architecture required 3,681 clock cycles for matrix multiplication, whereas the NNA required 449 clock cycles. The NNA includes several unique features: an 8-bit Instruction Set Architecture (ISA) that controls ALU operations and data flow through each node of the systolic array, a neural network activation function (ReLU) module, 16-bit to 8-bit scaling of ALU results, and a maximum systolic array size of 255 x 255.
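The data path described above (8-bit operands, 16-bit ALU results, ReLU, and scaling back to 8 bits) can be illustrated with a small software model. The sketch below is not the NNA's Verilog implementation. The abstract does not state the dataflow, the overflow behavior, or the scaling method, so the output-stationary skewed schedule, the 16-bit wraparound, and the right-shift-and-saturate scaling are all assumptions chosen for illustration. The function names are hypothetical as well.

```python
def wrap16(x):
    """Wrap an integer to signed 16-bit two's complement (assumed overflow behavior)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def relu(x):
    """ReLU activation: negative values become zero."""
    return x if x > 0 else 0


def scale_16_to_8(x, shift=8):
    """Scale a 16-bit result to signed 8-bit.

    Assumption: arithmetic right shift followed by saturation to [-128, 127].
    The thesis abstract does not specify the actual scaling scheme.
    """
    return max(-128, min(127, x >> shift))


def systolic_matmul(a, b):
    """Model an output-stationary systolic array computing a (n x k) times b (k x m).

    Node (i, j) holds one accumulator. Inputs are skewed so that node (i, j)
    sees a[i][t] and b[t][j] on cycle t + i + j (assumed schedule).
    Returns the accumulator grid and the number of cycles simulated.
    """
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]
    cycles = n + m + k - 2  # last node finishes on cycle index n + m + k - 3
    for c in range(cycles):
        for i in range(n):
            for j in range(m):
                t = c - i - j  # index of the operand pair reaching node (i, j)
                if 0 <= t < k:
                    acc[i][j] = wrap16(acc[i][j] + a[i][t] * b[t][j])
    return acc, cycles


def postprocess(acc, shift=8):
    """Apply ReLU, then 16-bit to 8-bit scaling, to every accumulator."""
    return [[scale_16_to_8(relu(v), shift) for v in row] for row in acc]


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    acc, cycles = systolic_matmul(a, b)
    print(acc, cycles)
    print(postprocess(acc, shift=2))
```

The cycle count in this model follows only from the assumed skewed schedule, so it should not be compared with the clock cycle figures reported for the NNA or DAE.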
Matrix multiplication, systolic array, FPGA, accelerator
Level of Degree
Electrical and Computer Engineering
First Committee Member (Chair)
Dr. Marios Pattichis
Second Committee Member
Dr. Xiang Sun
Third Committee Member
Dr. Alonzo Vera
Love, Jeffrey. "A Reconfigurable Architecture for Matrix Multiplication for Low Power Applications." (2023). https://digitalrepository.unm.edu/ece_etds/580