Electrical and Computer Engineering ETDs

Publication Date

Spring 5-1-2023

Abstract

This thesis presents a hardware architecture that performs matrix multiplication on a systolic array to reduce time complexity and power consumption. The proposed architecture, the Neural Network Accelerator (NNA), was designed in Verilog HDL and uses 8-bit multiplication to reduce the resources required to implement it on low-power FPGAs. The NNA’s open architecture is designed to support radiation testing of fault-tolerant designs targeting space applications. Because the details of commercial hardware architectures are not public, we built our own matrix multiplication architecture so that its feasibility for space applications can be studied later.

The NNA was compared against two matrix multiplication architectures developed for FPGAs (SARS and DAE) and one developed for an ASIC flow (EES). The SARS architecture, which uses a systolic array design, achieved a maximum operating frequency of 210.2 MHz on a Spartan-3E FPGA, whereas the NNA achieved 225 MHz on a Kintex-7 FPGA. The DAE architecture required 3,681 clock cycles for matrix multiplication, whereas the NNA required 449. The NNA includes several unique features: an 8-bit Instruction Set Architecture (ISA) that controls ALU operations and data flow through each node of the systolic array, a neural network activation function (ReLU) module, 16-bit to 8-bit scaling of ALU results, and a maximum systolic array size of 255 x 255.
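
For readers unfamiliar with systolic arrays, the following Python sketch may help illustrate the general idea behind the features listed above. It is not the NNA's Verilog design. It assumes an output-stationary dataflow (each node keeps its own running sum while operands move right and down), and it assumes that the 16-bit to 8-bit scaling is an arithmetic right shift followed by saturation. The abstract specifies neither choice, and the names `systolic_matmul`, `relu`, and `scale_to_int8` are hypothetical. The cycle count it reports applies only to this idealized model and is not comparable to the cycle counts quoted above.

```python
def systolic_matmul(a, b):
    """Simulate an idealized output-stationary systolic array computing a @ b.

    a is an n x k matrix and b is a k x m matrix (lists of lists of ints).
    Node (i, j) accumulates c[i][j]. Returns (c, cycles).
    Assumption: this dataflow is illustrative, not taken from the thesis.
    """
    n, k, m = len(a), len(a[0]), len(b[0])
    acc = [[0] * m for _ in range(n)]    # one accumulator per node
    a_reg = [[0] * m for _ in range(n)]  # operands travelling left to right
    b_reg = [[0] * m for _ in range(n)]  # operands travelling top to bottom
    cycles = n + m + k - 2               # last node finishes at t = n + m + k - 3

    for t in range(cycles):
        # Visit nodes from bottom-right to top-left so that each node reads
        # the value its left/upper neighbour held in the previous cycle.
        for i in reversed(range(n)):
            for j in reversed(range(m)):
                if j == 0:
                    s = t - i            # row i is fed with a delay of i cycles
                    a_in = a[i][s] if 0 <= s < k else 0
                else:
                    a_in = a_reg[i][j - 1]
                if i == 0:
                    s = t - j            # column j is fed with a delay of j cycles
                    b_in = b[s][j] if 0 <= s < k else 0
                else:
                    b_in = b_reg[i - 1][j]
                acc[i][j] += a_in * b_in  # multiply-accumulate in the node
                a_reg[i][j] = a_in        # forward operand to the right
                b_reg[i][j] = b_in        # forward operand downward
    return acc, cycles


def relu(x):
    """ReLU activation: negative values become zero."""
    return x if x > 0 else 0


def scale_to_int8(x, shift=8):
    """Assumed scaling: arithmetic right shift, then saturate to signed 8 bits."""
    return max(-128, min(127, x >> shift))
```

Calling `systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns the product `[[19, 22], [43, 50]]` together with a cycle count of 4 for this model. Applying `relu` and then `scale_to_int8` to each accumulator mimics, under the stated assumptions, the post-processing steps named in the abstract. Python integers are unbounded, so the sketch does not model the width of the hardware accumulators.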

Keywords

Matrix multiplication, systolic array, FPGA, accelerator

Document Type




Degree Name

Computer Engineering

Level of Degree


Department Name

Electrical and Computer Engineering

First Committee Member (Chair)

Dr. Marios Pattichis

Second Committee Member

Dr. Xiang Sun

Third Committee Member

Dr. Alonzo Vera