A Memory-Efficient Edge Inference Accelerator with XOR-based Model Compression

Hyunseung Lee, Jihoon Hong, Soosung Kim, Seung Yul Lee, Jae W. Lee

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Model compression is widely adopted for edge inference of neural networks (NNs) to minimize both costly DRAM accesses and memory footprints. Recently, XOR-based model compression has demonstrated promising results in maximizing compression ratio while minimizing accuracy drop. However, XOR-based decompression alone produces bit errors and requires auxiliary data for error correction. To minimize model size and hence DRAM traffic, we propose an enhanced decompression algorithm and a low-cost hardware accelerator for it. Since not all errors are equal, our algorithm selects only important errors to correct, with no accuracy drop. Compared with the baseline XOR compression scheme that corrects all errors, the compressed model size of ResNet-18 and VGG-16 is reduced by 23% and 27%, respectively. We also present a low-cost hardware implementation of on-line XOR decompression and error-correction logic built on Gemmini, an open-source systolic array accelerator, at the cost of only a 0.39% and 0.46% increase in area and power, respectively.
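
The idea of decompressing with an XOR network and then correcting only some of the resulting bit errors can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names, the tiny XOR matrix, and the importance rule (treating the leading bits of each weight as the important ones) are all assumptions made for illustration, since the abstract does not state how important errors are selected.

```python
from typing import List


def xor_decompress(seed: List[int], xor_rows: List[List[int]]) -> List[int]:
    """Expand a short seed into a longer bit vector.

    Each output bit is the XOR of the seed bits selected by one row of a
    fixed binary matrix (hypothetical layout, for illustration only).
    """
    out = []
    for row in xor_rows:
        bit = 0
        for s, r in zip(seed, row):
            bit ^= s & r
        out.append(bit)
    return out


def find_errors(decoded: List[int], target: List[int]) -> List[int]:
    """Positions where the decompressed bits differ from the original bits."""
    return [i for i, (d, t) in enumerate(zip(decoded, target)) if d != t]


def select_important(errors: List[int], bits_per_weight: int, top_bits: int) -> List[int]:
    """Keep only errors that fall in the first `top_bits` bits of each weight.

    Assumption: bit offset 0 within a weight is its most significant bit,
    and significance stands in for "importance".
    """
    return [e for e in errors if e % bits_per_weight < top_bits]


def apply_patches(decoded: List[int], patches: List[int]) -> List[int]:
    """Flip the bits at the stored patch positions."""
    corrected = list(decoded)
    for p in patches:
        corrected[p] ^= 1
    return corrected


# Two 2-bit weights, decompressed from a 2-bit seed.
xor_rows = [[1, 0], [0, 1], [1, 1], [1, 0]]
seed = [1, 1]
target = [0, 1, 0, 0]

decoded = xor_decompress(seed, xor_rows)
all_errors = find_errors(decoded, target)
kept = select_important(all_errors, bits_per_weight=2, top_bits=1)
approx = apply_patches(decoded, kept)
```

In this toy setting, storing only the `kept` positions instead of `all_errors` shrinks the auxiliary correction data, while the uncorrected low-order bit leaves a small residual error in the second weight.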

Original language: English
Title of host publication: 2023 60th ACM/IEEE Design Automation Conference, DAC 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350323481
State: Published - 2023
Event: 60th ACM/IEEE Design Automation Conference, DAC 2023 - San Francisco, United States
Duration: 9 Jul 2023 to 13 Jul 2023

Publication series

Name: Proceedings - Design Automation Conference
Volume: 2023-July
ISSN (Print): 0738-100X

Conference

Conference: 60th ACM/IEEE Design Automation Conference, DAC 2023
Country/Territory: United States
City: San Francisco
Period: 9/07/23 to 13/07/23
