Effective zero compression on ReRAM-based sparse DNN accelerators

Hoon Shin, Rihae Park, Seung Yul Lee, Yeonhong Park, Hyunseung Lee, Jae W. Lee

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

8 Scopus citations

Abstract

For efficient DNN inference Resistive RAM (ReRAM) crossbars have emerged as a promising building block to compute matrix multiplication in an area- and power-efficient manner. To improve inference throughput sparse models can be deployed on the ReRAM-based DNN accelerator. While unstructured pruning maintains both high accuracy and high sparsity, it performs poorly on the crossbar architecture due to the irregular locations of pruned weights. Meanwhile, due to the non-ideality of ReRAM cells and the high cost of ADCs, matrix multiplication is usually performed at a fine granularity, called Operation Unit (OU), along both wordline and bitline dimensions. While fine-grained, OU- based row compression (ORC) has recently been proposed to increase weight compression ratio, significant performance potentials are still left on the table due to sub-optimal weight mappings. Thus, we propose a novel weight mapping scheme that effectively clusters zero weights via OU-level filter reordering, hence improving the effective weight compression ratio. We also introduce a weight recovery scheme to further improve accuracy or compression ratio, or both. Our evaluation with three popular DNNs demonstrates that the proposed scheme effectively eliminates redundant weights in the crossbar array and hence ineffectual computation to achieve 3.27 - 4.26× of array compression ratio with negligible accuracy loss over the baseline ReRAM-based DNN accelerator.

Original languageEnglish
Title of host publicationProceedings of the 59th ACM/IEEE Design Automation Conference, DAC 2022
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages949-954
Number of pages6
ISBN (Electronic)9781450391429
DOIs
StatePublished - 10 Jul 2022
Event59th ACM/IEEE Design Automation Conference, DAC 2022 - San Francisco, United States
Duration: 10 Jul 202214 Jul 2022

Publication series

NameProceedings - Design Automation Conference
ISSN (Print)0738-100X

Conference

Conference59th ACM/IEEE Design Automation Conference, DAC 2022
Country/TerritoryUnited States
CitySan Francisco
Period10/07/2214/07/22

Fingerprint

Dive into the research topics of 'Effective zero compression on ReRAM-based sparse DNN accelerators'. Together they form a unique fingerprint.

Cite this