Architectural Design of 3D NAND Flash based Compute-in-Memory for Inference Engine

Wonbo Shim, Hongwu Jiang, Xiaochen Peng, Shimeng Yu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Scopus citations

Abstract

3D NAND Flash memory has been proposed as an attractive candidate for deep neural network (DNN) inference engines owing to its ultra-high density and commercially mature fabrication technology. However, the peripheral circuits must be modified to enable compute-in-memory (CIM), and the chip architecture needs to be redesigned for an optimized dataflow. In this work, we present a design of a 3D NAND-CIM accelerator based on the macro parameters from an industry-grade prototype chip. The DNN inference performance is evaluated using the DNN+NeuroSim framework. To exploit the ultra-high density of 3D NAND Flash, both input and weight duplication strategies are introduced to improve the throughput. Benchmarking on a variety of VGG and ResNet networks was performed across technological candidates for CIM, including SRAM, RRAM, and 3D NAND. Compared to similar designs with SRAM or RRAM, the results show that the 3D NAND based CIM design requires only 17-24% of the chip size while also achieving 1.9-2.7 times better energy efficiency for 8-bit precision inference.

Original language: English
Title of host publication: MEMSYS 2020 - Proceedings of the International Symposium on Memory Systems
Publisher: Association for Computing Machinery
Pages: 77-85
Number of pages: 9
ISBN (Electronic): 9781450388993
State: Published - 28 Sep 2020
Event: 2020 International Symposium on Memory Systems, MEMSYS 2020 - Washington, United States
Duration: 28 Sep 2020 → 1 Oct 2020

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2020 International Symposium on Memory Systems, MEMSYS 2020
Country/Territory: United States
City: Washington
Period: 28/09/20 → 1/10/20

Keywords

  • 3D NAND Flash
  • Deep neural network
  • compute-in-memory
  • hardware accelerator
