TY - JOUR
T1 - System-technology codesign of 3-D NAND flash-based compute-in-memory inference engine
AU - Shim, Wonbo
AU - Yu, Shimeng
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/6
Y1 - 2021/6
AB - Due to its ultrahigh density and commercially mature fabrication technology, 3-D NAND flash memory has been proposed as an attractive candidate for an inference engine for deep neural network (DNN) workloads. However, the peripheral circuits of conventional 3-D NAND flash need to be modified to enable compute-in-memory (CIM), and the chip architecture needs to be redesigned for an optimized dataflow. In this work, we present a design of a 3-D NAND-CIM accelerator based on the macro parameters of an industry-grade prototype chip. The DNN inference performance is evaluated using the DNN+NeuroSim framework. To exploit the ultrahigh density of 3-D NAND flash, both input and weight mapping strategies are introduced to improve the throughput. Benchmarking on the VGG network was performed across technological candidates for CIM, including SRAM, resistive random access memory (RRAM), and 3-D NAND. Compared to similar designs with SRAM or RRAM, the results show that the 3-D NAND-based CIM design achieves not only 17%-24% of the chip size but also 1.9-2.7 times better energy efficiency for 8-bit precision inference. The inference accuracy drop induced by 3-D NAND string current drift and variation is also investigated. With the proposed input mapping scheme, no accuracy degradation from current variation was observed, while the accuracy is sensitive to current drift, implying that compensation schemes are needed to maintain inference accuracy.
KW - 3-D NAND
KW - compute-in-memory (CIM)
KW - deep neural network (DNN)
KW - hardware accelerator
UR - http://www.scopus.com/inward/record.url?scp=85112171200&partnerID=8YFLogxK
DO - 10.1109/JXCDC.2021.3093772
M3 - Article
AN - SCOPUS:85112171200
SN - 2329-9231
VL - 7
SP - 61
EP - 69
JO - IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
JF - IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
IS - 1
ER -