TY - JOUR
T1 - Ferroelectric Field-Effect Transistor-Based 3-D NAND Architecture for Energy-Efficient on-Chip Training Accelerator
AU - Shim, Wonbo
AU - Yu, Shimeng
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2021/6
Y1 - 2021/6
N2 - Unlike the deep neural network (DNN) inference process, the training process produces a huge amount of intermediate data to compute the new weights of the network. Generally, the on-chip global buffer (e.g., SRAM cache) has limited capacity because of its low memory density; therefore, off-chip DRAM access is inevitable during the training sequences. In this work, a novel ferroelectric field-effect transistor (FeFET)-based 3-D NAND architecture for an on-chip training accelerator is proposed. The reduced peripheral circuit overheads, owing to the low operation voltage of the FeFET device, and the ultrahigh density of the 3-D NAND architecture enable storing and computing all the intermediate data on chip during the training process. We present a custom design of a 108-Gb chip with a 59.91-mm2 area and 45% array efficiency. Data mapping schemes for weights/activations/errors that are compatible with the 3-D NAND architecture are investigated. The training performance was evaluated by training the ResNet-18 model on this architecture with the ImageNet data set at 8-bit precision. Owing to the minimized off-chip memory access, an energy efficiency of 7.76 TOPS/W was achieved for 8-bit on-chip training.
AB - Unlike the deep neural network (DNN) inference process, the training process produces a huge amount of intermediate data to compute the new weights of the network. Generally, the on-chip global buffer (e.g., SRAM cache) has limited capacity because of its low memory density; therefore, off-chip DRAM access is inevitable during the training sequences. In this work, a novel ferroelectric field-effect transistor (FeFET)-based 3-D NAND architecture for an on-chip training accelerator is proposed. The reduced peripheral circuit overheads, owing to the low operation voltage of the FeFET device, and the ultrahigh density of the 3-D NAND architecture enable storing and computing all the intermediate data on chip during the training process. We present a custom design of a 108-Gb chip with a 59.91-mm2 area and 45% array efficiency. Data mapping schemes for weights/activations/errors that are compatible with the 3-D NAND architecture are investigated. The training performance was evaluated by training the ResNet-18 model on this architecture with the ImageNet data set at 8-bit precision. Owing to the minimized off-chip memory access, an energy efficiency of 7.76 TOPS/W was achieved for 8-bit on-chip training.
KW - 3-D NAND
KW - compute-in-memory (CIM)
KW - deep neural network (DNN)
KW - ferroelectric transistor
KW - on-chip training accelerator
UR - https://www.scopus.com/pages/publications/85101467328
U2 - 10.1109/JXCDC.2021.3057856
DO - 10.1109/JXCDC.2021.3057856
M3 - Article
AN - SCOPUS:85101467328
SN - 2329-9231
VL - 7
SP - 1
EP - 9
JO - IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
JF - IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
IS - 1
M1 - 9350264
ER -