TY - JOUR
T1 - Technological Design of 3D NAND-Based Compute-in-Memory Architecture for GB-Scale Deep Neural Network
AU - Shim, Wonbo
AU - Yu, Shimeng
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2021/2
Y1 - 2021/2
N2 - In this work, a heterogeneous integration strategy for a 3D NAND-based compute-in-memory (CIM) architecture is proposed for large-scale deep neural networks (DNNs). While most of the CIM architectures reported to date have focused on image classification models with MB-level parameters, we aim at huge language translation models with GB-scale parameters. Our 3D NAND CIM architecture design exploits two fabrication techniques, a wafer bonding scheme and CMOS under array (CUA), to integrate CMOS circuits, 3D NAND cells, and high-voltage (HV) transistors at different tiers without thermal budget issues during the fabrication process. The bonding pads between the two wafers are designed to transfer the input and output vectors while ensuring a ~1 μm pitch, which is feasible with hybrid bonding. The chip size of the 512 Gb 128-layer 3D NAND CIM architecture is estimated to be 166 mm² with 7 nm FinFET logic transistors. Using the physical and electrical parameters of standard 3D NAND cells, an energy efficiency of 1.15-19.01 tera operations per second per watt (TOPS/W) is achieved.
AB - In this work, a heterogeneous integration strategy for a 3D NAND-based compute-in-memory (CIM) architecture is proposed for large-scale deep neural networks (DNNs). While most of the CIM architectures reported to date have focused on image classification models with MB-level parameters, we aim at huge language translation models with GB-scale parameters. Our 3D NAND CIM architecture design exploits two fabrication techniques, a wafer bonding scheme and CMOS under array (CUA), to integrate CMOS circuits, 3D NAND cells, and high-voltage (HV) transistors at different tiers without thermal budget issues during the fabrication process. The bonding pads between the two wafers are designed to transfer the input and output vectors while ensuring a ~1 μm pitch, which is feasible with hybrid bonding. The chip size of the 512 Gb 128-layer 3D NAND CIM architecture is estimated to be 166 mm² with 7 nm FinFET logic transistors. Using the physical and electrical parameters of standard 3D NAND cells, an energy efficiency of 1.15-19.01 tera operations per second per watt (TOPS/W) is achieved.
KW - 3D NAND flash
KW - Compute-in-memory
KW - deep neural network
KW - heterogeneous 3D integration
UR - http://www.scopus.com/inward/record.url?scp=85099105082&partnerID=8YFLogxK
U2 - 10.1109/LED.2020.3048101
DO - 10.1109/LED.2020.3048101
M3 - Article
AN - SCOPUS:85099105082
SN - 0741-3106
VL - 42
SP - 160
EP - 163
JO - IEEE Electron Device Letters
JF - IEEE Electron Device Letters
IS - 2
M1 - 9311209
ER -