Optimizing Deep Neural Network Precision for Processing-in-Memory: A Memory Bottleneck Perspective

Inseong Hwang, Jihoon Jang, Hyun Kim

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper presents a detailed measurement of memory bottlenecks in processing-in-memory (PIM) systems running deep neural networks (DNNs) at two precisions (INT8/FP32), using memory bottleneck metrics. The impact of INT8 quantization, which improves data movement efficiency in DNNs, was examined to determine which precision is better suited to a PIM system. The results demonstrate that INT8 alleviates the overall memory bottleneck; however, the LLC MPKI of the computationally intensive Softmax layer increases from 3.459 to 16.725, and the LFMR of the FC layer decreases only from 99.795% to 99.483%, so little improvement can be expected for these layers. For this reason, offloading the Softmax and FC layers to PIM is anticipated to significantly enhance performance when targeting INT8 DNN models.
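As background for the metrics cited above: MPKI (misses per kilo-instruction) is conventionally computed as cache-miss count divided by instructions retired, scaled by 1,000. A minimal sketch of that computation (the function name and the sample counter values are illustrative, not measurements from the paper):

```python
def mpki(cache_misses: int, instructions_retired: int) -> float:
    """Misses per kilo-instruction: a standard memory-bottleneck metric.

    MPKI = misses / (instructions / 1000)
    """
    return cache_misses / (instructions_retired / 1_000)

# Illustrative counters chosen to reproduce an MPKI of 3.459
# (hypothetical values, not data from the paper):
print(mpki(3_459_000, 1_000_000_000))  # -> 3.459
```

A higher LLC MPKI indicates more last-level-cache misses per unit of work, i.e. a heavier load on off-chip memory, which is the pressure PIM offloading is meant to relieve.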

Original language: English
Title of host publication: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331510756
DOIs
State: Published - 2025
Event: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025 - Osaka, Japan
Duration: 19 Jan 2025 - 22 Jan 2025

Publication series

Name: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025

Conference

Conference: 2025 International Conference on Electronics, Information, and Communication, ICEIC 2025
Country/Territory: Japan
City: Osaka
Period: 19/01/25 - 22/01/25

Keywords

  • Deep Neural Network
  • Memory Bottleneck Analysis
  • Multi-Precision
  • Processing-In-Memory
