Cornell University

Software and Hardware Co-design for Scalable and Energy-efficient Neural Network Training with Processing-in-Memory

Abstract: Neural networks (NNs) have been adopted in a wide range of application domains, such as image classification, speech recognition, object detection, and computer vision. However, training NNs, especially deep neural networks (DNNs), can be energy- and time-consuming because of frequent data movement between the processor and memory. Furthermore, training involves massive numbers of fine-grained operations with varied computation and memory-access characteristics, and exploiting high parallelism across such diverse operations is challenging. In this talk, I will describe our effort on a software/hardware co-design of a heterogeneous processing-in-memory (PIM) system. Our hardware design incorporates hundreds of fixed-function arithmetic units and a programmable core on the logic layer of a 3D die-stacked memory, forming a heterogeneous PIM architecture attached to the CPU. Our software design offers a programming model and a runtime system that program, offload, and schedule various NN training operations across the compute resources provided by the CPU and the heterogeneous PIM. By extending the OpenCL programming model and employing a hardware-heterogeneity-aware runtime system, we achieve high program portability and easy program maintenance across various heterogeneous hardware, optimize system energy efficiency, and improve hardware utilization. Furthermore, DNN training can require terabytes of memory capacity. To tackle this memory capacity challenge, we propose a scalable and elastic memory fabric architecture, which consists of (i) a random topology and a greedy routing protocol that can efficiently interconnect up to a thousand 3D memory stacks and (ii) a reconfiguration scheme that enables the memory fabric to scale elastically. In addition to scalability and flexibility, our memory fabric design improves system throughput and reduces energy consumption compared with traditional memory fabric designs.
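The heterogeneity-aware scheduling idea described above can be sketched as follows. This is a minimal illustration, not the authors' runtime: the operation names, the arithmetic-intensity threshold, and the set of fixed-function units are all hypothetical. The sketch only captures the general principle that memory-bound operations benefit from executing near memory, while compute-bound operations stay on the host CPU.

```python
# Toy heterogeneity-aware scheduler (illustrative only, not the talk's system).
# Each NN training operation is greedily placed on the host CPU, a PIM
# fixed-function unit, or the PIM programmable core, using a simple
# arithmetic-intensity heuristic (FLOPs per byte moved).

from dataclasses import dataclass

@dataclass
class Op:
    name: str
    flops: float        # compute work of the operation
    bytes_moved: float  # data read + written from memory

def schedule(op: Op, pim_supported: set) -> str:
    """Greedy placement: memory-bound ops go to PIM (computing in memory
    avoids data movement); compute-bound ops stay on the CPU."""
    intensity = op.flops / op.bytes_moved
    if intensity < 1.0:                   # memory-bound
        if op.name in pim_supported:      # simple op with a fixed-function unit
            return "pim-fixed-function"
        return "pim-programmable-core"    # irregular op, still placed near memory
    return "cpu"                          # compute-bound: keep on the host

# Hypothetical operation mix for one training step.
fixed_units = {"elementwise-add", "relu", "batchnorm"}
ops = [
    Op("matmul", flops=2e9, bytes_moved=1e7),           # compute-bound
    Op("elementwise-add", flops=1e6, bytes_moved=8e6),  # memory-bound, simple
    Op("embedding-gather", flops=1e5, bytes_moved=4e6), # memory-bound, irregular
]
for op in ops:
    print(op.name, "->", schedule(op, fixed_units))
```

A real runtime would also weigh data-placement and offload costs; the point of the sketch is only the greedy, per-operation placement decision driven by each operation's computation and memory-access characteristics.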
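The memory-fabric idea of greedy routing over a random topology can likewise be sketched in a few lines. This is not the paper's protocol: the graph construction (k-nearest-neighbor over random coordinates), the node count, and all parameters are assumptions chosen only to make the greedy-forwarding rule concrete.

```python
# Illustrative sketch of greedy routing on a random topology of memory
# stacks (hypothetical construction, not the talk's fabric design).
# Each stack gets a random 2D coordinate; a message is forwarded hop by
# hop to the neighbor closest to the destination.

import math
import random

random.seed(0)

N = 64  # number of memory stacks in this toy example (the talk targets ~1000)
coords = {i: (random.random(), random.random()) for i in range(N)}

def dist(a, b):
    (x1, y1), (x2, y2) = coords[a], coords[b]
    return math.hypot(x1 - x2, y1 - y2)

# Random topology: connect each stack to its k nearest stacks.
k = 6
neighbors = {
    i: sorted((j for j in range(N) if j != i), key=lambda j: dist(i, j))[:k]
    for i in range(N)
}

def greedy_route(src, dst, max_hops=50):
    """Forward to the neighbor strictly closer to dst.

    Returns the hop path, or None when stuck at a local minimum
    (no neighbor is closer to the destination)."""
    path, cur = [src], src
    for _ in range(max_hops):
        if cur == dst:
            return path
        nxt = min(neighbors[cur], key=lambda j: dist(j, dst))
        if dist(nxt, dst) >= dist(cur, dst):
            return None  # local minimum: greedy forwarding fails here
        path.append(nxt)
        cur = nxt
    return None
```

Greedy forwarding keeps per-stack routing state small (each stack only knows its own neighbors), which is what makes the approach attractive at the scale of hundreds of memory stacks; a practical protocol additionally needs a fallback for the local-minimum case that this sketch simply reports as a failure.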

Bio: Jishen Zhao is an Assistant Professor in the Computer Science and Engineering Department at the University of California, San Diego. Her research spans and stretches the boundary between computer architecture and system software, with a particular emphasis on memory and storage systems, domain-specific acceleration, and system reliability. Her research is driven by both emerging device/circuit technologies (e.g., 3D integration and nonvolatile memories) and modern applications (e.g., big-data analytics, machine learning, and smart home and transportation). Before joining UCSD, she was an Assistant Professor at UC Santa Cruz, and before that a research scientist at HP Labs.

 
