DESTINY: A Comprehensive Tool with 3D and Multi-level Cell Memory Modeling Capability
Sparsh Mittal, Rujia Wang, Jeffrey Vetter

To cite this version: Sparsh Mittal, Rujia Wang, Jeffrey Vetter. DESTINY: A Comprehensive Tool with 3D and Multi-level Cell Memory Modeling Capability. Journal of Low Power Electronics and Applications, MDPI, 2017, 7 (3), pp.23. 10.3390/jlpea7030023. hal-01609132

HAL Id: 1609132
Submitted on 3 Oct 2017

DESTINY: A Comprehensive Tool with 3D and Multi-level Cell Memory Modeling Capability
Sparsh Mittal, Rujia Wang and Jeffrey Vetter

Abstract
To enable the design of large capacity memory structures, novel memory technologies such as non-volatile memory (NVM) and novel fabrication approaches, e.g., 3D stacking and multi-level cell (MLC) design, have been explored. The existing modeling tools, however, cover only a few memory technologies, technology nodes and fabrication approaches. We present DESTINY, a tool for modeling 2D/3D memories designed using SRAM, resistive RAM (ReRAM), spin transfer torque RAM (STT-RAM), phase change RAM (PCM) and embedded DRAM (eDRAM) and 2D memories designed using spin orbit torque RAM (SOT-RAM), domain wall memory (DWM) and Flash memory. In addition to single-level cell (SLC) designs for all these memories, DESTINY also supports modeling MLC designs for NVMs. We have extensively validated DESTINY against commercial and research prototypes of these memories. DESTINY is useful for performing design-space exploration across several dimensions, such as optimizing for a target (e.g., latency, area or energy-delay product) for a given memory technology, or choosing the suitable memory technology or fabrication method (i.e., 2D vs. 3D) for a given optimization target. We believe that DESTINY will boost studies of next-generation memory architectures used in systems ranging from mobile devices to extreme-scale supercomputers.

Index Terms
Cache, SRAM, eDRAM, non-volatile memory (NVM or NVRAM), STT-RAM, ReRAM, PCM, SOT-RAM, DRAM, DWM, Flash, open-source, modeling tool, emerging memory technologies

1 INTRODUCTION
As processor core-count rises and key applications become more data-intensive, the memory requirements of modern computing systems are growing tremendously. To cater to these needs, modern processors are using large-sized memory structures, e.g., last level cache, main memory and storage.
For example, Intel's 22 nm Haswell processor employs 128 MB eDRAM LLC (last level cache) [1], and the 22-nm Xeon E5-2600 processor has 45 MB SRAM LLC [2]. To address these challenges, researchers are exploring novel memory technologies, fabrication schemes and higher-density cell designs. For example, eDRAM, STT-RAM, ReRAM, PCM, DWM, SOT-RAM and (NAND) Flash memory have received significant attention in recent years [3] (unless otherwise mentioned, in this paper, Flash refers to NAND Flash). Similarly, 3D integration technology has been explored for achieving higher bandwidth, higher flexibility in routing signals, power and clock, and the ability to integrate diverse memory technologies for designing hybrid memory designs [4, 5]. Finally, since NVMs have a large resistance margin between set and reset states, MLC designs have been studied, which store multiple (e.g., two) bits in each cell to achieve higher density than the SLC designs.

Architectural exploration and system integration of these technologies and design approaches crucially depend on the availability of comprehensive, open-source and validated modeling tools. Existing modeling tools, however, fail to meet these requirements. Tools such as CACTI(3DD) [6, 7] and NVSim [8] only model a few memory technologies. Researchers typically use CACTI for modeling SRAM/DRAM and NVSim for modeling NVMs (e.g., [9]); however, these tools use different modeling frameworks, assumptions and input/output formats. For example, NVSim provides the output in the form of hit/miss/write latency/energy, while CACTI provides the output in the form of access time and random cycle time. Similarly, NVSim provides the ability to find a configuration optimized for a certain target (e.g., area, leakage, etc.), while CACTI does not do so. Each of these input parameters can have a marked influence on the output obtained from the tool [8, 10, 11]. Further, due to differences in modeling frameworks, the outputs of these tools can be different even for the same input configuration. An example of this is shown in Table 1. It is clear that both the output values and the output format of each tool are different. This makes it difficult to have a one-to-one correspondence between their inputs/outputs. Thus, the lack of comprehensive tools may force researchers to compare estimates from different tools or restrict their choices to only a few memory technologies. These, however, may lead to suboptimal or even incorrect conclusions.

S. Mittal is with IIT Hyderabad, India. Email: [email protected] Address: E-621, IIT Hyderabad, Kandi, Telangana, India 502285. R. Wang is with University of Pittsburgh, USA. J. Vetter is with Oak Ridge National Lab, USA. Support for this work was provided by Science and Engineering Research Board (SERB), India, award number ECR/2017/000622.

TABLE 1
CACTI and NVSim results for the same cache configuration: 32-nm, 64 B block, 16-way 4 MB SRAM cache.

CACTI
  Area: 14.90 mm²
  Leakage: 0.574 W
  Access and random cycle time: 0.634 and 3.119 ns
  Read dynamic energy: 0.182 nJ
NVSim
  Area: 6.75 mm²
  Leakage: 0.395 W
  Hit/miss/write latency: 2.009, 0.314 and 1.079 ns
  Hit/miss/write dynamic energy: 0.388, 0.032 and 0.363 nJ

Some researchers use in-house modeling tools (e.g., [12]); however, experiments conducted with such tools may not be reproducible, and their accuracy may not be verified. Furthermore, in the absence of a 3D modeling tool, some studies (e.g., [12]) derive parameters for 3D memories using a linear extrapolation of 2D parameters, which may be inaccurate. Finally, some tools such as 3DCacti have not been updated for recent feature sizes (e.g., sub-45 nm). Clearly, the current state-of-the-art calls for an open-source, comprehensive, validated and up-to-date tool for allowing full design-space exploration of memory technologies and design trends.

Contributions: In this paper, we present DESTINY, a 3D design-space exploration tool for SRAM, eDRAM and non-volatile memory. DESTINY utilizes the 2D SLC circuit-level modeling framework of the NVSim tool for SRAM, STT-RAM, PCM, ReRAM and Flash. It also utilizes the coarse- and fine-grained TSV (through silicon via) models from the CACTI-3DD tool.
Further, DESTINY adds models of eDRAM (Section 4.1), SOT-RAM and DWM, the capability to model MLC designs (Section 4.4) and two additional types of 3D designs (Section 4.5). Overall, DESTINY enables modeling of both 2D and 3D designs of SRAM, eDRAM, STT-RAM, PCM and ReRAM and 2D designs of SOT-RAM, DWM and Flash memory. In addition to SLC models for all memories, DESTINY can also model MLC designs of NVMs. Thus, DESTINY can model both volatile and non-volatile memories and both conventional and emerging memories. Furthermore, it models technology nodes ranging from 22 nm to 180 nm. Table 2 summarizes the capabilities of DESTINY and compares them with those of CACTI and NVSim.

TABLE 2
An overview of the modeling capabilities of some open-source tools.
[Table body not recoverable.]

We have compared the results obtained from DESTINY against several commercial and research prototypes [4, 5, 11, 13–24] to validate the newly-added memories, MLC and 3D models in DESTINY (Section 5). The modeling error has been observed to be less than 10% for most cases and less than 25% for all cases. This can be accepted as reasonable for an academic modeling tool and is also in line with the errors produced by previous tools [8].

DESTINY facilitates exploring a large design space, which provides important insights and is also useful for early stage estimation of emerging memory technologies (Section 6). Further, by virtue of being an open-source tool, it facilitates reproducible research and easy extension of the tool for many more usage scenarios than those discussed in the paper. For example, apart from modeling standard caches, DESTINY can also model assist structures (e.g., victim cache, write-buffer, tag-only caches) and the translation lookaside buffer (TLB) [25] designed with different memory technologies.
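The validation criterion stated above (error below 10% for most cases and below 25% for all cases) can be made concrete with a minimal sketch. The function names, the interpretation of "most" as a fraction of cases, and the sample value pairs below are all assumptions introduced for illustration; the pairs are hypothetical placeholders and are not results from DESTINY or from any prototype.

```python
def percent_error(modeled: float, measured: float) -> float:
    """Absolute relative error of a modeled value against a reference, in percent."""
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return abs(modeled - measured) / abs(measured) * 100.0


def summarize(errors, typical=10.0, worst=25.0):
    """Return (fraction of cases below `typical`, whether all cases are below `worst`)."""
    errors = list(errors)
    below = sum(1 for e in errors if e < typical)
    return below / len(errors), all(e < worst for e in errors)


# Hypothetical (modeled, measured) pairs, NOT taken from the paper.
pairs = [(1.05, 1.00), (4.6, 5.0), (0.80, 1.00)]
errors = [percent_error(modeled, measured) for modeled, measured in pairs]
fraction_ok, all_ok = summarize(errors)
print(f"{fraction_ok:.0%} of cases under 10%; all under 25%: {all_ok}")
```

In practice, each pair would hold one modeled metric (e.g., area or latency) and the corresponding value reported for a prototype.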
We believe that DESTINY will be a useful tool in architecture and system-level studies and will assist researchers, designers and technical professionals.

This paper extends the previous version [26] in several significant ways:
1) We have presented the motivation behind the development of DESTINY by discussing the design trends in modern processors and the limitations of existing modeling tools (Section 2).
2) We have discussed the device-level data storage mechanism of each memory technology (Section 3).
3) We have now added support for modeling new memories (DWM, SOT-RAM) and MLC designs for all NVMs (including Flash). We have discussed their modeling framework and validation (Sections 4.2 to 4.4 and 5.1 to 5.7).
4) We have now shown the use of DESTINY in performing design-space exploration, for example finding the optimal memory technology for a given optimization target (Section 6.1), finding the optimal number of 3D layers for a given optimization target (Section 6.2), modeling assist structures (Section 6.3), etc.
5) We have discussed the usefulness of DESTINY in gaining insights for designing management policies for memory structures such as cache, the register file, etc., using different memory technologies (Section 6.4).

2 MOTIVATION AND RELATED WORK

2.1 Motivation behind the Design of DESTINY
To meet the challenges of rising core-count and data-intensive applications, modern CPUs and GPUs feature increasingly larger storage structures. For example, IBM's 45-nm Power7 processor had a 32 MB LLC [27]; the 32-nm Power7+ processor had an 80 MB LLC [28]; and the 22-nm Power8 processor had a 96 MB LLC [29]. LLC size in GPUs is also on the rise [30]. Similarly, the total size of the GPU register file has increased from 512 KB on G80 (Tesla) and 2048 KB on GF100 (Fermi) to 7680 KB on GK210 (Kepler) and 14,336 KB on GP100 (Pascal) [31]. It is clear that high-density memory technologies (e.g., NVMs), cell designs (e.g., MLC) and fabrication approaches (e.g., 3D) will be essential to meet the rising memory demands in future computing systems.

Further, given that different memory structures (e.g., register file, shared memory, first and last level caches, main memory) in different processors (CPUs or GPUs) need to be optimized for distinct points in the latency/energy/area spectrum, a comprehensive modeling tool is required that allows complete design-space exploration over memory technologies, design approaches and optimization targets. DESTINY
is intended to fulfill this need and also to boost architectural studies of next-generation memory systems.

2.2 A Comparison of Modeling Tools
Researchers have proposed several tools for modeling and estimating the energy consumption and performance of processors or their specific components. A few existing tools provide modeling capability individually for different memory technologies, such as SRAM, DRAM, eDRAM and NVMs. CACTI [7] simulates SRAM caches and has been extended to support eDRAM and DRAM. Furthermore, several improvements have been made to CACTI to improve its modeling capability and accuracy. Mamidipaka et al. [32] proposed eCACTI, which adds a leakage model to CACTI, and Li et al. [33] proposed CACTI-P, which models low-power caches (e.g., caches with sleep transistors). Chen et al. [6] presented CACTI-3DD, which adds a TSV model for DRAM memory; however, this tool is designed for DRAM and hence does not allow accurate modeling of 3D SRAM caches. 3DCacti provides the ability to model 3D SRAM; however, this tool has not been updated to support technology nodes below 45 nm. None of these tools model emerging NVMs. NVSim provides 2D modeling of SRAM, ReRAM, STT-RAM, PCM and SLC NAND Flash. NVSim has not been validated for SOT-RAM, and it does not provide a configuration file for modeling it.

None of these tools provide the comprehensive modeling and design-space exploration capability provided by DESTINY. Existing tools also cannot model DWM or MLC designs. As an increasing number of industrial designs utilize 3D stacking [4, 5], research on 3D stacking has become very important. However, existing 3D modeling tools such as CACTI-3DD [6] and 3DCacti do not model NVMs. Another challenge in using multiple tools is that, over time, these tools undergo revisions (due to more accurate modeling, bug fixes, etc.), which makes the task of cross-comparison even more difficult.
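As a small illustration of why cross-tool comparison is fragile, the sketch below takes the two metrics in Table 1 that both tools report in the same form (area and leakage) and computes the ratio between the CACTI and NVSim estimates for the same cache configuration. The dictionary layout and the `ratio` helper are assumptions made for this example; only the numeric values come from Table 1.

```python
# Area and leakage values copied from Table 1
# (32-nm, 64 B block, 16-way 4 MB SRAM cache).
table1 = {
    "CACTI": {"area_mm2": 14.90, "leakage_w": 0.574},
    "NVSim": {"area_mm2": 6.75, "leakage_w": 0.395},
}


def ratio(metric: str, a: str = "CACTI", b: str = "NVSim") -> float:
    """Ratio of tool `a`'s estimate to tool `b`'s estimate for one metric."""
    return table1[a][metric] / table1[b][metric]


area_ratio = ratio("area_mm2")
leakage_ratio = ratio("leakage_w")
print(f"area: CACTI/NVSim = {area_ratio:.2f}")
print(f"leakage: CACTI/NVSim = {leakage_ratio:.2f}")
```

The two tools differ by roughly 2.2x in area and 1.45x in leakage for an identical configuration. The latency and energy rows are left out because the tools report them in different forms (access and cycle time versus hit/miss/write values), so no one-to-one ratio is defined for them.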
Clearly, DESTINY offers distinct advantages over existing tools, and by virtue of its comprehensive modeling capability, it can be a very useful design-space exploration and decision-support tool.

3 A BACKGROUND ON MEMORY TECHNOLOGIES AND MLC DESIGN

3.1 Data Storage Mechanism of Memory Technologies
We now briefly discuss the data storage mechanism of different memory technologies. For more details and discussion of related issues, we refer the reader to previous work [8, 13, 34–36].

eDRAM: In eDRAM, the data are stored as charge in a capacitor, which is either a deep-trench capacitor or stacked capacitor between metal wi