Computer System Modeling and Analysis

Members: Joonsung Kim, Gyuhyeon Lee, Seungho Lee

Motivation

A major challenge in architectural research and development is to analyze a system’s performance and energy efficiency, identify its bottlenecks, and find the architecture design that resolves those bottlenecks as quickly and cost-effectively as possible. Computer architects therefore need fast and accurate system modeling and analysis tools. In this project, we develop and provide various state-of-the-art computer system modeling and analysis tools.

Research

CPU [MICRO’14, MICRO’18, TACO’18]. A CPU directly affects the system’s overall performance and energy efficiency. However, it is extremely difficult to accurately analyze its performance or to accelerate its timing simulation, because a CPU has an increasing number of cores and each core is becoming more complex. To overcome these limitations, we develop and provide CPU modeling and simulation methods that combine several techniques (e.g., analytic modeling, directed performance tests, machine learning, simulator parallelization). For example, DiagSim, our processor performance analysis tool, uses microbenchmarks to extract the detailed timing behaviors of a target commodity processor. RpStacks, our hybrid method that combines analytic modeling with simulator parallelization, greatly accelerates processor design-space exploration.

Peripheral devices [MICRO’18]. Modern high-performance servers are now equipped with emerging peripheral devices (e.g., GPU, SSD, NIC, accelerators), and the impact of these devices on the system’s overall performance and efficiency has increased significantly. Computer architecture modeling tools must therefore incorporate those devices, but doing so is extremely difficult because the detailed mechanisms of the peripheral devices are undisclosed and vary among vendors. To resolve this issue, we develop a novel performance modeling tool that can accurately extract the devices’ internal microarchitectures and timing behaviors, and we demonstrate its real-world use cases. For example, SSDcheck, our SSD analysis tool, can predict the latency of future accesses to an SSD and improve the efficiency of the I/O scheduler at no cost. We are currently extending our modeling tool’s coverage to support a wider spectrum of devices (e.g., GPU, NIC, accelerators).

Server/Datacenter [ISPASS’17, ASPLOS’18]. A modern datacenter consists of an extremely large number of servers and runs various combinations of live-production workloads. The datacenter’s large scale and ever-changing workload colocation make its performance modeling extremely difficult. For example, a datacenter manager cannot accurately estimate the performance impact of upgrading certain hardware components without actually conducting the datacenter-wide hardware upgrade. To resolve this issue, we provide WSMeter, our cost-effective datacenter performance modeling tool, which extracts a small set of servers and workloads that accurately represent the whole datacenter. WSMeter has been validated and applied at Google. In addition, we provide StressRight, our datacenter workload development and tuning method, which quickly evaluates datacenter performance using a properly configured datacenter workload.
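
The idea of evaluating a small representative subset instead of the whole datacenter can be illustrated with a minimal sketch. It is not WSMeter's actual method: it assumes plain simple random sampling of servers and a normal approximation for the confidence interval. The function name `estimate_fleet_mean`, the per-server scores, and the fleet itself are hypothetical and synthetic.

```python
import math
import random
import statistics


def estimate_fleet_mean(scores, sample_size, seed=0, z=1.96):
    """Estimate the fleet-wide mean score from a simple random sample.

    Returns (estimate, half_width), where half_width is an approximate
    95% confidence half-width under a normal approximation.
    This is an illustrative assumption, not WSMeter's published method.
    """
    rng = random.Random(seed)
    sample = rng.sample(scores, sample_size)  # sample without replacement
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / math.sqrt(sample_size)
    return mean, z * stderr


# Synthetic fleet: one hypothetical performance score per server.
fleet_rng = random.Random(42)
fleet = [fleet_rng.gauss(100.0, 15.0) for _ in range(10_000)]

estimate, half_width = estimate_fleet_mean(fleet, sample_size=200)
true_mean = statistics.fmean(fleet)

print(f"sampled estimate: {estimate:.2f} +/- {half_width:.2f}")
print(f"fleet-wide mean:  {true_mean:.2f}")
```

In this sketch, measuring 200 of 10,000 servers gives an estimate with a quantified error bound. A production methodology would also need to account for workload colocation and heterogeneous server types, which plain random sampling ignores.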

Software release

Publications

  • RpStacks-MT: A High-throughput Multi-core Processor Design Evaluation Methodology
    Hanhwi Jang, Jae-Eon Jo, Jaewon Lee, and Jangwoo Kim
    ACM/IEEE International Symposium on Microarchitecture (MICRO), Oct. 2018
  • SSDcheck: Timely and Accurate Prediction of Irregular Behaviors in Black-Box SSDs
    Joonsung Kim, Pyeongsu Park, Jaehyung Ahn, Jihun Kim, Jong Kim, and Jangwoo Kim
    ACM/IEEE International Symposium on Microarchitecture (MICRO), Oct. 2018
  • DiagSim: Systematically Diagnosing Simulators for Healthy Simulations
    Jae-Eon Jo, Gyu-Hyeon Lee, Hanhwi Jang, Jaewon Lee, Mohammadamin Ajdari, and Jangwoo Kim
    ACM Transactions on Architecture and Code Optimization (TACO), vol. 15, Apr. 2018
  • WSMeter: A Performance Evaluation Methodology for Google’s Production Warehouse-Scale Computers
    Jaewon Lee, Changkyu Kim, Kun Lin, Liqun Cheng, Rama Govindaraju, and Jangwoo Kim
    ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Mar. 2018
  • StressRight: Finding the Right Stress for Accurate In-development System Evaluation
    Jaewon Lee, Hanhwi Jang, Jae-eon Jo, Gyu-hyeon Lee, and Jangwoo Kim
    IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Apr. 2017
  • RpStacks: Fast and Accurate Processor Design Space Exploration Using Representative Stall-Event Stacks
    Jaewon Lee*, Hanhwi Jang*, and Jangwoo Kim
    ACM/IEEE International Symposium on Microarchitecture (MICRO), Dec. 2014

* Contributed equally