Graduation Semester and Year

2019

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Jia Rao

Abstract

As a critical component of resource management in multicore systems, fair schedulers in hypervisors and operating systems (OSes) must follow a simple invariant: guarantee that the computing resources such as CPU cycles are fairly allocated to each vCPU or thread. As simple as it may seem, we found this invariant is broken when parallel programs with blocking synchronization are colocated with CPU intensive programs in hypervisors such as Xen, KVM and OSes such as Linux CFS. On the other hand, schedulers in virtualized environment usually reside in two different layers: one is in the hypervisor which aims to schedule vCPU onto each pCPU and another is in the virtual machine to schedule the processes. Such design principle will impose an implicit scheduling gap between these two layers such that threads holding the lock or waiting for the lock in the virtual machine can be inadvertently descheduled by hypervisors. This behavior will cause the well known LHP and LWP problems which can seriously degrade the performance of parallel applications. While the cloud is believed to be an ideal platform for hosting parallel applications, its nature of multi-user sharing and resource over-commitment makes parallel performance often quite disappointing and unpredictable. Although many research works have identified the excessive synchronization delays such as LWP and LHP due to multi-tenant interferences as the culprit, there lacks a full understanding of the quantitative relationship between changes in synchronization and the overall performance loss. As performance modeling plays a fundamental role in designing traditional parallel systems, a systematic and quantitative study of parallel performance under cloud interferences would help improve the resource and power management in datacenters. This dissertation explores two fundamental questions towards the solutions for the scheduling unfairness and inefficiency in multicore systems: why does the schedulers ``unexpectedly'' show to be unfairness under the common belief that scheduling algorithms have been stable for many years and are already perfect? Why does the schedulers exhibit effectively regarding to scheduling the parallel applications in physical environment but perform badly in the cloud? The goal of this dissertation is to enable multicore systems to proactively anticipate and defend against the scheduling unfairness and inefficiency, rather than reacting to their manifestations and consequences. This dissertation presents three key principles of systems design and implementation for rethinking and redesigning the scheduling algorithms in multicore systems against the unfairness and inefficiency---preemptive multiple queue fair queuing, interference-resilient scheduling, and differential scheduling. This dissertation demonstrates that applying these principles can effectively defend scheduling unfairness and inefficiency in multicore systems. Furthermore, this dissertation also presents the corresponding techniques and tools support that can automatically and systematically apply these principles into existing multicore systems. Scheduling algorithms which originated from scheduling the packets in single-linked network were widely used in computer systems, however scheduling unfairness are unexpectedly manifested through scaling these algorithms from single core to multicore systems. Scheduling inefficiency are usually caused by the implicit semantic gap existing in the virtualized environment. Thus, this dissertation has modified the design of single-linked scheduling algorithm to make them to be fairness in face of the multiple-linked network and furthermore applied them into the multicore system scheduling to eliminate the unfairness. Instead of leaving the scheduling activities in two virtualized layers transparently for each other, this dissertation first characterized the performance of parallel applications under interference and then proposed methods to bridging the semantic gap in order to remove the scheduling inefficiency.

Keywords

Operating system, Computer architecture, Virtualization, Parallel computing

Disciplines

Computer Sciences | Physical Sciences and Mathematics

License

This work is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 4.0 International License.

Comments

Degree granted by The University of Texas at Arlington

Recommended Citation

Zhao, Yong, "Understanding and Optimizing Parallel Performance in Multi-tenant Cloud" (2019). Computer Science and Engineering Dissertations. 243.
https://mavmatrix.uta.edu/cse_dissertations/243

Download

Included in

Computer Sciences Commons

COinS

Computer Science and Engineering Dissertations

Understanding and Optimizing Parallel Performance in Multi-tenant Cloud