ORCID Identifier(s)

0000-0002-0882-2434

Graduation Semester and Year

Fall 2024

Language

English

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science

Department

Computer Science and Engineering

First Advisor

Mohammad Atiqul Islam

Second Advisor

Leonidas Fegaras

Third Advisor

Jia Rao

Fourth Advisor

William Beksi

Abstract

As user-interactive applications in the cloud transition from monolithic services to agile microservice architectures, efficient resource management becomes a key challenge. The multitude of loosely coupled components and fluctuating traffic patterns make traditional cloud autoscaling methods ineffective. Existing machine learning-based approaches, while attempting to address this, often require extensive training data and can lead to intentional violations of service level objectives (SLOs). To tackle these challenges, I propose PEMA (Practical Efficient Microservice Autoscaling), a lightweight resource manager for microservices. PEMA aims to optimize resource allocation through opportunistic resource reduction, considering the intricate dependencies between microservices.

On another front, scientific workflows are evolving to accommodate the increasing diversity and parallelism of modern computing systems. The integration of multi-scale simulations with Artificial Intelligence and Machine Learning (AI/ML) methods has made interdisciplinary workflows increasingly complex and challenging to manage using traditional high-performance computing (HPC) infrastructure. Converged computing, a growing movement that integrates HPC and cloud technologies into a seamless environment, can provide a means to bridge the gap between the needs and capabilities of modern scientific workflows. Ensemble-based HPC workflows, particularly those built on the Message Passing Interface (MPI), stand to benefit from the efficiency improvements enabled by cloud-native orchestration. While these workflows have been demonstrated to scale in Kubernetes, limited work has explored the combined impact of autoscaling and elasticity on MPI-based workflows. To address this, we leveraged the Flux Operator, a Kubernetes operator for the Flux framework, and developed a workload-driven autoscaling strategy that outperforms traditional CPU utilization-based autoscaling for MPI-based ensembles. This approach enhances efficiency and reduces ensemble completion time by up to 4.7× compared to CPU utilization-based methods.

Additionally, significant power consumption remains a critical challenge for current and future HPC systems. Despite this, HPC systems often leave their provisioned power underutilized, making them ideal candidates for power oversubscription to reclaim unused capacity. To mitigate the risk of system overload during oversubscription, I propose MPR (Market-based Power Reduction), a scalable, market-driven approach that incentivizes HPC users to reduce power consumption during overloads in exchange for rewards. Real-world trace-based simulations show that MPR consistently benefits both users and HPC managers by balancing resource gain against performance loss. We also demonstrate the effectiveness of MPR on a prototype system, highlighting its potential as a sustainable power management solution.

Keywords

Resource Management, High Performance Computing, Microservices, Autoscaling, Cloud Computing, Converged Computing

Disciplines

Databases and Information Systems | Systems Architecture
