Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Computer Science


Computer Science and Engineering

First Advisor

Hao Che


Scale-out applications have emerged to be the predominant datacenter workloads. The request processing workflow for such a workload may consist of one or more stages with massive numbers of compute nodes for parallel data-intensive processing. As a classic model for the most essential building block of a workflow, the Fork-Join queuing network model is found to be notoriously hard to solve due to the involvement of task partitioning and merging with barrier synchronization. The work in this dissertation aims to develop approximation methods for the prediction of tail and mean latency for Fork-Join queuing networks in a high load region, where resource provisioning is desirable. Specifically, this dissertation makes the following contributions. First, we propose a simple prediction model for tail latency for a large class of Fork-Join queuing networks of practical importance in a high load region using only the mean and variance of response times for tasks in the Fork phase as input. We also generalize the model for the cases where each request processing includes only a partial number of processing units in the network. The prediction errors for the 99th percentile request latency are found to be consistently within 10% at the load of 90% for both model and measurement-based testing cases. This work thus establishes a link between the system-level request tail latency constraint, a.k.a. the tail-latency Service Level Objective (SLO), and the subsystem-level task performance requirements, facilitating explicitly tail-constrained resource provisioning at scale. Second, we propose an empirical approach for mean latency approximation for such systems based on the developed tail latency prediction model. The experimental results show that the approach gives accurate predictions for various testing cases when the system is at high loads. Finally, we put forward a framework for scaling analysis of scale-out applications. With the proposed solution for mean latency approximation, we are able to derive a closed-form solution for this framework that can be used for the exploration of scaling properties. A case study is given for the case where all the task service time distributions are exponential.


Scale-out applications, Fork-Join queuing networks, Tail latency, Resource provisioning, Performance analysis


Computer Sciences | Physical Sciences and Mathematics


Degree granted by The University of Texas at Arlington