Graduation Semester and Year
Summer 2025
Language
English
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Computer Engineering
Department
Computer Science and Engineering
First Advisor
Mohammad A. Islam
Second Advisor
Jia Rao
Third Advisor
William Beksi
Fourth Advisor
Jiang Ming
Abstract
The increasing reliance on distributed, privacy-sensitive data has driven the emergence of Federated Learning (FL) as a transformative paradigm for collaborative machine learning. By enabling multiple client devices to train a shared global model without transferring raw data, FL offers significant privacy advantages. However, real-world deployments of FL are constrained by critical challenges such as data heterogeneity, client unreliability, and hardware disparities. These factors lead to uneven model convergence, degraded global accuracy, and fairness issues that threaten FL's scalability and inclusivity in diverse environments.
This dissertation investigates these challenges and proposes three novel algorithmic frameworks to advance the state-of-the-art in robust and fair FL. First, we address the computational bottlenecks of existing robust aggregation methods. Prior approaches rely heavily on statistical analysis of client updates, making them unsuitable for resource-constrained servers or edge-computing environments. To overcome this, FedASL (Federated Learning with Auto-weighted Aggregation based on Standard Deviation of Training Loss) leverages only the local training loss reported by clients to dynamically reweight their contributions. This lightweight approach effectively mitigates the impact of adversarial or unreliable clients, achieving comparable or superior global model accuracy while reducing computational costs by an order of magnitude.
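To make the loss-based weighting idea concrete, the following Python sketch shows one way a server could reweight client updates using only the mean and standard deviation of the losses clients report. It is illustrative only: the function name, the Gaussian-style weighting rule, and the example values are assumptions, not the exact FedASL formulation from the dissertation.

import numpy as np

def auto_weighted_aggregate(client_updates, client_losses, eps=1e-8):
    # Aggregate client model updates, weighting each client by how close its
    # reported training loss is to the cohort statistics. Illustrative sketch:
    # FedASL's precise rule is defined in the dissertation; here we assume a
    # Gaussian-style score based on the standard deviation of reported losses.
    losses = np.asarray(client_losses, dtype=float)
    mu, sigma = losses.mean(), losses.std() + eps
    # Clients whose loss deviates strongly from the mean (e.g., corrupted or
    # adversarial data) receive exponentially smaller weights.
    z = (losses - mu) / sigma
    weights = np.exp(-0.5 * z ** 2)
    weights /= weights.sum()
    # Weighted average of the flattened model updates (one vector per client).
    updates = np.stack([np.asarray(u, dtype=float) for u in client_updates])
    return weights @ updates

# Example: the fourth client reports an abnormally high loss and is downweighted.
updates = [np.ones(4) * i for i in range(1, 5)]
losses = [0.42, 0.45, 0.40, 3.10]
print(auto_weighted_aggregate(updates, losses))

Because the rule consumes only scalar losses, the server never inspects individual parameter updates, which is what keeps the aggregation cost low.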
Second, we focus on optimizing resource usage in FL by minimizing unnecessary computation and communication from low-contributing clients. Existing methods either require server-side client profiling, which compromises anonymity, or force all clients to participate regardless of their data quality. To address this, FedSRC (Federated Learning with Self-Regulating Clients) introduces a client-side checkpoint mechanism combining local test loss and a Refined Heterogeneity Index (RHI). This allows clients to autonomously assess their participation, abstaining from rounds where their contribution may harm global convergence. Experiments across four datasets demonstrate that FedSRC achieves up to a 30% reduction in communication costs and a 55% reduction in computation costs, while preserving privacy and maintaining high model performance.
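The sketch below illustrates the kind of client-side checkpoint FedSRC describes, using only quantities available on the device: the test loss of the received global model on local data and a heterogeneity score. The refined_heterogeneity_index stand-in (a total-variation distance between local and estimated global label distributions), the decision rule, and the thresholds are assumptions for illustration; the dissertation defines the actual RHI and checkpoint.

import numpy as np

def refined_heterogeneity_index(local_label_counts, global_label_fractions):
    # Hypothetical stand-in for the Refined Heterogeneity Index (RHI): the
    # total-variation distance between the client's label distribution and an
    # estimate of the global one. The dissertation's RHI may differ.
    local = np.asarray(local_label_counts, dtype=float)
    local = local / local.sum()
    return 0.5 * float(np.abs(local - np.asarray(global_label_fractions, dtype=float)).sum())

def should_participate(local_test_loss, rhi, loss_threshold=2.0, rhi_threshold=0.6):
    # Client-side checkpoint: skip the round when the local test loss on the
    # received global model and the heterogeneity index are both high, i.e.,
    # when the update is likely to pull the global model off course.
    # Thresholds here are illustrative placeholders.
    return not (local_test_loss > loss_threshold and rhi > rhi_threshold)

# Example: a client with highly skewed labels and a poor local test loss
# abstains (should_participate returns False).
rhi = refined_heterogeneity_index([98, 1, 1], [0.33, 0.33, 0.34])
print(rhi, should_participate(local_test_loss=2.8, rhi=rhi))

Since the check runs entirely on the client, no profiling information leaves the device, which is how the mechanism avoids the anonymity cost of server-side client selection.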
Finally, we address fairness in FL under hardware and model heterogeneity. Traditional fairness-aware FL approaches assume homogeneous model architectures and often degrade performance by constraining all devices to the capabilities of the weakest client. FairHetero introduces a hardware-sensitive fairness framework that applies layered reweighting to balance intra-group (data-level) and inter-group (hardware-level) disparities. This tunable framework reduces performance variance across clients, enabling equitable participation without sacrificing the benefits of stronger devices. Theoretical analyses and extensive experiments validate the efficacy of these frameworks. Collectively, they provide robust, fair, and resource-efficient solutions that pave the way for scalable, privacy-preserving federated learning in heterogeneous real-world settings.
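As a rough illustration of layered reweighting, the Python sketch below first weights hardware groups by their average loss (inter-group) and then weights clients within each group by their individual losses (intra-group), with tunable exponents acting as fairness knobs. The function, the power-based weighting, and the parameters q_intra and q_inter are assumptions for illustration, not the FairHetero formulation.

import numpy as np

def layered_fair_weights(client_losses, hardware_groups, q_intra=1.0, q_inter=1.0):
    # Two-level reweighting sketch in the spirit of FairHetero: clients with
    # higher loss inside a hardware group get larger intra-group weights, and
    # hardware groups with higher average loss get larger inter-group weights.
    # The exponents q_intra / q_inter are assumed tunable fairness knobs.
    losses = np.asarray(client_losses, dtype=float)
    groups = np.asarray(hardware_groups)
    weights = np.zeros_like(losses)
    group_ids = np.unique(groups)
    # Inter-group weights: emphasize hardware tiers that lag behind on average.
    group_mean = np.array([losses[groups == g].mean() for g in group_ids])
    inter = group_mean ** q_inter
    inter /= inter.sum()
    for g, w_g in zip(group_ids, inter):
        idx = groups == g
        # Intra-group weights: emphasize clients that lag within their tier.
        intra = losses[idx] ** q_intra
        intra /= intra.sum()
        weights[idx] = w_g * intra
    return weights / weights.sum()

# Example: two hardware tiers ("low", "high") with two clients each.
print(layered_fair_weights([0.9, 1.4, 0.3, 0.5], ["low", "low", "high", "high"]))

Raising q_inter shifts emphasis toward the weaker hardware tier, while raising q_intra shifts emphasis toward struggling clients inside each tier, which is the sense in which the fairness trade-off is tunable.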
Keywords
Federated Learning, Privacy, Fairness, Data Quality, Efficiency, Savings
Disciplines
Computer and Systems Architecture
License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Recommended Citation
Talukder, Zahidur Rahim, "Fair and Sustainable Machine Learning: A Holistic Approach to Data Quality, Efficiency, and Resource-Aware Training" (2025). Computer Science and Engineering Dissertations. 416.
https://mavmatrix.uta.edu/cse_dissertations/416