Graduation Semester and Year
Summer 2025
Language
English
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Computer Engineering
Department
Computer Science and Engineering
First Advisor
Mohammad A. Islam
Second Advisor
Jia Rao
Third Advisor
William Beksi
Fourth Advisor
Jiang Ming
Abstract
The increasing reliance on distributed, privacy-sensitive data has driven the emergence of Federated Learning (FL) as a transformative paradigm for collaborative machine learning. By enabling multiple client devices to train a shared global model without transferring raw data, FL offers significant privacy advantages. However, real-world deployments of FL are constrained by critical challenges such as data heterogeneity, client unreliability, and hardware disparities. These factors lead to uneven model convergence, degraded global accuracy, and fairness issues that threaten FL's scalability and inclusivity in diverse environments.
This dissertation investigates these challenges and proposes three novel algorithmic frameworks to advance the state-of-the-art in robust and fair FL. First, we address the computational bottlenecks of existing robust aggregation methods. Prior approaches rely heavily on statistical analysis of client updates, making them unsuitable for resource-constrained servers or edge-computing environments. To overcome this, FedASL (Federated Learning with Auto-weighted Aggregation based on Standard Deviation of Training Loss) leverages only the local training loss reported by clients to dynamically reweight their contributions. This lightweight approach effectively mitigates the impact of adversarial or unreliable clients, achieving comparable or superior global model accuracy while reducing computational costs by an order of magnitude.
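To make the loss-based weighting idea concrete, the following Python sketch shows one way a server could reweight client updates using only the mean and standard deviation of the losses clients report. It is illustrative only: the function name, the Gaussian-style weighting rule, and the example values are assumptions, not the exact FedASL formulation from the dissertation.

import numpy as np

def auto_weighted_aggregate(client_updates, client_losses, eps=1e-8):
    # Aggregate client model updates, weighting each client by how close its
    # reported training loss is to the cohort statistics. Illustrative sketch:
    # FedASL's precise rule is defined in the dissertation; here we assume a
    # Gaussian-style score based on the standard deviation of reported losses.
    losses = np.asarray(client_losses, dtype=float)
    mu, sigma = losses.mean(), losses.std() + eps
    # Clients whose loss deviates strongly from the mean (e.g., corrupted or
    # adversarial data) receive exponentially smaller weights.
    z = (losses - mu) / sigma
    weights = np.exp(-0.5 * z ** 2)
    weights /= weights.sum()
    # Weighted average of the flattened model updates (one vector per client).
    updates = np.stack([np.asarray(u, dtype=float) for u in client_updates])
    return weights @ updates

# Example: the fourth client reports an abnormally high loss and is downweighted.
updates = [np.ones(4) * i for i in range(1, 5)]
losses = [0.42, 0.45, 0.40, 3.10]
print(auto_weighted_aggregate(updates, losses))

Because the rule consumes only scalar losses, the server never inspects individual parameter updates, which is what keeps the aggregation cost low.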
Second, we focus on optimizing resource usage in FL by minimizing unnecessary computation and communication from low-contributing clients. Existing methods either require server-side client profiling, which compromises anonymity, or force all clients to participate regardless of their data quality. To address this, FedSRC (Federated Learning with Self-Regulating Clients) introduces a client-side checkpoint mechanism combining local test loss and a Refined Heterogeneity Index (RHI). This allows clients to autonomously assess their participation, abstaining from rounds where their contribution may harm global convergence. Experiments across four datasets demonstrate that FedSRC achieves up to a 30% reduction in communication costs and a 55% reduction in computation costs, while preserving privacy and maintaining high model performance.
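The sketch below illustrates the kind of client-side checkpoint FedSRC describes, using only quantities available on the device: the test loss of the received global model on local data and a heterogeneity score. The refined_heterogeneity_index stand-in (a total-variation distance between local and estimated global label distributions), the decision rule, and the thresholds are assumptions for illustration; the dissertation defines the actual RHI and checkpoint.

import numpy as np

def refined_heterogeneity_index(local_label_counts, global_label_fractions):
    # Hypothetical stand-in for the Refined Heterogeneity Index (RHI): the
    # total-variation distance between the client's label distribution and an
    # estimate of the global one. The dissertation's RHI may differ.
    local = np.asarray(local_label_counts, dtype=float)
    local = local / local.sum()
    return 0.5 * float(np.abs(local - np.asarray(global_label_fractions, dtype=float)).sum())

def should_participate(local_test_loss, rhi, loss_threshold=2.0, rhi_threshold=0.6):
    # Client-side checkpoint: skip the round when the local test loss on the
    # received global model and the heterogeneity index are both high, i.e.,
    # when the update is likely to pull the global model off course.
    # Thresholds here are illustrative placeholders.
    return not (local_test_loss > loss_threshold and rhi > rhi_threshold)

# Example: a client with highly skewed labels and a poor local test loss
# abstains (should_participate returns False).
rhi = refined_heterogeneity_index([98, 1, 1], [0.33, 0.33, 0.34])
print(rhi, should_participate(local_test_loss=2.8, rhi=rhi))

Since the check runs entirely on the client, no profiling information leaves the device, which is how the mechanism avoids the anonymity cost of server-side client selection.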
Finally, we address fairness in FL under hardware and model heterogeneity. Traditional fairness-aware FL approaches assume homogeneous model architectures and often degrade performance by constraining all devices to the capabilities of the weakest client. FairHetero introduces a hardware-sensitive fairness framework that applies layered reweighting to balance intra-group (data-level) and inter-group (hardware-level) disparities. This tunable framework reduces performance variance across clients, enabling equitable participation without sacrificing the benefits of stronger devices. Theoretical analyses and extensive experiments validate the efficacy of these frameworks. Collectively, they provide robust, fair, and resource-efficient solutions that pave the way for scalable, privacy-preserving federated learning in heterogeneous real-world settings.
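As a rough illustration of layered reweighting, the Python sketch below first weights hardware groups by their average loss (inter-group) and then weights clients within each group by their individual losses (intra-group), with tunable exponents acting as fairness knobs. The function, the power-based weighting, and the parameters q_intra and q_inter are assumptions for illustration, not the FairHetero formulation.

import numpy as np

def layered_fair_weights(client_losses, hardware_groups, q_intra=1.0, q_inter=1.0):
    # Two-level reweighting sketch in the spirit of FairHetero: clients with
    # higher loss inside a hardware group get larger intra-group weights, and
    # hardware groups with higher average loss get larger inter-group weights.
    # The exponents q_intra / q_inter are assumed tunable fairness knobs.
    losses = np.asarray(client_losses, dtype=float)
    groups = np.asarray(hardware_groups)
    weights = np.zeros_like(losses)
    group_ids = np.unique(groups)
    # Inter-group weights: emphasize hardware tiers that lag behind on average.
    group_mean = np.array([losses[groups == g].mean() for g in group_ids])
    inter = group_mean ** q_inter
    inter /= inter.sum()
    for g, w_g in zip(group_ids, inter):
        idx = groups == g
        # Intra-group weights: emphasize clients that lag within their tier.
        intra = losses[idx] ** q_intra
        intra /= intra.sum()
        weights[idx] = w_g * intra
    return weights / weights.sum()

# Example: two hardware tiers ("low", "high") with two clients each.
print(layered_fair_weights([0.9, 1.4, 0.3, 0.5], ["low", "low", "high", "high"]))

Raising q_inter shifts emphasis toward the weaker hardware tier, while raising q_intra shifts emphasis toward struggling clients inside each tier, which is the sense in which the fairness trade-off is tunable.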
Keywords
Federated Learning, Privacy, Fairness, Data Quality, Efficiency, Savings
Disciplines
Computer and Systems Architecture
License
This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Recommended Citation
Talukder, Zahidur Rahim, "Fair and Sustainable Machine Learning: A Holistic Approach to Data Quality, Efficiency, and Resource-Aware Training" (2025). Computer Science and Engineering Dissertations. 416.
https://mavmatrix.uta.edu/cse_dissertations/416