Graduation Semester and Year

2019

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

David Levine

Abstract

The ATLAS Experiment is one of the four major particle-detector experiments at the Large Hadron Collider (LHC) at CERN (birthplace of the World Wide Web). ATLAS was one of the LHC experiments that reported the discovery of the Higgs boson in July 2012. At the end of 2018, CERN's tape-based data archive had reached 330 PB. Through the Worldwide LHC Computing Grid (WLCG), a distributed computing infrastructure, the calibrated data from the particle accelerator is split into chunks and distributed around the world for analysis. The WLCG runs more than two million jobs per day. At the peak data-capture rate, 10 GB of data is transferred per second and may require immediate storage. The workflow management system known as PanDA (Production and Distributed Analysis) handles the data analysis jobs for the LHC's ATLAS Experiment. The University of Texas at Arlington hosts two compute and storage data centers, together known as the SouthWest Tier II, to process the ATLAS data. SouthWest Tier II (SWT2) is a consortium between The University of Texas at Arlington and the University of Oklahoma. This thesis focuses on finding an efficient way to compensate for the limits of the available hardware, and to make better use of it, with a caching mechanism. The caching mechanism (called Xcache), which uses the XRootD system and the ROOT protocol, is installed on one of the machines of the cluster. The machine acts as a file caching proxy server (with respect to the network) that redirects incoming client requests for data files to the Redirector at CPB (Chemistry-Physics Building, UT Arlington), thereby acting as a direct-mode proxy. In this process, the machine caches data files in its storage space (106 TB), from which they can be reused for later requests (disk caching storage).
This research focuses on the adoption of Xcache into the cluster and on identifying its network dependencies and the performance parameters of the cache (cache rotation using high and low watermarks, and bytes input and bytes output for monitoring the network). Thus Xcache, a proxy caching mechanism used to address bandwidth and access latency by reducing network traffic, is also used to optimize the storage servers.

Keywords

Cloud computing, High throughput computing

Disciplines

Computer Sciences | Physical Sciences and Mathematics

Comments

Degree granted by The University of Texas at Arlington
