ORCID Identifier(s)

0000-0001-6444-2623

Graduation Semester and Year

2020

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Shirin Nilizadeh

Abstract

Phishing websites are one of the most pervasive online attack vectors, with nearly 1.5 million such websites created every month. Social media is the primary breeding ground for phishing attacks, with 86% of them originating on platforms such as Twitter, Facebook, and LinkedIn. Prevalent defenses against these attacks include URL scanners, anti-phishing blacklists, and social media platforms' own detection systems. In this work, we focus on Twitter and, through a combination of data-driven methods and emulations, evaluate the verdicts provided by URL scanners and by Twitter's detection system. We show that these sources provide a substantial amount of misinformation, which can not only lead users to visit malicious websites but also decrease web traffic to legitimate websites. In particular, we analyzed 40k unique URLs obtained from 1 million tweets, grouping them into 5 categories based on their characteristics, and found that Twitter is consistently unreliable at labelling URLs from 2 of these categories: phishing websites hosted under trusted domains, and benign URLs hosted under suspicious URL shortening services or free web hosting domains. We also found that Twitter does not proactively suspend or remove accounts that continuously post malicious links, instead relying heavily on user reports before suspending them. In addition, about 71% of the URLs that URL scanning engines flagged as malicious were actually benign. Three factors frequently lead to these false positives: URL domain bias, web hosting bias, and reliance on PhishTank, a public URL blacklist. We also discovered a new form of phishing attack that leverages popular web domains and implements two obfuscation methods to remain undetected for more than a month.
Finally, we conducted an IRB-approved survey study on Amazon Mechanical Turk, in which we evaluated the impact of the misinformation provided by both URL scanning engines and Twitter on users' perception of the websites. We found that users have more confidence in certain URL features than in others and that, in the latter case, they rely heavily on the detection tools, irrespective of whether the tools' verdicts are correct.

Keywords

Twitter, Security, Misinformation, Phishing, Anti-phishing, PhishTank, VirusTotal, Websites, Social

Disciplines

Computer Sciences | Physical Sciences and Mathematics

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Comments

Degree granted by The University of Texas at Arlington

Recommended Citation

Saha Roy, Sayak, ""How Good are They?" - A State of the Effectiveness of Anti-Phishing Tools on Twitter" (2020). Computer Science and Engineering Theses. 63.
https://mavmatrix.uta.edu/cse_theses/63

Included in

Computer Sciences Commons

Computer Science and Engineering Theses

"How Good are They?" - A State of the Effectiveness of Anti-Phishing Tools on Twitter