Graduation Semester and Year

Spring 2024



Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science


Department

Computer Science and Engineering

First Advisor

Shirin Nilizadeh


Social media has become a powerful tool that reflects both the best and worst aspects of human communication. It allows individuals to freely express opinions, communicate with others, and learn about news stories. On the other hand, social media platforms have become fertile ground for several forms of abuse, harassment, and the dissemination of misinformation. To counteract the spread of abuse and misinformation, these platforms have established and employed content moderation.

Several critical challenges hinder understanding of the social media content moderation ecosystem. This dissertation investigates various aspects of content moderation, including its coverage, fairness, and effectiveness. First, it investigates how, in practice, different social media platforms moderate content to detect hate speech and misinformation. Second, it examines the types of content these platforms do not moderate, e.g., misinformation about technologies. Third, it proposes methods for auditing the fairness of black-box algorithms used for content moderation. Fourth, it proposes methods for examining the effects of changes to content moderation policies on a platform's content and users' discourse. Finally, it investigates how users (mis)use content moderation reporting mechanisms in socio-political movements.

The outcomes and key findings of this dissertation are as follows. First, this dissertation evaluates, systematizes, and contextualizes existing knowledge about social media content moderation. It provides two novel frameworks for detecting misinformation about phishing websites and about security and privacy concerns related to Zoom. Auditing the fairness and bias of Yelp's review recommendation and ranking system revealed that the system disproportionately filters reviews written by new users as not recommended, and that restaurants in hotspots have higher exposure and hence better rankings in search results. A Difference-in-Differences (DiD) analysis of Parler's moderation changes indicated that users' hateful and harmful content decreased significantly. Using large-scale data from Twitter, we found that malicious users misuse existing content moderation functionalities to suppress and restrict other users, inflate fake hashtags, and entice users into malicious hacking schemes.

Overall, this dissertation provides the first comprehensive overview of current social media content moderation research, how content moderation guidelines are enforced by different platforms, and the challenges that remain in their implementation. It contributes to the existing knowledge on content moderation while providing the community with novel frameworks for identifying misinformation about security and privacy risks, auditing black-box content moderation and ranking systems, investigating the effectiveness of stricter content moderation, and identifying the (mis)use of content moderation functionalities. Additionally, this dissertation puts forward important challenges and provides solutions for effective content moderation. Future scholarship can use this dissertation to investigate and propose solutions for the identified gaps.


Keywords

Content Moderation, Social Media, Misinformation, Security and Privacy, Fairness and Bias, First Amendment, Section 230, Yelp, Parler, Twitter, Facebook, Reddit, Instagram, Ranking Systems, Black-Box Moderation


Disciplines

Artificial Intelligence and Robotics | Computer Law | Computer Sciences | Databases and Information Systems | Data Science | First Amendment | Information Security | Internet Law | Law | Social and Behavioral Sciences | Social Media


Creative Commons Attribution 4.0 International License
This work is licensed under a Creative Commons Attribution 4.0 International License.

Available for download on Wednesday, May 28, 2025