Graduation Semester and Year
Fall 2025
Language
English
Document Type
Dissertation
Degree Name
Doctor of Philosophy in Computer Science
Department
Computer Science and Engineering
First Advisor
Chengkai Li
Second Advisor
Shirin Nilizadeh
Third Advisor
Jun Yang
Fourth Advisor
Kenny Zhu
Abstract
The modern digital information ecosystem is defined by the rapid, large-scale production and dissemination of factual claims across social media and online news platforms. While this ecosystem enables unprecedented access to information, it simultaneously exacerbates two intertwined challenges that threaten both human understanding and the reliability of AI-driven systems. First, the sheer volume, redundancy, and topical diversity of factual claims render manual organization and analysis infeasible, while existing automated methods lack the semantic granularity and interpretability required for meaningful exploration. Second, even when individual statements are factually correct, selective presentation of evidence (commonly known as cherry-picking) can distort narratives, mislead audiences, and introduce subtle informational bias into the data pipelines that increasingly train and inform large language models (LLMs). This dissertation addresses these challenges by examining both the utility of LLMs for structuring factual claims at scale and their vulnerability to selectively biased information.
The first major contribution of this dissertation is LLMTaxo, a novel end-to-end framework that automatically constructs fine-grained hierarchical taxonomies of factual claims within a topic domain from social media data. LLMTaxo is designed to transform massive, noisy collections of user-generated content into structured semantic representations that support interpretability, navigation, and downstream analysis. The framework integrates multiple components: 1) check-worthy claim detection to filter irrelevant content, 2) semantic clustering to identify distinct claims and reduce redundancy, and 3) prompt-based topic generation using LLMs to assign claims to a three-level hierarchy of broad, medium, and detailed topics. To stabilize topic generation and mitigate uncontrolled label proliferation, LLMTaxo incorporates a human-in-the-loop process for creating learning examples and a seed taxonomy, enabling few-shot prompting that guides LLMs toward consistent, conceptually coherent outputs.
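To make the prompting step concrete, the sketch below shows how a seed taxonomy and learning examples might anchor few-shot topic assignment; the prompt wording, seed entries, and example claim are illustrative assumptions, not the dissertation's exact implementation.

```python
# Minimal sketch of LLMTaxo-style few-shot topic assignment. The prompt
# wording, seed taxonomy entries, and example are illustrative
# assumptions, not the dissertation's exact implementation.

SEED_TAXONOMY = """\
Vaccine Safety > Side Effects > Myocarditis
Vaccine Safety > Side Effects > Allergic Reactions
Vaccine Efficacy > Variants > Omicron Protection"""

FEW_SHOT_EXAMPLES = """\
Claim: "The booster caused heart inflammation in teens."
Topics: Vaccine Safety > Side Effects > Myocarditis"""

def build_prompt(claim: str) -> str:
    """Assemble a few-shot prompt that anchors the LLM to the seed
    taxonomy, discouraging uncontrolled proliferation of new labels."""
    return (
        "Assign the claim to broad > medium > detailed topics. "
        "Reuse existing topics whenever possible.\n\n"
        f"Existing taxonomy:\n{SEED_TAXONOMY}\n\n"
        f"Examples:\n{FEW_SHOT_EXAMPLES}\n\n"
        f'Claim: "{claim}"\nTopics:'
    )
```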
This dissertation formally defines a taxonomy not merely as a set of labels, but as a structured semantic scaffold that encodes a general-to-specific interpretive pathway. Each topic inherits contextual meaning from its position in the hierarchy, allowing identical labels to represent distinct concepts under different parents. By adopting a single-inheritance tree structure, the taxonomy ensures unambiguous semantic trajectories, facilitating consistent annotation, aggregation, and evaluation.
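The single-inheritance structure can be pictured with a small tree node whose meaning is its full root-to-node path; the sketch below is a hypothetical illustration, not the dissertation's data model.

```python
from dataclasses import dataclass, field

@dataclass
class TopicNode:
    """One node in a single-inheritance taxonomy tree. A node's meaning
    is its full root-to-node path, so the same label can denote distinct
    concepts under different parents."""
    label: str
    parent: "TopicNode | None" = None
    children: list["TopicNode"] = field(default_factory=list)

    def add_child(self, label: str) -> "TopicNode":
        child = TopicNode(label, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """General-to-specific interpretive pathway, e.g.
        'COVID-19 Vaccines > Vaccine Safety > Side Effects'."""
        parts, node = [], self
        while node is not None:
            parts.append(node.label)
            node = node.parent
        return " > ".join(reversed(parts))

# The same label under two parents yields two unambiguous trajectories:
root = TopicNode("COVID-19 Vaccines")
print(root.add_child("Vaccine Safety").add_child("Side Effects").path())
```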
To rigorously assess taxonomy quality, this dissertation introduces new evaluation metrics. These metrics measure the taxonomy's clarity, hierarchical coherence, orthogonality, and completeness, as well as claim-topic alignment and topic granularity. Both automated and human-centered evaluations are used to reduce evaluation bias. Extensive experiments are conducted on three large-scale social media datasets from X and Facebook, spanning diverse domains including COVID-19 vaccines, climate change, and cybersecurity. Results demonstrate that LLMTaxo produces compact, interpretable, and semantically coherent taxonomies, reducing topic fragmentation by up to 99.5% compared to prompting approaches without structural guidance. The evaluations show high reliability, with strong inter-annotator agreement, confirming that the generated taxonomies align well with human conceptual organization.
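Two of the reported quantities can be illustrated as below; the metric definitions, label counts, and annotations are hypothetical assumptions rather than the dissertation's formulas or data.

```python
# Illustrative computation of two reported quantities; the metric
# definitions, counts, and labels here are hypothetical assumptions.
from sklearn.metrics import cohen_kappa_score

def fragmentation_reduction(n_unguided: int, n_guided: int) -> float:
    """Relative shrinkage in distinct topic labels once the seed
    taxonomy and few-shot examples constrain generation."""
    return 1.0 - n_guided / n_unguided

# e.g., 2,000 unguided labels collapsing to 10 guided ones -> 99.5%
print(f"{fragmentation_reduction(2000, 10):.1%}")

# Chance-corrected inter-annotator agreement on claim-topic judgments
annotator_a = ["aligned", "aligned", "misaligned", "aligned"]
annotator_b = ["aligned", "misaligned", "misaligned", "aligned"]
print(cohen_kappa_score(annotator_a, annotator_b))
```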
The fine-grained taxonomies produced by LLMTaxo enable new analytical capabilities that are difficult to achieve with coarse topic labels. This dissertation demonstrates how taxonomy-guided organization supports granular analysis of public truthfulness stances toward factual claims, revealing how user attitudes vary not only by claim veracity but also by specific topical subdomain. Building on this foundation, the dissertation presents TrustMap, an interactive system that integrates hierarchical claim organization with truthfulness stance detection and geospatial analysis to visualize how social media users across regions respond to true, false, and mixed claims. The dissertation also presents an analysis of social media users' truthfulness stances across topics within the climate change domain specifically. These applications illustrate how structured claim taxonomies can transform unstructured discourse into interpretable knowledge representations that support social sensing and misinformation research.
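A TrustMap-style regional aggregation can be sketched with a small groupby; the column names, stance labels, and records below are hypothetical.

```python
# Sketch of a TrustMap-style regional aggregation; the columns, stance
# labels, and records are hypothetical.
import pandas as pd

posts = pd.DataFrame({
    "region":   ["TX", "TX", "CA", "CA", "CA"],
    "veracity": ["false", "false", "false", "true", "true"],
    "stance":   ["positive", "negative", "negative", "positive", "positive"],
})

# Per region and claim-veracity class, the share of posts taking each
# truthfulness stance -- the quantity a map view would color by.
counts = posts.groupby(["region", "veracity", "stance"]).size()
shares = counts / counts.groupby(level=["region", "veracity"]).transform("sum")
print(shares)
```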
The second major contribution of this dissertation investigates cherry-picking as a critical yet understudied form of informational bias. Unlike outright misinformation, cherry-picking constructs misleading narratives through the selective inclusion of factually correct statements while omitting equally important counterevidence. This makes cherry-picking difficult to detect and particularly dangerous in the context of LLMs, which are trained on and conditioned by large corpora of online text. The dissertation proposes an importance-based computational methodology for detecting cherry-picked content in news articles by identifying missing but salient information necessary for a balanced presentation. A dedicated dataset is introduced to support the study of cherry-picking detection.
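One way such an importance-based check could look in code is sketched below: flag salient statements from broader coverage that the target article never covers. The embedding model, importance scores, and similarity threshold are assumptions, not the dissertation's configuration.

```python
# Hedged sketch of importance-based missing-information detection: flag
# salient statements from broader coverage that the target article never
# covers. The model choice, threshold, and scores are assumptions.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

model = SentenceTransformer("all-MiniLM-L6-v2")

def missing_salient_statements(article_sents, pool, importance, tau=0.6):
    """Return statements in `pool` with high importance whose best match
    among the article's sentences falls below similarity threshold tau."""
    art_emb = model.encode(article_sents)
    pool_emb = model.encode(pool)
    best_match = cosine_similarity(pool_emb, art_emb).max(axis=1)
    return [s for s, sim, w in zip(pool, best_match, importance)
            if sim < tau and w > 0.5]
```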
To further study the impact of cherry-picking on LLMs, the dissertation presents the first systematic investigation of how cherry-picked evidence influences the belief states of modern LLMs. Through controlled experimental designs, the work evaluates multiple state-of-the-art LLMs under varying evidence conditions, disentangling the effects of selective factual evidence from those of user stance. The results demonstrate that LLMs are consistently susceptible to informational bias introduced by cherry-picking, exhibiting significant belief shifts when different factual evidence is provided. These findings reveal LLMs' vulnerability to cherry-picked information and highlight the risks of deploying them in high-stakes reasoning environments without safeguards against selective truth.
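A single condition of such a controlled design might be probed as below; query_llm() is a placeholder for any chat-completion call, and the probe wording and 0-100 scale are illustrative assumptions.

```python
# Sketch of one controlled condition in the belief-shift experiments.
# query_llm() is a placeholder for any chat-completion call; the probe
# wording and 0-100 scale are illustrative assumptions.

def query_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to the LLM under test, return its reply."""
    raise NotImplementedError

def belief_probe(claim: str, evidence: list[str] | None = None) -> str:
    """Elicit a belief rating for the claim, optionally conditioning on
    one-sided (cherry-picked) factual evidence."""
    prompt = f'Rate from 0 to 100 how likely it is that "{claim}" is true.'
    if evidence:
        bullets = "\n".join(f"- {e}" for e in evidence)
        prompt = f"Consider the following evidence:\n{bullets}\n\n{prompt}"
    return query_llm(prompt)

# Belief shift = rating under supporting-only evidence minus the
# no-evidence baseline, with the claim and stated user stance held fixed.
```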
Overall, this dissertation advances the state of the art in computational analysis of factual claims by contributing: 1) a novel framework for fine-grained taxonomy construction using LLMs, 2) rigorous evaluation metrics for assessing taxonomy quality and claim-topic alignment, 3) real-world analytical systems that leverage hierarchical claim organization to study public discourse, and 4) a novel empirical understanding of how selective evidence undermines LLM belief reliability. By unifying structural organization and bias analysis, this work provides both practical tools and conceptual insights for building more interpretable, reliable, and trustworthy AI systems in an increasingly complex information landscape.
Keywords
Taxonomy, Factual claim, Social media, LLM
Disciplines
Artificial Intelligence and Robotics | Data Science | Social Media
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Recommended Citation
Zhang, Haiqi, "EXAMINING THE UTILITY AND VULNERABILITY OF LARGE LANGUAGE MODELS FOR FACTUAL CLAIM ORGANIZATION AND VERIFICATION" (2025). Computer Science and Engineering Dissertations. 430.
https://mavmatrix.uta.edu/cse_dissertations/430
Included in
Artificial Intelligence and Robotics Commons, Data Science Commons, Social Media Commons