Graduation Semester and Year

Spring 2026

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Jeff Lei

Second Advisor

Faysal Hossain Shezan

Third Advisor

Christoph Csallner

Abstract

Anonymized summarization aims to produce useful summaries while replacing person names with placeholders such as [PERSON]. In practice, privacy is often evaluated by checking that no real names appear in the generated text. However, suppressing a name in the output does not guarantee that the model has removed identity information from its internal computation. This gap is critical in collaborative inference frameworks, where intermediate internal states (ISs) are exposed for safety auditing. We study this gap in a controlled setting where a language model reads a document containing a real name and is required to generate a summary that replaces that name with [PERSON]. We ask two questions: does the model still internally encode the true identity at the point where anonymization occurs, and does generating the placeholder actually depend on that identity information? Across two instruction-tuned 7–8B decoder-only language models evaluated on CNN/DailyMail, we find that identity remains strongly and linearly recoverable from hidden representations during anonymized summarization. Yet removing identity-related signal from the internal state used for prediction does not impair the model’s ability to generate [PERSON]. When the model is instead required to produce the real name, the same intervention causes a substantial increase in generation loss, confirming that the intervention detects true dependence when it is present. These results show that output-level anonymization can hide names in text while leaving identity information intact inside the model, revealing a privacy risk that output inspection alone cannot detect.
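
The two questions in the abstract (is identity linearly recoverable, and does a given prediction depend on it) can be illustrated with a small synthetic sketch. Everything below is an assumption made for illustration and is not the thesis's models, data, or method: the hidden states are random vectors, `IDENTITY_DIR` and `PLACEHOLDER_DIR` are hypothetical directions, the probe is a simple difference-of-means classifier, and the intervention is an orthogonal projection that removes one estimated direction.

```python
import math
import random

random.seed(0)
DIM = 8

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

# Hypothetical directions: one carries identity, the other drives a placeholder readout.
IDENTITY_DIR = unit([1.0] + [0.0] * (DIM - 1))
PLACEHOLDER_DIR = unit([0.0, 1.0] + [0.0] * (DIM - 2))

def sample(n):
    """Synthetic states: unit Gaussian noise plus a +/-3 shift along IDENTITY_DIR."""
    states, labels = [], []
    for _ in range(n):
        y = random.randint(0, 1)
        shift = 3.0 if y == 1 else -3.0
        states.append([random.gauss(0.0, 1.0) + shift * u for u in IDENTITY_DIR])
        labels.append(y)
    return states, labels

def class_mean(states, labels, target):
    rows = [h for h, y in zip(states, labels) if y == target]
    return [sum(col) / len(rows) for col in zip(*rows)]

def fit_probe(states, labels):
    """Difference-of-means linear probe: returns (unit direction, threshold)."""
    mu0 = class_mean(states, labels, 0)
    mu1 = class_mean(states, labels, 1)
    w = unit([a - b for a, b in zip(mu1, mu0)])
    return w, 0.5 * (dot(w, mu0) + dot(w, mu1))

def accuracy(probe, states, labels):
    w, threshold = probe
    hits = sum((dot(w, h) > threshold) == (y == 1) for h, y in zip(states, labels))
    return hits / len(labels)

def erase(states, direction):
    """Project each state onto the subspace orthogonal to `direction`."""
    return [[x - dot(h, direction) * d for x, d in zip(h, direction)] for h in states]

fit_x, fit_y = sample(1000)      # estimates the direction to erase
train_x, train_y = sample(1000)  # fits the probes
test_x, test_y = sample(1000)    # held out for evaluation

# Question 1: is identity linearly recoverable from the states?
acc_before = accuracy(fit_probe(train_x, train_y), test_x, test_y)

# Intervention: remove the estimated identity direction, then re-fit a fresh probe.
erase_dir, _ = fit_probe(fit_x, fit_y)
train_erased = erase(train_x, erase_dir)
test_erased = erase(test_x, erase_dir)
acc_after = accuracy(fit_probe(train_erased, train_y), test_erased, test_y)

def mean_abs_shift(readout_dir):
    """Average change in a linear readout caused by the erasure."""
    pairs = zip(test_x, test_erased)
    return sum(abs(dot(readout_dir, a) - dot(readout_dir, b)) for a, b in pairs) / len(test_x)

# Question 2: which readouts depend on the erased signal?
placeholder_shift = mean_abs_shift(PLACEHOLDER_DIR)  # readout that ignores identity
name_shift = mean_abs_shift(IDENTITY_DIR)            # readout that relies on identity

print(f"probe accuracy before erasure: {acc_before:.3f}")
print(f"probe accuracy after erasure:  {acc_after:.3f}")
print(f"placeholder readout shift:     {placeholder_shift:.3f}")
print(f"name readout shift:            {name_shift:.3f}")
```

In this toy setup the probe separates the two identities well before erasure and falls to roughly chance afterwards, while only the readout aligned with the identity direction moves. That mirrors the logic of the abstract (information can be present in a state without being used for a particular prediction), but it says nothing about the magnitude of the effects reported for the real models.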

Keywords

AI/ML Interpretability, AI/ML Security and Privacy, Natural Language Processing, Large Language Models Privacy

Disciplines

Computational Engineering

License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Recommended Citation

Begum, Samreen, "MECHANISTIC AUDITING OF PRIVACY RETENTION IN LLM SUMMARIZATION" (2026). Computer Science and Engineering Theses - Archive. 543.
https://mavmatrix.uta.edu/cse_theses/543

Computer Science and Engineering Theses - Archive

MECHANISTIC AUDITING OF PRIVACY RETENTION IN LLM SUMMARIZATION
