Graduation Semester and Year

Spring 2026

Language

English

Document Type

Thesis

Degree Name

Master of Science in Computer Science

Department

Computer Science and Engineering

First Advisor

Jeff Lei

Second Advisor

Faysal Hossain Shezan

Third Advisor

Christoph Csallner

Abstract

Anonymized summarization aims to produce useful summaries while replacing person names with placeholders such as [PERSON]. In practice, privacy is often evaluated by checking that no real names appear in the generated text. However, suppressing a name in the output does not guarantee that the model has removed identity information from its internal computation. This gap is critical in collaborative inference frameworks, where intermediate internal states (ISs) are exposed for safety auditing. We study this question in a controlled setting where a language model reads a document containing a real name and is required to generate a summary that replaces that name with [PERSON]. We ask two questions: does the model still internally encode the true identity at the point where anonymization occurs, and does generating the placeholder actually depend on that identity information? Across two instruction-tuned 7–8B decoder-only language models evaluated on CNN/DailyMail, we find that identity remains strongly and linearly recoverable from hidden representations during anonymized summarization. Yet removing identity-related signal from the internal state used for prediction does not impair the model’s ability to generate [PERSON]. When the model is instead required to produce the real name, the same intervention causes a substantial increase in generation loss, confirming that the method detects true dependence. These results show that output-level anonymization can hide names in text while leaving identity information intact inside the model, revealing a privacy risk that output inspection alone cannot detect.

Keywords

AI/ML Interpretability, AI/ML Security and Privacy, Natural Language Processing, Large Language Models Privacy

Disciplines

Computational Engineering

Available for download on Wednesday, October 21, 2026
