Xiaolei Ren

Graduation Semester and Year




Document Type


Degree Name

Doctor of Philosophy in Computer Science


Computer Science and Engineering

First Advisor

Yu Lei

Second Advisor

Jiang Ming

Third Advisor

Hao Che

Fourth Advisor

Junzhou Huang


ABSTRACT: Binary diffing is a technique for comparing executable files and identifying their differences or similarities without access to source code. Its potential applications in software security tasks, such as vulnerability search, code clone detection, and malware analysis, have generated a vast body of literature in recent years. A recurring theme in binary diffing research is evaluating resilience against compiler optimization, the most common source of syntactic differences in binary code. Although most binary diffing tools claim to be immune to compiler optimization, recent studies have highlighted the need for the research community to revisit this claim, particularly regarding non-default optimization settings and function inlining. In this study, we investigate the effect of peephole optimization on binary diffing analysis. Peephole optimization is a feature of mainstream compilers that performs local rewriting of the input program: it replaces instruction sequences within a small window (the peephole) with shorter or faster, functionally equivalent instruction sequences. Our research reveals that peephole optimization primarily affects binary code differences at the intra-procedural level, which contradicts the assumptions made by basic-block-centric comparison approaches. We conducted systematic experiments using LLVM’s unit test suite and customized Alive2, an LLVM translation validation tool, to isolate the impact of peephole optimization from the overall optimization process. Our investigation determines the pervasiveness of peephole optimization in the resulting compiled code and explores its effects on current binary diffing techniques. The noticeable decline in the performance of these techniques highlights the importance of considering peephole optimization in the analysis and improvement of binary diffing methodologies.
Therefore, our findings suggest that researchers and practitioners should consider the impact of peephole optimization when developing and evaluating binary diffing tools. Further research is necessary to address this challenge and improve the effectiveness of binary diffing in various software security tasks.
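
To make the notion of a peephole concrete, the following is a minimal sketch of a peephole pass over a toy instruction list. It is an illustration only, not the dissertation's method or LLVM's implementation: the tuple-based instruction format, the three rewrite rules, and the names `peephole`, `sample`, and `optimized` are all assumptions introduced here.

```python
from typing import List, Tuple

# Hypothetical toy instruction: (opcode, operand, operand, ...), all strings.
Instr = Tuple[str, ...]


def peephole(code: List[Instr]) -> List[Instr]:
    """Rewrite short instruction windows into shorter or cheaper equivalents."""
    out: List[Instr] = []
    i = 0
    while i < len(code):
        cur = code[i]
        nxt = code[i + 1] if i + 1 < len(code) else None

        # Window of two: in this toy model, "push r; pop r" has no net effect.
        if (nxt is not None and len(cur) == 2 and len(nxt) == 2
                and cur[0] == "push" and nxt[0] == "pop" and cur[1] == nxt[1]):
            i += 2
            continue

        # Window of one: multiplying by 2 becomes a left shift by 1.
        if len(cur) == 4 and cur[0] == "mul" and cur[3] == "2":
            out.append(("shl", cur[1], cur[2], "1"))
        # Window of one: adding 0 becomes a plain register move.
        elif len(cur) == 4 and cur[0] == "add" and cur[3] == "0":
            out.append(("mov", cur[1], cur[2]))
        else:
            out.append(cur)
        i += 1
    return out


sample: List[Instr] = [
    ("mul", "r1", "r0", "2"),
    ("push", "r2"),
    ("pop", "r2"),
    ("add", "r3", "r1", "0"),
    ("ret",),
]
optimized = peephole(sample)

# The opcode sequences differ even though the computation is the same,
# which is the kind of syntactic difference a diffing tool has to tolerate.
print([ins[0] for ins in sample])
print([ins[0] for ins in optimized])
```

In this sketch the optimized sequence is shorter and uses different opcodes than the input while computing the same values, so a comparison that relies on matching instructions within a block would see two dissimilar blocks.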


KEYWORDS: Compiler optimization, Binary code, Peephole optimization


Computer Sciences | Physical Sciences and Mathematics


Degree granted by The University of Texas at Arlington

Available for download on Friday, August 01, 2025