Xiangzhe's Homepage


Xiangzhe Xu

I’m Xiangzhe, a Ph.D. student at Purdue University, advised by Prof. Xiangyu Zhang. I obtained my B.Eng. degree from Nanjing University. My research focuses on combining machine learning techniques with binary program analysis. Specifically, I work on recovering high-level information (e.g., program behaviors, variable names, types) from lower-level programs (e.g., stripped binary programs). I use program analysis to construct better features from programs (e.g., input/output values reflecting dynamic program behaviors, state machines reflecting input specifications), and I use program analysis techniques to enhance the performance of machine learning models (e.g., making binary program models more robust by preventing them from overemphasizing binary instructions that are not important to program semantics, and improving the performance of LLMs at recovering variable names from binary programs by formulating name recovery as a type-inference-like task).

Email: xzx@purdue.edu

Publications

PEM: Representing Binary Program Semantics for Similarity Analysis via A Probabilistic Execution Model, Xiangzhe Xu*, Zhou Xuan*, Shiwei Feng, Siyuan Cheng, Yapeng Ye, Qingkai Shi, Guanhong Tao, Le Yu, Zhuo Zhang, Xiangyu Zhang. The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE’23). PDF, Full-length version

Extracting Protocol Format as State Machine via Controlled Static Loop Analysis, Qingkai Shi, Xiangzhe Xu, Xiangyu Zhang. The USENIX Security Symposium (USENIX’23). PDF

LmPa: Improving Decompilation by Synergy of Large Language Model and Program Analysis, Xiangzhe Xu, Zhuo Zhang, Zian Su, Ziyang Huang, Shiwei Feng, Yapeng Ye, Nan Jiang, Danning Xie, Siyuan Cheng, Lin Tan, Xiangyu Zhang. PDF

Improving Binary Code Similarity Transformer Models by Semantics-Driven Instruction Deemphasis, Xiangzhe Xu, Shiwei Feng, Yapeng Ye, Guangyu Shen, Zian Su, Siyuan Cheng, Guanhong Tao, Qingkai Shi, Zhuo Zhang, and Xiangyu Zhang. The 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA’23). PDF

Automatic Generation and Validation of Instruction Encoders and Decoders, Xiangzhe Xu, Jinhua Wu, Yuting Wang*, Zhenguo Yin and Pengfei Li. The 33rd International Conference on Computer-Aided Verification (CAV’21). PDF

CompCertELF: Verified Separate Compilation of C Programs into ELF Object Files, Yuting Wang, Xiangzhe Xu, Pierre Wilke, Zhong Shao. The 2020 ACM International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA’20). PDF

CPC: Automatically Classifying and Propagating Natural Language Comments via Program Analysis, Juan Zhai, Xiangzhe Xu, Yu Shi, Guanhong Tao, Minxue Pan, Shiqing Ma, Lei Xu, Weifeng Zhang, Lin Tan, Xiangyu Zhang. Proceedings of the 42nd International Conference on Software Engineering (ICSE’20). PDF

Services

Review: ACM Transactions on Software Engineering and Methodology (TOSEM)

Artifact Evaluation: IEEE/ACM International Symposium on Code Generation and Optimization (CGO’24), ACM Conference on Computer and Communications Security (CCS’23)

Track Scheduling Co-Chair: The 42nd International Conference on Software Engineering (ICSE’20)