- [2025/02] Medical Multimodal Model Stealing Attacks via Adversarial Domain Alignment
- [2025/01] Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs
- [2024/10] Achilles' Heel in Semi-open LLMs: Hiding Bottom against Recovery Attacks
- [2024/09] Alignment-Aware Model Extraction Attacks on Large Language Models
- [2024/08] Pre-trained Encoder Inference: Revealing Upstream Encoders In Downstream Machine Learning Services
- [2024/04] TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge Deployment
- [2024/03] Logits of API-Protected LLMs Leak Proprietary Information
- [2024/03] Stealing Part of a Production Language Model
- [2024/02] Recovering the Pre-Fine-Tuning Weights of Generative Models
- [2023/12] Lion: Adversarial Distillation of Proprietary Large Language Models
- [2023/03] On Extracting Specialized Code Abilities from Large Language Models: A Feasibility Study
- [2023/03] Stealing the Decoding Algorithms of Language Models