- [2024/11] Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents
- [2024/09] Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey
- [2024/08] Attacks and Defenses for Generative Diffusion Models: A Comprehensive Survey
- [2024/07] Operationalizing a Threat Model for Red-Teaming Large Language Models (LLMs)
- [2024/07] AI Safety in Generative AI Large Language Models: A Survey
- [2024/07] The Emerged Security and Privacy of LLM Agent: A Survey with Case Studies
- [2024/06] Unique Security and Privacy Threats of Large Language Model: A Comprehensive Survey
- [2024/06] Safeguarding Large Language Models: A Survey
- [2024/06] Exploring Vulnerabilities and Protections in Large Language Models: A Survey
- [2024/03] Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
- [2024/03] Securing Large Language Models: Threats, Vulnerabilities and Responsible Practices
- [2024/03] Breaking Down the Defenses: A Comparative Survey of Attacks on Large Language Models
- [2024/03] On Protecting the Data Privacy of Large Language Models (LLMs): A Survey
- [2024/02] Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey
- [2024/02] Safety of Multimodal Large Language Models on Images and Text
- [2024/01] Security and Privacy Challenges of Large Language Models: A Survey
- [2024/01] Black-Box Access is Insufficient for Rigorous AI Audits
- [2024/01] Red Teaming Visual Language Models
- [2024/01] R-Judge: Benchmarking Safety Risk Awareness for LLM Agents
- [2024/01] Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems
- [2024/01] TrustLLM: Trustworthiness in Large Language Models
- [2023/12] Privacy Issues in Large Language Models: A Survey
- [2023/12] A Survey on Large Language Model (LLM) Security and Privacy: The Good, the Bad, and the Ugly
- [2023/09] AgentBench: Evaluating LLMs as Agents
- [2023/09] Identifying the Risks of LM Agents with an LM-Emulated Sandbox
- [2023/07] A Comprehensive Overview of Large Language Models
- [2023/06] DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
- [2023/04] Safety Assessment of Chinese Large Language Models
- [2023/03] A Survey of Large Language Models
- [2022/11] Holistic Evaluation of Language Models
- [2022/08] Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
- [2022/06] Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models