Content area
Full text
Abstract: This paper presents an exploratory systematic analysis of prompt injection vulnerabilities across 36 diverse large language models (LLMs), revealing significant security concerns in these widely adopted AI tools. Prompt injection attacks, which involve crafting inputs to manipulate LLM outputs, pose risks such as unauthorized access, data leaks, and misinformation. Through 144 tests with four tailored prompt injections, we found that 56% of attempts successfully bypassed LLM safeguards, with vulnerability rates ranging from 53% to 61% across different prompt designs. Notably, 28% of tested LLMs were susceptible to all four prompts, indicating a critical lack of robustness. Our findings show that model size and architecture significantly influence susceptibility, with smaller models generally more prone to attacks. Statistical methods, including random forest feature analysis and logistic regression, revealed that model parameters play a primary role in vulnerability, though LLM type also contributes. Clustering analysis further identified distinct vulnerability profiles based on model configuration, underscoring the need for multi-faceted defence strategies. The study's implications are broad, particularly for sectors integrating LLMs into sensitive applications. Our results align with OWASP and MITREs security frameworks, highlighting the urgency for proactive measures, such as human oversight and trust boundaries, to protect against prompt injection risks. Future research should explore multilingual prompt injections and multi-step attack defences to enhance the resilience of LLMs in complex, real-world environments. This work contributes valuable insights into LLM vulnerabilities, aiming to advance the field toward safer AI deployments.
Keywords: Artificial Intelligence (AI), Prompt injections, AI security
1. Introduction
According to the Open Web Application Security Project (OWASP), rapid adoption of LLMs has outpaced the establishment of robust security protocols, leaving many applications exposed to high-risk vulnerabilities (OWASP, 2023). In response, OWASP gathered nearly 500 cybersecurity experts to analyse LLM threats, incorporating public input to create the OWASP Top 10 for LLM Applications, a respected list widely used in risk assessments (Flores and Monrea, 2024).
Prompt injections, ranked as the top vulnerability on OWASPs list, also appear in MITREs Adversarial Threat Landscape for AI Systems (ATLAS), a database documenting real-world AI attack tactics (MITRE ATLAS, 2023). Prompt injection attacks occur when adversaries craft malicious inputs to manipulate an LLMs behaviour, bypassing safety mechanisms to potentially leak sensitive information, generate harmful code, or spread misinformation. These...




