LLM Agents for Offensive Security: Why?

Why Am I Doing This? (26.11.2025)

Funding year 2025 / Scholarship Call #20 / Project ID: 7733 / Project: LLM Agents for Offensive Security

Ensuring software and systems are secure has never been more critical. Offensive security tests, often referred to as penetration tests (pen-tests), are routinely employed as a proactive measure to discover potential vulnerabilities in software and networks. They are performed by specialized security experts known as white-hat hackers or pen-testers.

As modern society connects more devices to the Internet, the attack surface of systems is constantly expanding. This expansion makes consistent and thorough security assessments crucial to identify and remediate vulnerabilities before they are exploited by malicious actors.

The Cybersecurity Professional Gap

Despite the need for comprehensive security testing, the cybersecurity field, particularly in specialized areas like penetration testing, is facing a chronic lack of available personnel. This shortage prevents organizations from achieving sufficient security test coverage across their software and networks. Vulnerabilities that nobody knows about cannot be fixed.

This deficit is not static; it is rapidly escalating. According to the ISC2 Cybersecurity Workforce Study 2024, the growth of the global cybersecurity workforce (+0.1% year-over-year, or YoY) was dramatically outpaced by the increase in the workforce gap (+19.1% YoY). The industry is grappling with a massive deficit, estimated at 4.7 million workers globally. While increased enrollment in IT security educational programs is a necessary long-term objective, improving the efficiency of existing pen-testers through specialized tooling is an equally critical short-term solution.
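To put the percentages into absolute terms, the quoted figures can be run through a quick back-of-the-envelope calculation. This is only a sketch: it assumes the 4.7 million estimate is the end-of-period gap and that the 19.1% growth is measured against the previous year's gap. The constant and function names are made up for illustration.

```python
# Figures as quoted in this post from the ISC2 Cybersecurity Workforce Study 2024.
GAP_2024_MILLIONS = 4.7        # estimated global workforce gap
GAP_GROWTH_YOY = 0.191         # growth of the gap, year over year
WORKFORCE_GROWTH_YOY = 0.001   # growth of the workforce, year over year


def implied_previous_value(current_value: float, growth_rate: float) -> float:
    """Back out last year's value from this year's value and its YoY growth."""
    return current_value / (1.0 + growth_rate)


# Assumption: 4.7 million is the gap *after* the 19.1% increase.
previous_gap = implied_previous_value(GAP_2024_MILLIONS, GAP_GROWTH_YOY)
absolute_increase = GAP_2024_MILLIONS - previous_gap
growth_ratio = GAP_GROWTH_YOY / WORKFORCE_GROWTH_YOY

print(f"Implied gap one year earlier: {previous_gap:.2f} million")
print(f"Implied increase in the gap:  {absolute_increase:.2f} million")
print(f"Gap growth relative to workforce growth: about {growth_ratio:.0f}x")
```

Under these assumptions, the gap would have been roughly 3.95 million a year earlier, so it widened by about three quarters of a million positions in a single year while the workforce itself barely grew.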

When organizations fail to perform sufficient security assessments, they remain dangerously exposed to potential exploitation. Without enough pen-testers, vulnerabilities stay undiscovered until malicious actors find and exploit them. The consequences can be severe, with ransomware being one of the most publicly visible types of security incidents. Data indicates that 63% of businesses worldwide were affected by ransomware in 2025.

Can AI/LLMs Help with This Situation?

LLMs have garnered significant attention for their ability to automate human tasks and substantially increase the productivity of human operators.

Consequently, their potential is increasingly being explored within the context of penetration testing. The fundamental vision of ML-aided security testing involves two primary approaches: replacing human activities by delegating tasks to an autonomous AI, or augmenting human activities by providing real-time feedback and support. My research seeks to align tooling and methods with the specific needs and workflows of security professionals.

Making Pen-Testers More Productive

My research aims to augment and empower existing human security testers. Automation directly improves efficiency, enabling pen-testers to cover more ground, whether through deeper investigation of a single target or by covering a larger number of targets within the same timeframe. In addition, automation helps establish a stable baseline quality for testing results, ensuring that outcomes are not negatively influenced by the natural fluctuations in a human tester's daily energy or attention level.

A key idea arising from interviews with security professionals is the use of the LLM as an "AI sparring partner". Human pen-testers often value having colleagues who can offer alternative ideas or strategies when they encounter roadblocks. AI-based agents fulfilling this "sparring partner" role can help counteract the chronic lack of sufficiently educated security professionals available in the industry.

LLMs are exceptionally well-suited for automating tedious, time-consuming tasks like enumeration and privilege escalation. These tasks can range from low-level activities, such as providing context-sensitive command parameter completion and explaining tool outputs, to high-level functions like summarizing overall test progress or suggesting next attack avenues.

Democratizing Access to Security Testing

By providing automation capabilities that approach human performance, LLM-guided penetration testing holds the potential to substantially reduce the costs of security tests. This opens a viable path toward democratizing access to security testing, especially for organizations that currently cannot afford it (e.g., NPOs or SMEs) due to the high costs of penetration testing.

If the operational costs of LLM-driven prototypes for testing simple networks or finding low-hanging fruit are competitive with those incurred by human experts, using LLMs to automate and delegate some testing (such as the baseline checks otherwise performed by human pen-testers) becomes viable.

Beyond serving organizations with limited security budgets, augmenting human operators with generative AI is also anticipated to benefit the training of novice penetration testers and aspiring students, with the AI essentially acting as an automated trainer and resource amplifier in educational settings.