Best Paper Award @ Workshop on Agent Skills

Context Matters: Repository-Aware Security Analysis of the Agent Skill Ecosystem (31.05.2026)

Funding year 2025 / Scholarship Call #20 / Project ID: 7761 / Project: Hidden Dangers: Uncovering Security and Privacy Risks through Large-scale Mobile App Analysis

Our paper, Context Matters: Repository-Aware Security Analysis of the Agent Skill Ecosystem, by Florian Holzbauer, David Schmidt, Gabriel K. Gegenhuber, Sebastian Schrittwieser, and Johanna Ullrich, received the Best Paper Award at the first Workshop on Agent Skills, co-located with the ACM Conference on AI and Agentic Systems.

The functionality of AI agents such as Claude Code and OpenClaw is increasingly extended through so-called agent skills. These skills are distributed through dedicated marketplaces and GitHub repositories, forming a new software ecosystem that is similar to early app stores and package managers.

However, this new ecosystem also introduces security risks. Skills combine natural-language descriptions with executable logic, meaning that an agent may decide autonomously when to invoke code provided by third parties. After initial reports of malicious skills, marketplaces started to integrate scanners that report whether a skill appears benign or malicious. Yet current scanners often classify a very large fraction of skills as suspicious, with reported maliciousness rates of up to 46.8%.

In this work, we systematically analyzed the security of the emerging agent skill ecosystem. We collected 238,180 unique skills from three major distribution platforms and GitHub. We noticed that security scanners often evaluate skills in isolation, even though in two of the three marketplaces skills are embedded in larger GitHub projects. A skill that looks suspicious on its own may be completely expected in the context of a security tool, developer utility, or automation framework. To address this, we developed a repository-aware analysis approach that compares scanner-flagged skills with their surrounding project documentation, codebase, and repository metadata.

Our results show that context matters. Among the 2,887 scanner-flagged skill-repository combinations that we evaluated with repository context, only 15 (0.52%) remained suspicious. In other words, many scanner alerts appear to be false positives caused by analyzing the skill without understanding the project it belongs to. We further observed that existing scanners often disagree. On a common set of Skills.sh skills, most skills flagged by at least one scanner were flagged by only a single tool, while only 33 skills were flagged by all five evaluated scanners. This highlights that scanner outputs should be interpreted as risk signals rather than definitive labels.
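To make the notion of scanner disagreement concrete, the following minimal sketch counts how many scanners flagged each skill and builds a histogram over those counts. It is an illustration only, not the paper's tooling: the function name `agreement_histogram`, the scanner names, and the skill identifiers are hypothetical toy data. The last line merely reproduces the 0.52% figure from the numbers stated above.

```python
from collections import Counter


def agreement_histogram(flags_by_scanner: dict[str, set[str]]) -> dict[int, int]:
    """Map k -> number of skills flagged by exactly k scanners (k >= 1)."""
    votes_per_skill: Counter[str] = Counter()
    for flagged in flags_by_scanner.values():
        # Each scanner contributes at most one vote per skill (sets hold no duplicates).
        votes_per_skill.update(flagged)
    return dict(Counter(votes_per_skill.values()))


# Hypothetical toy data: three scanners, four skills.
flags = {
    "scanner_a": {"skill-1", "skill-2", "skill-3"},
    "scanner_b": {"skill-2"},
    "scanner_c": {"skill-2", "skill-4"},
}
histogram = agreement_histogram(flags)

# Share of flagged combinations that stayed suspicious, from the figures in the text.
remaining_rate = 15 / 2887 * 100
```

In this toy example, three skills are flagged by a single scanner and only one by all three, which mirrors the pattern described above at a much smaller scale.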

At the same time, we also show that the marketplaces are not free of risks. We found functional credentials embedded in published skills, including API tokens and database credentials. We also identified repository hijacking risks affecting 121 skills. These risks arise when an abandoned repository namespace on GitHub can be re-registered: an attacker may then be able to take over a skill reference that users trust.
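The repository hijacking risk can be sketched as a simple check: extract the owner namespace from each GitHub reference of a skill and report the references whose owner no longer exists. This is a hedged illustration, not the method used in the paper. The names `parse_owner`, `hijackable_refs`, and the `owner_exists` callback are hypothetical; in practice the callback would have to be backed by a real lookup, and a missing account is only an indicator, since it does not guarantee that the name can actually be re-registered.

```python
import re
from typing import Callable, Iterable, Optional

# Matches references such as "https://github.com/owner/repo" or "github.com/owner/repo.git".
GITHUB_REF = re.compile(
    r"^(?:https?://)?github\.com/([A-Za-z0-9-]+)/([A-Za-z0-9._-]+?)(?:\.git)?/?$"
)


def parse_owner(ref: str) -> Optional[str]:
    """Return the lower-cased owner namespace of a GitHub reference, or None."""
    match = GITHUB_REF.match(ref.strip())
    return match.group(1).lower() if match else None


def hijackable_refs(
    refs: Iterable[str], owner_exists: Callable[[str], bool]
) -> list[str]:
    """Return the references whose owner namespace does not exist (assumed lookup)."""
    at_risk = []
    for ref in refs:
        owner = parse_owner(ref)
        # Non-GitHub references are skipped; only missing owners are reported.
        if owner is not None and not owner_exists(owner):
            at_risk.append(ref)
    return at_risk


# Hypothetical example: only "alice" is still a registered namespace.
known_owners = {"alice"}
skill_refs = [
    "https://github.com/alice/tool",
    "github.com/gone-org/skill.git",
    "https://example.com/other/project",
]
at_risk = hijackable_refs(skill_refs, lambda owner: owner in known_owners)
```

Passing the existence check in as a callback keeps the sketch runnable offline and leaves open how the lookup is performed.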

Link to the paper: https://openreview.net/attachment?id=4EmqWDqM4H&name=pdf

Link to the artifact: https://github.com/holzsec/repository-context-agentskills

David Schmidt
