Vetting AI agent skills before installing them is now table stakes. In a February 2026 audit of 3,984 skills, Snyk’s ToxicSkills study found that 36.82% had at least one security flaw and 13.4% had a critical-level issue, including 76 confirmed malicious payloads. Vetting a skill means reviewing the publisher, permissions, instructions, scripts, dependencies, network behavior, and update path before the skill can run with your agent’s authority.

Treat every skill like a small software package plus an instruction prompt: it can contain code, prose, setup steps, and hidden behavior that traditional scanners miss.

TL;DR: install fewer skills, read SKILL.md, inspect scripts, check for secret access, look for prompt injection, pin versions, test in a sandbox, and monitor the first run. A skill that touches files, shell, credentials, browser data, or outbound network calls deserves the same review you would give a production dependency.

Why AI agent skills need a different review process

AI agent skills are not just plugins. They are reusable behavior packages that tell an agent how to complete workflows with tools, files, APIs, browsers, shells, and memory. OWASP’s Agentic Skills Top 10 describes this as the execution layer between the model and the tools: MCP or tool APIs define what is available, while skills define how those capabilities get used.

That makes skill review different from normal package review in three ways:

  1. The dangerous logic may be written in prose. A malicious instruction can tell the agent to ignore safety rules, hide actions, or exfiltrate data without looking like executable code.
  2. The skill often inherits the agent’s authority. If your agent can read files, send Slack messages, call email APIs, or run shell commands, a skill may steer those capabilities.
  3. The risk is contextual. A harmless-looking research skill becomes risky if it can read private notes and send network requests to untrusted domains.

If you use OpenClaw, start from the skills directory and the OpenClaw security overview. If you are building your own skill, pair this review with the guide on how to create a custom OpenClaw skill.

Quick checklist: how to vet AI agent skills

Use this table before installing any skill from a registry, GitHub repo, zip file, or pasted snippet.

| Check | What to inspect | Pass condition |
|---|---|---|
| Source trust | Publisher, repo age, commit history, issue activity | Maintainer is identifiable and history matches the claimed purpose |
| Skill intent | name, description, trigger instructions, examples | Scope is narrow and matches what you need |
| Permissions | File, shell, browser, network, memory, credentials | Least privilege; no unrelated capability requests |
| Instructions | SKILL.md, YAML frontmatter, hidden prompts | No override, concealment, or exfiltration language |
| Scripts | scripts/, setup commands, install hooks | No obfuscation, downloads, credential reads, or unsafe shell patterns |
| Dependencies | Package files, lockfiles, remote URLs | Pinned, minimal, and from trusted registries |
| Network behavior | Domains, webhooks, telemetry, callbacks | Documented, necessary, and easy to block or audit |
| Update path | Version pinning, auto-update behavior | Manual or pinned updates for sensitive skills |
| Test environment | Sandbox, disposable workspace, dummy credentials | First run cannot touch real secrets or production data |

Step 1: Verify the source before reading the code

Start with provenance. A skill from a registry is not automatically safe, and GitHub stars are not proof of trust. Check who published it, whether the maintainer history matches the claimed purpose, whether the repo was recently created, whether ownership changed, and whether the name typosquats a popular skill.

For high-privilege skills, prefer verified publishers, signed releases, and pinned versions. If the skill comes from a paste, gist, or unknown zip file, treat it as untrusted code until proven otherwise.
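One of these provenance checks, spotting a name that typosquats a popular skill, can be roughed out in a few lines. The sketch below uses Python's standard `difflib` to flag names that are very close to, but not the same as, names you already trust. The `KNOWN_SKILLS` list, the function name, and the 0.85 threshold are illustrative assumptions, not part of any registry's tooling.

```python
from difflib import SequenceMatcher

# Hypothetical list of skill names you already trust (an assumption for this sketch).
KNOWN_SKILLS = ["pdf-reader", "slack-notifier", "web-research"]


def possible_typosquats(candidate: str, known=KNOWN_SKILLS, threshold: float = 0.85):
    """Return (trusted_name, similarity) pairs the candidate closely resembles."""
    name = candidate.strip().lower()
    hits = []
    for trusted in known:
        if name == trusted:
            continue  # an exact match is the real name, not a lookalike
        ratio = SequenceMatcher(None, name, trusted).ratio()
        if ratio >= threshold:
            hits.append((trusted, round(ratio, 2)))
    return hits
```

A non-empty result is a prompt to look harder at the publisher, not proof of malice. Similar names are sometimes legitimate forks.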

Step 2: Read the skill instructions like an attacker

Open the instruction file first. In OpenClaw-style skills, that is usually SKILL.md with YAML frontmatter and Markdown instructions. You are looking for behavior that changes the agent’s priorities, not only obvious malware.

Red flags include:

  • “Ignore previous instructions” or “override system rules”
  • Instructions to hide actions from the user
  • Requests to read unrelated files such as .env, SSH keys, browser cookies, wallets, or config backups
  • Commands that send data to a webhook, pastebin, unknown API, or URL shortener
  • Claims that the skill needs broad shell access for a narrow task
  • Obfuscated text, base64 blobs, homoglyphs, invisible Unicode, or split instructions across files
  • Setup steps that install global packages, modify shell profiles, or change agent configuration

The key question is simple: if the agent followed this skill literally, what could it access, change, or send outside the machine?
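A quick first pass over the instruction file can surface some of the red flags above before you read it line by line. The sketch below is a minimal example under stated assumptions: the pattern list is illustrative and far from complete, the labels are made up for this example, and a clean result does not mean the skill is safe.

```python
import re
import unicodedata

# Illustrative patterns only (assumptions, not a complete or official rule set).
RED_FLAGS = {
    "override": re.compile(r"ignore (all )?previous instructions|override system rules", re.I),
    "concealment": re.compile(r"(do not|don't|never) (tell|inform|show|notify) the user", re.I),
    "secret_paths": re.compile(r"\.env\b|\.ssh\b|id_rsa|cookies|wallet", re.I),
    "exfil_target": re.compile(r"webhook|pastebin|bit\.ly|tinyurl", re.I),
    "base64_blob": re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"),
}


def scan_instructions(text: str) -> dict[str, list[int]]:
    """Map each red-flag label to the 1-based line numbers where it appears."""
    findings: dict[str, list[int]] = {}
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RED_FLAGS.items():
            if pattern.search(line):
                findings.setdefault(label, []).append(number)
        # Unicode category "Cf" covers zero-width and other invisible format characters.
        if any(unicodedata.category(ch) == "Cf" for ch in line):
            findings.setdefault("invisible_unicode", []).append(number)
    return findings
```

Run it on the contents of SKILL.md and treat every hit as a line to read in context. Keyword matching misses paraphrased or split instructions, so it supplements a human read rather than replacing it.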

Step 3: Map requested capabilities to real need

A good skill has a tight relationship between task and permission. A PDF-reading skill may need file read access. A weather skill may need network access. A Slack automation skill may need Slack API access. But a note-taking skill should not need wallet paths, SSH keys, browser cookies, or arbitrary outbound webhooks.

Use this permission sanity check:

| Skill type | Reasonable access | Suspicious access |
|---|---|---|
| Writing assistant | Workspace files, text output | Shell, secrets, external POST requests |
| Research skill | Browser/search, citation storage | Credential files, messaging APIs |
| DevOps skill | Shell in a project dir, logs, cloud CLI | Home directory scans, unrestricted network exfiltration |
| Messaging skill | Specific chat API credentials | Full filesystem or unrelated OAuth tokens |
| Finance/crypto skill | Explicit wallet/API scope only | Clipboard scraping, seed phrase reads, hidden callbacks |
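The same mapping can be written down as an allowlist so that any request outside it stands out. This is a sketch only: the capability labels and skill types below are hypothetical names chosen for the example, and you would map them to whatever your agent runtime actually exposes.

```python
# Hypothetical capability labels, loosely following the table above (assumptions).
REASONABLE_ACCESS = {
    "writing": {"workspace_files", "text_output"},
    "research": {"browser", "citation_storage"},
    "devops": {"project_shell", "logs", "cloud_cli"},
    "messaging": {"chat_api"},
}


def excess_capabilities(skill_type: str, requested: list[str]) -> list[str]:
    """Return requested capabilities that fall outside the allowlist for this type.

    An unknown skill type has an empty allowlist, so everything it asks for is flagged.
    """
    allowed = REASONABLE_ACCESS.get(skill_type, set())
    return sorted(set(requested) - allowed)
```

For example, `excess_capabilities("writing", ["workspace_files", "shell", "secrets"])` returns `["secrets", "shell"]`, which are the two requests a writing assistant would need to justify.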

OpenClaw users should also review the broader self-hosting security guide and the complete OpenClaw security guide before enabling broad tool access.

Step 4: Inspect scripts and setup commands

Many skill attacks hide in supporting files rather than the main instruction document. Inspect every script, template, and install command. Look for:

  • curl | sh, remote installers, or downloaded binaries
  • Base64 decode followed by shell execution
  • Calls to env, printenv, .env, .ssh, keychains, wallets, browser profiles, or credential stores
  • Unexplained outbound requests to webhooks or IP addresses
  • postinstall, shell profile edits, launch agents, cron jobs, or persistence mechanisms
  • Cleanup commands that remove logs or history
  • Conditional triggers based on username, hostname, date, environment, or project path

Do not run unknown setup commands directly on your main machine. If a skill cannot be understood without executing it, that is a strong reason not to install it.
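Reading scripts does not require running them. The sketch below statically searches supporting files for a few of the patterns listed above. The regular expressions are illustrative assumptions that will produce both false positives and false negatives, and the function names are invented for this example.

```python
import re
from pathlib import Path

# Illustrative patterns (assumptions); extend them for your own environment.
SCRIPT_PATTERNS = {
    "pipe_to_shell": re.compile(r"(curl|wget)[^|\n]*\|\s*(sudo\s+)?(ba|z)?sh\b"),
    "base64_exec": re.compile(r"base64\s+(-d|--decode)[^|\n]*\|\s*(ba|z)?sh\b"),
    "secret_read": re.compile(r"\bprintenv\b|\.env\b|\.ssh/|keychain|wallet", re.I),
    "persistence": re.compile(r"crontab|postinstall|LaunchAgents|\.(bash|zsh)rc\b|\.profile\b"),
    "log_cleanup": re.compile(r"history\s+-c\b|rm\s+.*(\.bash_history|/var/log)"),
}


def scan_script_text(text: str) -> dict[str, list[int]]:
    """Map each pattern label to the 1-based line numbers where it matches."""
    findings: dict[str, list[int]] = {}
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SCRIPT_PATTERNS.items():
            if pattern.search(line):
                findings.setdefault(label, []).append(number)
    return findings


def scan_skill_dir(root: str) -> dict[str, dict[str, list[int]]]:
    """Scan every file under root without executing anything."""
    report = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            hits = scan_script_text(path.read_text(encoding="utf-8", errors="ignore"))
            if hits:
                report[path.relative_to(root).as_posix()] = hits
    return report
```

Obfuscated payloads are designed to evade exactly this kind of matching, so an empty report narrows the search but does not end it.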

Step 5: Test in a disposable workspace

Before using a new skill with real data, run it in a sandboxed environment with dummy files and dummy credentials. The goal is not to prove that the skill is safe forever. The goal is to observe its first-run behavior and catch obvious surprises.

A practical test looks like this:

  1. Create a disposable workspace with fake documents and fake secrets.
  2. Disable unrelated tools: no email, no production cloud, no private repo, no real browser profile.
  3. Block or log outbound network traffic where possible.
  4. Ask the skill to perform its normal task.
  5. Review file reads, file writes, shell commands, network destinations, and final output.
  6. Keep the skill disabled until the observed behavior matches the documented purpose.
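The first and fifth steps can be partly scripted. The sketch below builds a throwaway directory with a fake document and an obviously fake secret, then compares file hashes before and after the run. It only detects files that were added, removed, or modified. Observing reads, shell commands, and network destinations needs OS-level or agent-level logging, which is outside this example. All file names and the canary value are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def make_disposable_workspace() -> Path:
    """Create a temp directory holding a fake document and a fake secret."""
    root = Path(tempfile.mkdtemp(prefix="skill-sandbox-"))
    (root / "notes.md").write_text("Fake meeting notes for testing.\n")
    # Canary value: obviously fake and easy to search for in logs or captured traffic.
    (root / ".env").write_text("API_KEY=FAKE-CANARY-0000\n")
    return root


def snapshot(root: Path) -> dict[str, str]:
    """Hash every file so changes can be detected after the skill runs."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Summarize which files were added, removed, or modified between snapshots."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(k for k in before.keys() & after.keys() if before[k] != after[k]),
    }
```

Take one snapshot before asking the skill to do its normal task and one after. If the canary string ever shows up in output or captured traffic, the skill read a file it had no reason to touch.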

For team use, make this a lightweight approval workflow: one person proposes the skill, another reviews the diff and first-run log, then the skill gets pinned to a known version.

Step 6: Pin, monitor, and re-review updates

The first safe version does not make future versions safe. Agent skills are a supply chain surface. Review updates when:

  • The maintainer changes
  • New scripts or dependencies appear
  • Network destinations change
  • The skill requests broader permissions
  • The trigger conditions become more general
  • The changelog is vague for a security-relevant change

Keep an inventory of installed skills, their versions, publishers, permissions, and last review date. For sensitive environments, disable automatic updates and require human review before upgrading.
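An inventory is easier to keep honest when each entry records a content hash of the version you reviewed. The sketch below is one possible shape for that record, assuming skills live in local directories. The field names, the digest scheme, and the 90-day default are illustrative choices, not a standard.

```python
import hashlib
from dataclasses import dataclass
from datetime import date
from pathlib import Path


def skill_digest(skill_dir: Path) -> str:
    """Hash file paths and contents so any change to the skill alters the digest."""
    h = hashlib.sha256()
    for p in sorted(skill_dir.rglob("*")):
        if p.is_file():
            h.update(p.relative_to(skill_dir).as_posix().encode())
            h.update(b"\0")
            h.update(p.read_bytes())
            h.update(b"\0")
    return h.hexdigest()


@dataclass
class InventoryEntry:
    name: str
    version: str
    publisher: str
    permissions: tuple[str, ...]
    digest: str          # digest of the contents at the time of the last review
    last_review: date


def needs_review(entry: InventoryEntry, skill_dir: Path, today: date, max_age_days: int = 90) -> list[str]:
    """Return the reasons this skill should be re-reviewed (empty list if none)."""
    reasons = []
    if skill_digest(skill_dir) != entry.digest:
        reasons.append("contents changed since last review")
    if (today - entry.last_review).days > max_age_days:
        reasons.append(f"last review older than {max_age_days} days")
    return reasons
```

A digest mismatch tells you that something changed, not what changed. Pair it with a diff against the reviewed version before approving the upgrade.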

The “lethal trifecta”: when permissions become dangerous

Simon Willison’s “lethal trifecta” is a useful mental model for agent risk. A skill becomes dangerous when it combines three things:

  1. Access to private data.
  2. Exposure to untrusted content.
  3. Ability to send data out.

Many real agent workflows have all three. An email skill reads private messages, processes untrusted inbound content, and can send outbound replies. A browser skill sees arbitrary websites and authenticated sessions. A research skill fetches pages and may write reports to external tools.

You do not have to ban these workflows. You do need to constrain them. If a skill touches private data and untrusted content, remove unnecessary network egress. If it needs network access, restrict what files it can read. If it needs both, put approvals around high-risk actions.
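If you already record what each skill can touch, the trifecta check reduces to three booleans. The sketch below assumes you fill in those booleans by hand during review. The class and field names are hypothetical, and nothing here inspects a real skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillProfile:
    reads_private_data: bool
    sees_untrusted_content: bool
    can_send_data_out: bool


def trifecta_legs(profile: SkillProfile) -> list[str]:
    """List which of the three risk conditions this skill satisfies."""
    legs = []
    if profile.reads_private_data:
        legs.append("private data")
    if profile.sees_untrusted_content:
        legs.append("untrusted content")
    if profile.can_send_data_out:
        legs.append("outbound channel")
    return legs


def has_lethal_trifecta(profile: SkillProfile) -> bool:
    """True only when all three conditions hold at once."""
    return len(trifecta_legs(profile)) == 3
```

An email skill profiled as `SkillProfile(True, True, True)` trips the check, while removing any one leg, for instance by blocking network egress, clears it.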

A practical scoring model

You do not need a formal security team to make better decisions. Use this quick score before installing:

| Question | Low risk | Higher risk |
|---|---|---|
| Source | Official or trusted maintainer | Unknown account or copied repo |
| Permissions | Read-only, narrow scope | Shell, filesystem, network, secrets |
| Code | Short, readable, pinned dependencies | Obfuscated, remote fetches, post-install hooks |
| Data | Public or test data | Email, browser, files, credentials |
| Output | Local answer only | Sends messages, opens PRs, calls webhooks |

If a skill scores higher-risk on three or more rows, demand a sandbox test and a second pair of eyes before installing.
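The three-or-more rule is simple enough to encode, which helps when several people review skills and you want consistent decisions. In this sketch each row is answered with `True` for "higher risk" and `False` for "low risk". The row keys and function names are assumptions made for the example.

```python
# Row keys mirror the five questions in the scoring table (names are assumptions).
ROWS = ("source", "permissions", "code", "data", "output")


def higher_risk_rows(answers: dict[str, bool]) -> list[str]:
    """Return the rows answered as higher risk; every row must be answered."""
    missing = [row for row in ROWS if row not in answers]
    if missing:
        raise ValueError(f"unanswered rows: {missing}")
    return [row for row in ROWS if answers[row]]


def needs_extra_review(answers: dict[str, bool], threshold: int = 3) -> bool:
    """True when the skill needs a sandbox test and a second reviewer."""
    return len(higher_risk_rows(answers)) >= threshold
```

Requiring an answer for every row is deliberate. A skipped question should block the decision rather than silently count as low risk.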

Common mistakes when installing AI agent skills

Avoid these patterns:

  • Installing a skill because it ranks high in a marketplace
  • Reviewing only executable code and ignoring Markdown instructions
  • Giving every skill access to the same high-privilege agent profile
  • Running setup commands before reading them
  • Allowing broad home-directory access for narrow tasks
  • Ignoring outbound network calls because the skill “needs the internet”
  • Treating prompt injection as a model problem rather than a skill supply chain problem

If you want safer defaults, start with fewer skills and add only the ones that match a recurring workflow. The beginner-friendly list of top OpenClaw skills is useful, but even recommended skills should be reviewed against your own data and tool access.

FAQ

What is the biggest risk in AI agent skills?

The biggest risk is delegated authority. A malicious or poorly written skill can steer an agent that already has access to files, credentials, shell commands, APIs, or messaging channels. The skill may not need a traditional exploit if it can simply instruct the agent to misuse approved tools.

Can scanners detect malicious AI agent skills?

Scanners help, but they are not enough. Skill risk can appear in executable code, natural-language instructions, dependencies, setup steps, and context-specific permission combinations. Use scanners as one layer, then add human review and sandbox testing for high-impact skills.

Should I install AI agent skills from marketplaces?

Marketplace distribution is convenient, not a security guarantee. Prefer verified publishers, signed or pinned releases, visible source code, minimal permissions, and active maintenance. For sensitive workflows, test the skill in a disposable workspace before connecting real accounts or private data.

How often should teams re-vet installed skills?

Re-vet a skill whenever it updates, changes maintainer, adds dependencies, expands permissions, or starts touching new data sources. For high-privilege agents, maintain an inventory and schedule a review at least every 30 to 90 days.

Conclusion

To vet AI agent skills before installing them, review both the software and the instructions. Check the source, read SKILL.md, inspect scripts, map permissions to real need, test in a sandbox, pin versions, and monitor updates. Agent skills are powerful because they compress workflows into reusable behavior. That same power makes them a supply chain risk if installation becomes a one-click habit.

Sources: OWASP Agentic Skills Top 10, OWASP Top 10 for Agentic Applications 2026, Snyk ToxicSkills study, SkillSieve malicious AI agent skills paper