Critical · Data Exfiltration · Detected in 6% of agent outputs

Secret and Credential Exposure in AI Agent I/O

Last updated: March 2026 · 3 min read

Secret exposure happens when API keys, passwords, tokens, or private keys appear in agent inputs or outputs. This can occur accidentally — a user pastes code containing credentials — or through deliberate extraction attacks. Once exposed in an LLM conversation, secrets may be logged, cached, or sent to third-party services.


How It Works

1. Accidental input: users paste code snippets, configuration files, or logs containing hardcoded secrets.

2. Tool output leaking: the agent calls a tool that returns environment variables, config files, or database contents with embedded credentials.

3. Prompt injection extraction: an attacker tricks the agent into reading and outputting .env files, config.yaml, or other secret stores.

4. Training data echoing: the LLM regurgitates secrets it memorized from training data.

Real-World Scenario

A developer asks an AI coding assistant to "fix the authentication in my app" and pastes their entire auth module, including a hardcoded Stripe API key (sk_live_...). The key is now in the LLM provider's logs, the monitoring dashboard, and potentially in training data.

Example Payload

The database credentials are password="SuperSecret123!" and the API key is sk_live_4eC39HqLyjWDarjtT1zdp7dc

This is an example for educational purposes. Rune detects and blocks payloads like this in real-time.

How Rune Detects This

L1 Pattern Scanning

Regex patterns detect Stripe keys (sk_live_*), GitHub tokens (ghp_*), AWS keys (AKIA*), private key headers (-----BEGIN PRIVATE KEY-----), and plaintext passwords.
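A minimal sketch of this kind of pattern scan, using publicly documented key prefixes (the pattern names and thresholds here are illustrative, not Rune's actual rule set):

```python
import re

# Illustrative patterns for well-known secret formats.
SECRET_PATTERNS = {
    "stripe_live_key": re.compile(r"sk_live_[A-Za-z0-9]{20,}"),
    "github_token": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
    "plaintext_password": re.compile(r"""password\s*=\s*["'][^"']+["']""", re.IGNORECASE),
}

def scan_for_secrets(text: str) -> list[dict]:
    """Return one finding per secret-like match in the text."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append({"type": name, "match": match.group(), "span": match.span()})
    return findings
```

Running this over the example payload above would surface both the plaintext password and the Stripe live key, each with its character span so a downstream step can block or redact it.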

L1 PII Scanning

Also catches SSNs, credit card numbers, and other PII that often co-occurs with credential exposure.

Policy Engine

Policies can block tool calls that access known secret stores (.env, /etc/shadow, credentials.json) and flag outputs containing high-entropy strings.
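Both policy checks can be sketched in a few lines: a denylist for known secret-store paths and a Shannon-entropy heuristic for credential-like strings. The path list and the entropy threshold below are assumptions for illustration, not Rune's actual policy configuration:

```python
import math
import re
from pathlib import PurePosixPath

# Hypothetical denylist of secret-store files and absolute paths.
BLOCKED_NAMES = {".env", "credentials.json"}
BLOCKED_PATHS = {"/etc/shadow"}

def tool_call_allowed(path: str) -> bool:
    """Deny tool calls that would read a known secret store."""
    p = PurePosixPath(path)
    return p.name not in BLOCKED_NAMES and str(p) not in BLOCKED_PATHS

def shannon_entropy(s: str) -> float:
    """Bits of entropy per character of the string."""
    n = len(s)
    counts = {c: s.count(c) for c in set(s)}
    return -sum((k / n) * math.log2(k / n) for k in counts.values())

def flag_high_entropy_tokens(text: str, min_len: int = 20, threshold: float = 4.0) -> list[str]:
    """Flag long base64/hex-like tokens whose entropy suggests a credential."""
    candidates = re.findall(r"[A-Za-z0-9+/_=-]{%d,}" % min_len, text)
    return [t for t in candidates if shannon_entropy(t) >= threshold]
```

A random API key scores well above 4 bits of entropy per character, while English words of the same length score closer to 3, which is what makes the heuristic useful for catching secrets that no fixed regex knows about.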

Mitigations

  • Scan both inputs and outputs for known secret patterns before processing
  • Redact detected secrets in logs and monitoring dashboards
  • Never store credentials in system prompts — use secure environment variable injection
  • Restrict agent file system access to prevent reading credential files
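The redaction step in particular is easy to get wrong: it has to run before anything is written to logs or dashboards, not after. A minimal sketch of such a pass, assuming the same illustrative key-prefix patterns as above:

```python
import re

# Redaction applied before any text reaches logs or dashboards.
# Patterns mirror common public key formats (illustrative, not exhaustive).
REDACTABLE = [
    re.compile(r"sk_live_[A-Za-z0-9]{20,}"),
    re.compile(r"ghp_[A-Za-z0-9]{36}"),
    re.compile(r"AKIA[0-9A-Z]{16}"),
]

def redact(text: str, keep: int = 4) -> str:
    """Replace each detected secret, keeping only its first few characters
    so operators can still identify which credential leaked."""
    for pattern in REDACTABLE:
        text = pattern.sub(lambda m: m.group()[:keep] + "[REDACTED]", text)
    return text
```

Keeping a short prefix (here, four characters) preserves enough context to rotate the right key without leaving the secret itself recoverable from logs.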

Frequently Asked Questions

What types of secrets do AI agents most commonly leak?

The most frequently leaked secrets are API keys (Stripe, OpenAI, AWS), database connection strings, and OAuth tokens. These typically enter agent conversations through user-pasted code, tool outputs that include environment variables, or configuration files that agents read during task execution.

Why is secret exposure in AI agents worse than in traditional log leaks?

When a secret appears in an LLM conversation, it may be stored in multiple places simultaneously — the LLM provider's training pipeline, conversation logs, monitoring dashboards, and any third-party integrations. Unlike a single log file you can delete, the blast radius of a secret leaked through an AI agent is much harder to contain.

How can teams prevent accidental secret exposure from user inputs?

Deploy input scanning that detects known secret formats (API key prefixes, private key headers, connection strings) and either blocks or redacts them before they reach the LLM. Rune's L1 scanner catches these patterns in under 5ms, allowing you to warn the user and strip the secret before it enters the conversation.

Protect your agents from secret exposure

Add Rune to your agent in under 5 minutes. It scans every input and output for secret exposure and six other threat categories.
