Secrets Detection¶

SecretsDetector scans text for 37+ credential patterns and prevents accidental secret leakage into logs, audit trails, provider API payloads, and chat output. SecretCensor (the redaction layer) replaces detected secrets with [REDACTED].

Secrets in AI contexts

LLM agents process user input, tool output, and file contents. Any of these can contain credentials. Without secrets detection, an API key in a config file could be sent to a provider API, logged in plaintext, or echoed back to a chat channel.

Detection Pipeline¶

graph LR
    A[Input Text] --> B[SecretsDetector.scan]
    B --> C{Findings?}
    C -->|Yes| D[Sort by position]
    D --> E[Merge overlapping spans]
    E --> F["Replace with [REDACTED]"]
    C -->|No| G[Pass through unchanged]

Detected Credential Types¶

The detector covers 37+ distinct credential patterns across major cloud providers, SaaS platforms, and common secret formats.

Cloud Providers¶

Type	Pattern	Example Match
AWS Access Key	`AKIA[0-9A-Z]{16}`	`AKIAIOSFODNN7EXAMPLE`
AWS Secret Key	`aws_secret_access_key` + 40 chars	`aws_secret_access_key = wJalrXUtn...`
GCP API Key	`AIza[A-Za-z0-9_-]{35}`	`AIzaSyA1234567890abcdefghijklmnop`
Azure Key	`AccountKey` + base64	`AccountKey=abc123...`

AI Providers¶

Type	Pattern	Example Match
Anthropic Key	`sk-ant-[A-Za-z0-9_-]{20,}`	`sk-ant-api03-abc123...`
OpenAI Key	`sk-(proj-)?[A-Za-z0-9_-]{20,}`	`sk-proj-abc123...`
HuggingFace Token	`hf_[A-Za-z0-9]{34,}`	`hf_abcdefghij1234567890...`

Code Hosting and CI/CD¶

Type	Pattern	Example Match
GitHub PAT	`ghp_[A-Za-z0-9]{36}`	`ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`
GitHub OAuth	`gho_[A-Za-z0-9]{36}`	`gho_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`
GitLab Token	`glpat-[A-Za-z0-9_-]{20,}`	`glpat-xxxxxxxxxxxxxxxxxxxx`
NPM Token	`npm_[A-Za-z0-9]{36}`	`npm_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx`
PyPI Token	`pypi-[A-Za-z0-9_-]{50,}`	`pypi-AgEIcHlwaS5v...`

Communication and SaaS¶

Type	Pattern	Example Match
Slack Token	`xox[baprs]-...`	`xoxb-123-456-abc`
Discord Token	Bot token format	`MTA1...abc.def.ghi...`
SendGrid Key	`SG.[base64].[base64]`	`SG.xxxx.yyyy`
Stripe Key	`[sr]k_(live\\|test)_...`	`sk_live_abc123...`
Twilio Key	`SK[a-f0-9]{32}`	`SK1234567890abcdef...`
Mailgun Key	`key-[a-f0-9]{32}`	`key-1234567890abcdef...`

Infrastructure and Databases¶

Type	Pattern	Example Match
DB Connection String	`postgres://user:pass@host`	`postgres://admin:secret@db:5432`
HashiCorp Vault Token	`hvs\\|hvb\\|hvr.[base64]`	`hvs.abc123...`
Databricks Token	`dapi[a-f0-9]{32}`	`dapi1234567890abcdef...`
DigitalOcean Token	`dop_v1_[a-f0-9]{64}`	`dop_v1_abcdef...`
PlanetScale Token	`pscale_tkn_[base64]`	`pscale_tkn_abc123...`
Render Key	`rnd_[A-Za-z0-9]{32,}`	`rnd_abc123...`
Fly.io Token	`FlyV1 [base64]`	`FlyV1 abc123...`

Cryptographic Material¶

Type	Pattern	Example Match
Private Key	`-----BEGIN (RSA\\|EC\\|DSA\\|OPENSSH) PRIVATE KEY-----`	PEM headers
JWT	`eyJ...eyJ...` (3 base64 segments)	`eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOi...`
SSH Key Content	`AAAA[BCD][base64]{100+}`	SSH public key body
Age Secret Key	`AGE-SECRET-KEY-[A-Z0-9]{59}`	`AGE-SECRET-KEY-1ABC...`

Observability and Monitoring¶

Type	Pattern	Example Match
Grafana Token	`glc_[base64]`	`glc_abc123...`
Datadog Key	`dd_api_key` + hex	`dd_api_key=abcdef...`
New Relic Key	`NRAK-[A-Z0-9]{27}`	`NRAK-ABC123...`
PagerDuty Key	`pagerduty_token` + value	`pagerduty_token=abc...`
Sentry DSN	`https://[hex]@[host].ingest.sentry.io/[id]`	Full Sentry DSN URL

Generic Patterns¶

Type	Pattern	Description
`api_key`	`api_key/apikey` + 20 chars	Generic API key assignments
`password`	`password/passwd/pwd` + 8 chars	Password assignments
`token`	`token/secret` + 20 chars	Generic token assignments
`google_oauth_secret`	`client_secret` + 24 chars	OAuth client secrets

Overlap Merging¶

When multiple patterns match overlapping regions of text, the detector merges spans before redaction to prevent partial secret leakage:

Text:    "api_key=sk-ant-api03-very-long-key-here"
Match 1: [--------api_key match--------]
Match 2:         [---anthropic_key match---]
Merged:  [----------single span-----------]
Result:  "[REDACTED]"

Without merging, overlapping replacements could leave fragments of the secret visible.

Usage¶

Scanning¶

from missy.security.secrets import secrets_detector

findings = secrets_detector.scan(text)
# Returns: [
#   {"type": "anthropic_key", "match_start": 42, "match_end": 89},
#   {"type": "api_key", "match_start": 42, "match_end": 85},
# ]

Redaction¶

safe_text = secrets_detector.redact(text)
# All detected secrets replaced with [REDACTED]

Quick Check¶

if secrets_detector.has_secrets(text):
    # Short-circuit: stops at first match
    handle_secret_detected()

Module Singleton¶

The secrets_detector instance is a process-level singleton. Import and use directly:

from missy.security.secrets import secrets_detector

Integration Points¶

The secrets detector is used at multiple points in Missy's processing pipeline:

Location	Purpose
Agent runtime	Scan tool outputs before passing to the model
CLI channel	Redact secrets from displayed responses
Audit logger	Prevent secrets in audit trail
Provider payloads	Scan outbound messages
Memory store	Prevent secret persistence in conversation history

Pattern-based limitations

Secrets detection is regex-based and cannot catch every possible secret format. Custom or proprietary credential formats may not be detected. Use the Vault to store secrets securely rather than relying solely on detection to prevent leakage.