kingfisher-scanning

Configures and runs Kingfisher for secret scanning with access mapping: it discovers leaked credentials and maps them to the IAM identities and cloud resources they expose (S3 buckets, RDS instances, etc.). The Intel Hyperscan regex engine keeps scans fast on large monorepos; 950 built-in rules (the largest set among the three OSS scanners covered here); multi-target (local files / Git history / GitHub / GitLab / AWS S3 / Docker images); live API validation plus offline checksum verification; suppression via `--skip-regex` / `--skip-word` / `--baseline-file` / inline `kingfisher:ignore`. Use when cloud-blast-radius context matters or scan time on a large repo is blocking. Access mapping and Hyperscan speed are what set it apart from trufflehog-scanning, which also does multi-target scanning and live validation but offers no IAM access mapping.


Overview

Per github.com/mongodb/kingfisher (cited below as kf-gh):

"Kingfisher is an open source secret scanner and live secret validation tool built in Rust."

Three differentiating capabilities:

Intel Hyperscan + language-aware parsing - aimed at faster scans than regex-only scanners on large repos.
950 built-in rules covering "cloud keys, AI tokens, CI/CD secrets, database credentials, and SaaS API keys" per kf-gh. This is the largest rule set of the three OSS scanners covered here (gitleaks, trufflehog, Kingfisher).
Access mapping - "Maps discovered credentials to their effective cloud identities and exposed resources." This moves the question from "is it a real secret?" to "what does this secret unlock?".

When to use

Large monorepo where gitleaks/trufflehog scan-time becomes blocking.
Team prioritizes rule coverage over tool maturity (Kingfisher is newer than gitleaks/trufflehog).
Cloud-heavy infrastructure where access mapping (which AWS identity does this leaked key represent?) is valuable.
Modern Rust-toolchain shop where adopting Kingfisher fits stack preference.

For battle-tested defaults, gitleaks-scanning or trufflehog-scanning are lower-risk picks.

Step 1 - Install

Per kf-gh:

# Homebrew (Linux/macOS)
brew install kingfisher

# PyPI via uv
uv tool install kingfisher-bin

# Install script (Linux/macOS)
curl -sSL https://raw.githubusercontent.com/mongodb/kingfisher/main/scripts/install-kingfisher.sh | bash

# Docker
docker run --rm -v "$PWD":/src ghcr.io/mongodb/kingfisher:latest scan /src

Step 2 - Basic scan

Per kf-gh:

kingfisher scan /path/to/code --view-report

The --view-report flag opens results in the browser viewer (useful for triage; less useful in CI).

For CI:

kingfisher scan /path/to/code --output kingfisher-report.json --format json

Multi-target scanning per kf-gh:

# Git history
kingfisher git /path/to/repo

# GitHub organization
kingfisher github --org=acme

# AWS S3
kingfisher s3 --bucket my-bucket

# Docker image
kingfisher docker my-image:latest

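(These target forms are illustrative. Verify the exact subcommand and flag syntax with `kingfisher --help` and `kingfisher scan --help` on your installed release; the CLI surface evolves, and some releases expose targets as options or subcommands of `scan` rather than as top-level commands.)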

Step 3 - Live validation

Per kf-gh: "Confirms discovered secrets against provider APIs to reduce false positives."

Same model as TruffleHog's verification - discovered candidates are tested against the provider API to confirm they're real, unexpired credentials. The output distinguishes verified vs unverified findings; CI gating typically uses verified-only for PR-blocking.
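
The verified-only gate described above can be sketched as a small post-processing step. This is a minimal illustration under stated assumptions: the report shape, the field names (`rule`, `path`, `validation`) and the status strings are hypothetical stand-ins, not Kingfisher's actual JSON schema, so map them to whatever your release emits.

```python
import json

# Hypothetical, simplified report. This is NOT Kingfisher's real JSON schema.
SAMPLE_REPORT = """
[
  {"rule": "aws-access-key", "path": "deploy/env.sh", "validation": "active"},
  {"rule": "generic-api-key", "path": "docs/example.md", "validation": "not_attempted"},
  {"rule": "slack-token", "path": "bot/config.py", "validation": "inactive"}
]
"""


def blocking_findings(report_text: str, blocking_status: str = "active") -> list[dict]:
    """Return only the findings whose (assumed) validation status marks them as live."""
    findings = json.loads(report_text)
    return [f for f in findings if f.get("validation") == blocking_status]


def exit_code(report_text: str) -> int:
    """Return 1 if any live credential was found (block the PR), else 0."""
    return 1 if blocking_findings(report_text) else 0


if __name__ == "__main__":
    for finding in blocking_findings(SAMPLE_REPORT):
        print(f"BLOCK: {finding['rule']} in {finding['path']}")
    print("exit code:", exit_code(SAMPLE_REPORT))
```

Unverified findings are not discarded in this design; they simply do not block the PR and can be routed to a slower triage queue.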

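Kingfisher also validates offline. Per kf-gh: "Validates tokens with built-in checksums offline, eliminating many false positives." Some token formats embed a checksum (GitHub's prefixed tokens, such as `ghp_` personal access tokens, end in a CRC32 checksum), so Kingfisher can reject malformed candidates without making an API call. This is fast and leaves no entry in the provider's audit log.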
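
To make the offline-checksum idea concrete, here is a minimal sketch using an invented token format. The `demo_` prefix, the body length and the hex-encoded CRC32 suffix are all assumptions made for illustration; no real provider uses this exact layout, and Kingfisher's own checks are provider-specific.

```python
import secrets
import string
import zlib

ALPHABET = string.ascii_letters + string.digits
PREFIX = "demo_"   # hypothetical prefix, not a real provider's
BODY_LEN = 24      # hypothetical length of the random part
CHECK_LEN = 8      # CRC32 rendered as 8 lowercase hex digits


def _checksum(body: str) -> str:
    """CRC32 of the random body, zero-padded to 8 hex digits."""
    return f"{zlib.crc32(body.encode('utf-8')):08x}"


def make_token() -> str:
    """Issue a token in the invented format: prefix + random body + checksum."""
    body = "".join(secrets.choice(ALPHABET) for _ in range(BODY_LEN))
    return PREFIX + body + _checksum(body)


def looks_valid(candidate: str) -> bool:
    """Offline check: recompute the checksum and compare. No network call is made."""
    if not candidate.startswith(PREFIX):
        return False
    rest = candidate[len(PREFIX):]
    if len(rest) != BODY_LEN + CHECK_LEN:
        return False
    body, check = rest[:BODY_LEN], rest[BODY_LEN:]
    return _checksum(body) == check
```

A random string that merely matches the regex shape of the token fails this check almost every time, which is why checksum validation removes many false positives before any live validation is attempted. It does not prove the credential is still active.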

Step 4 - Access mapping (Kingfisher-distinctive)

Per kf-gh: "Maps discovered credentials to their effective cloud identities and exposed resources."

When a leaked AWS key is discovered, Kingfisher can (with appropriate permissions) report:

The IAM user / role the key belongs to
The IAM policies attached
The actions the key can perform
The resources the key can access

This turns a finding from "leaked AWS key" into "leaked AWS key that could s3:GetObject from prod-customer-data bucket" - a much sharper severity signal.

This requires Kingfisher to have read-only API access to the cloud account; configure via standard cloud-SDK credential mechanisms.
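
One way to turn access-mapping output into a severity signal is sketched below. The `AccessMap` structure, the admin action patterns and the `*prod*` naming convention are assumptions for illustration only; they are not Kingfisher's output format, and a real policy would reflect your own account layout.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class AccessMap:
    """Hypothetical, simplified view of what a leaked key can reach."""
    identity: str
    actions: list[str] = field(default_factory=list)
    resources: list[str] = field(default_factory=list)


ADMIN_ACTIONS = ("*", "*:*", "iam:*")        # assumed policy for this sketch
SENSITIVE_RESOURCE_PATTERNS = ("*prod*",)    # assumed naming convention


def severity(access: AccessMap) -> str:
    """Rank a finding by what the credential unlocks, not just by its type."""
    if any(action in ADMIN_ACTIONS for action in access.actions):
        return "critical"
    touches_sensitive = any(
        fnmatchcase(resource, pattern)
        for resource in access.resources
        for pattern in SENSITIVE_RESOURCE_PATTERNS
    )
    if access.actions and touches_sensitive:
        return "high"
    if access.actions:
        return "medium"
    return "low"


if __name__ == "__main__":
    leaked = AccessMap(
        identity="arn:aws:iam::123456789012:user/ci-deploy",
        actions=["s3:GetObject"],
        resources=["arn:aws:s3:::prod-customer-data/*"],
    )
    print(leaked.identity, "->", severity(leaked))
```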

Step 5 - False-positive triage (MANDATORY)

Per kf-gh suppression options:

| Mechanism | Use |
| --- | --- |
| `--skip-regex '(?i)PATTERN'` | Skip any match that fits the given regex pattern |
| `--skip-word WORD` | Exclude specific words from detection |
| `--baseline-file <path>` | Suppress known findings tracked in a baseline |
| Inline `kingfisher:ignore` | In-code per-line suppression |

Baseline workflow:

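# Create baseline (current state of findings). A plain --output report is not a baseline;
# kf-gh documents --manage-baseline for writing one (verify against your release).
kingfisher scan /path/to/code --manage-baseline --baseline-file kingfisher-baseline.json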

# Apply baseline (only NEW findings fail)
kingfisher scan /path/to/code --baseline-file kingfisher-baseline.json
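
Conceptually, applying a baseline is a set difference over finding fingerprints. The sketch below illustrates that idea only; the fingerprint recipe (rule, path and secret hashed together) is an assumption and is not how Kingfisher computes or stores its baseline entries.

```python
import hashlib

Finding = tuple[str, str, str]  # (rule, path, secret), a hypothetical shape


def fingerprint(rule: str, path: str, secret: str) -> str:
    """Stable ID for a finding. The secret is hashed so the baseline never stores it."""
    material = f"{rule}\0{path}\0{secret}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:16]


def new_findings(current: list[Finding], baseline: set[str]) -> list[Finding]:
    """Keep only findings whose fingerprint is absent from the baseline."""
    return [f for f in current if fingerprint(*f) not in baseline]


if __name__ == "__main__":
    known = ("aws-access-key", "tests/fixtures/aws-credentials.json", "AKIAIOSFODNN7EXAMPLE")
    fresh = ("slack-token", "bot/config.py", "xoxb-not-a-real-token")
    baseline = {fingerprint(*known)}
    for rule, path, _ in new_findings([known, fresh], baseline):
        print(f"NEW: {rule} in {path}")
```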

Justification template (mandatory, either in the baseline file itself or in a sibling kingfisher-baseline-reasons.md):

# kingfisher-baseline-reasons.md
# Each entry in kingfisher-baseline.json should have an entry here.

GHSA-2024-aws-key-fixture:
  files: [tests/fixtures/aws-credentials.json]
  reason: "Test fixture; uses dummy AWS credentials for SDK init tests"
  approved-by: alice@example.com
  re-review-date: 2026-09-15

For inline:

# kingfisher:ignore
# Reason: SDK initialization fixture; not a real credential
# Approved-by: alice@example.com
# Re-review-date: 2026-09-15
DUMMY_AWS_KEY = "AKIAIOSFODNN7EXAMPLE"

Cadence: every quarter, audit kingfisher-baseline-reasons.md and all inline kingfisher:ignore comments. Any suppression whose re-review date has passed must be either re-approved with a new date or removed.
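
The quarterly audit can be partly automated. The sketch below is one possible helper, not part of Kingfisher: it scans text for the `re-review-date: YYYY-MM-DD` convention used in the templates above (case-insensitive, so it covers both the reasons file and inline comments) and reports entries that are past due.

```python
import re
from datetime import date

REVIEW_RE = re.compile(r"re-review-date:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE)


def expired_reviews(text: str, today: date) -> list[tuple[int, date]]:
    """Return (line_number, review_date) for every re-review date before `today`.

    Impossible dates such as 2026-13-40 raise ValueError, so typos surface
    instead of being skipped silently.
    """
    expired = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = REVIEW_RE.search(line)
        if match:
            review = date.fromisoformat(match.group(1))
            if review < today:
                expired.append((lineno, review))
    return expired


SAMPLE = """\
# kingfisher:ignore
# Re-review-date: 2026-09-15
DUMMY_AWS_KEY = "AKIAIOSFODNN7EXAMPLE"
  re-review-date: 2025-01-31
"""

if __name__ == "__main__":
    for lineno, review in expired_reviews(SAMPLE, date.today()):
        print(f"line {lineno}: re-review was due {review.isoformat()}")
```

Running this over the reasons file and every source file containing `kingfisher:ignore` gives the reviewer a concrete list to work through.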

Step 6 - CI integration

Kingfisher is newer; first-party CI integrations are still maturing. Pattern (verify against current docs):

jobs:
  kingfisher:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
        with: { fetch-depth: 0 }
      - run: |
          curl -sSL https://raw.githubusercontent.com/mongodb/kingfisher/main/scripts/install-kingfisher.sh | bash
      - run: kingfisher scan . --baseline-file kingfisher-baseline.json --format json --output kingfisher.json
      - uses: actions/upload-artifact@v4
        if: always()
        with: { name: kingfisher-report, path: kingfisher.json }

Step 7 - Cross-tool layering

Three secret scanners in this plugin overlap deliberately:

| Tool | Sweet spot |
| --- | --- |
| gitleaks | Pre-commit (fastest, mature) |
| trufflehog | High-precision verified findings, multi-source |
| kingfisher | Largest rule set + access mapping; modern stack |

For maximum coverage, run all three in CI and merge their output in a unified triage step so that overlapping findings are reviewed once. For a minimal setup, gitleaks (pre-commit) + trufflehog (CI, verified-only) is the conservative pick.
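
A unified triage step could look like the sketch below. It assumes each tool's report has already been normalized into a common record with `tool`, `path`, `line` and `verified` keys; that normalized shape is hypothetical, and each scanner's native output needs its own adapter.

```python
def merge_findings(findings: list[dict]) -> list[dict]:
    """Collapse findings at the same file and line, recording which tools agreed."""
    grouped: dict[tuple[str, int], dict] = {}
    for f in findings:
        key = (f["path"], f["line"])
        entry = grouped.setdefault(
            key,
            {"path": f["path"], "line": f["line"], "tools": set(), "verified": False},
        )
        entry["tools"].add(f["tool"])
        # A location counts as verified if any tool verified it.
        entry["verified"] = entry["verified"] or bool(f.get("verified", False))
    # Verified locations first, then those flagged by more tools.
    return sorted(
        grouped.values(),
        key=lambda e: (not e["verified"], -len(e["tools"]), e["path"], e["line"]),
    )


if __name__ == "__main__":
    sample = [
        {"tool": "gitleaks", "path": "deploy/env.sh", "line": 3},
        {"tool": "kingfisher", "path": "deploy/env.sh", "line": 3, "verified": True},
        {"tool": "trufflehog", "path": "docs/example.md", "line": 10, "verified": False},
    ]
    for entry in merge_findings(sample):
        tools = ", ".join(sorted(entry["tools"]))
        print(f"{entry['path']}:{entry['line']} verified={entry['verified']} tools={tools}")
```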

Anti-patterns

| Anti-pattern | Why it fails | Fix |
| --- | --- | --- |
| Run kingfisher without baseline | Legacy findings flood PR | Baseline + diff (Step 5) |
| Suppress without REASONS doc | No audit trail | Mandatory template (Step 5) |
| Skip access-mapping config | Findings lack severity context | Configure cloud read-only access (Step 4) |
| Trust unverified findings as real | False positives drown signal | Verified-only filter (when supported by current version) |
| Pin to `kingfisher:latest` Docker tag | Breaking changes mid-release | Pin specific version (`v0.5.0` etc.) |

Limitations

Newer than gitleaks (2018) and trufflehog (2017); ecosystem smaller (fewer plugins, less StackOverflow coverage).
Access mapping requires cloud read-only access - operational setup overhead.
Hyperscan dependency (Intel x86) - limited ARM coverage in early releases (verify current support).
Some custom-rule patterns require Rust-friendly regex (no PCRE-only features).

References

kf-gh (github.com/mongodb/kingfisher) - repository, install, scan commands, key features
mongodb.com/blog (search "kingfisher") - release announcements
gitleaks-scanning, trufflehog-scanning - sister scanners
secrets-rotation-runner - rotation workflow after detection