atheris-python-fuzzing
Author and run Atheris - Google's Python coverage-guided fuzzer built on libFuzzer. Covers pip installation, atheris.Setup + atheris.Fuzz invocation, TestOneInput(data: bytes) target signature, FuzzedDataProvider for structured input, instrument_imports() / instrument_func decorators for coverage instrumentation, and libFuzzer-passthrough flags (-atheris_runs, -max_total_time, -dict). Use for fuzzing Python libraries - also supports CPython native-extension fuzzing.
atheris-python-fuzzing
Overview
Atheris (per github.com/google/atheris) supports both pure-Python and native-extension targets (CPython C extensions).
For sanitiser pairing on native extensions, see sanitiser-integration-reference; for corpus discipline see corpus-management-reference.
When to use
Authoring
Install
pip install atherisPer the Atheris README, prebuilt wheels include libFuzzer for pure-Python fuzzing. Native-extension fuzzing may require building from source so the Clang and libFuzzer versions match.
Basic fuzz target
# fuzz_parser.py
import sys
import atheris
with atheris.instrument_imports():
from my_library import parser
def TestOneInput(data):
parser.parse(data)
atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()Per Atheris README:
Coverage instrumentation
Atheris needs to instrument the modules under test:
with atheris.instrument_imports():
from my_library import parser, decoderThe instrument_imports() context manager monkey-patches the import system so subsequent imports are coverage-instrumented. Module-level imports above this context manager are NOT instrumented - the fuzzer is blind to their code.
Alternative: per-function instrumentation:
import my_library
my_library.parser.parse = atheris.instrument_func(my_library.parser.parse)Or instrument-all (heavyweight):
atheris.instrument_all()FuzzedDataProvider
For structured input, use the Python equivalent of libFuzzer's helper:
def TestOneInput(data):
fdp = atheris.FuzzedDataProvider(data)
port = fdp.ConsumeInt(4) # signed 4-byte int
is_https = fdp.ConsumeBool()
host = fdp.ConsumeUnicode(64) # up to 64 chars
body_size = fdp.ConsumeIntInRange(0, 1024)
body = fdp.ConsumeBytes(body_size)
parser.parse_request(host, port, is_https, body)Per Atheris README, the provider exposes ConsumeInt, ConsumeUnicode, ConsumeFloat, ConsumeBool, PickValueInList, and related methods.
Running
Basic run
python fuzz_parser.pyAtheris by default runs indefinitely. Pass libFuzzer-style flags:
python fuzz_parser.py -max_total_time=300 corpus/The trailing directory is the corpus (read + write). Subsequent directories are read-only seeds.
Common flags
Per Atheris README, all libFuzzer flags pass through:
| Flag | Effect |
|---|---|
-max_total_time=N | Stop after N seconds |
-atheris_runs=N | Run N iterations then stop (also enables coverage report) |
-dict=path | Use dictionary file |
-seed=N | Random seed |
-runs=N | libFuzzer runs (use -atheris_runs for Atheris-specific) |
Coverage report
python fuzz_parser.py -atheris_runs=100000 corpus/
# At end: prints coverage statisticsReproducing a crash
python fuzz_parser.py crash-<sha1>
# Same crash with full tracebackParsing results
Python tracebacks instead of sanitiser reports (unless instrumenting a CPython extension built with ASan):
[+] Loading binary contents from crash-abc123
=== Uncaught Python exception: ===
ValueError: invalid syntax
Traceback (most recent call last):
File "fuzz_parser.py", line 12, in TestOneInput
parser.parse(data)
File "/path/my_library/parser.py", line 47, in parse
return json.loads(text)
...Map the traceback to a bug spec via bug-report-from-failure.
CI integration
- uses: actions/setup-python@v6
with: { python-version: '3.12' }
- run: pip install atheris
- name: Smoke fuzz (3 min)
run: timeout 180 python fuzz_parser.py -max_total_time=180 corpus/ || true
- uses: actions/upload-artifact@v4
with:
name: atheris-crashes
path: crash-*Anti-patterns
| Anti-pattern | Why it fails | Fix |
|---|---|---|
Module imports above instrument_imports() | Coverage signal absent for those modules | Always import via with atheris.instrument_imports(): ... |
No exception handling in TestOneInput | Expected exceptions (ValueError on bad input) count as crashes | Catch expected exceptions; only let unexpected ones propagate |
| Pure-Python target without instrumentation | Coverage is blind; fuzzer flailing | Always instrument |
Missing atheris.Fuzz() call | Fuzz loop never starts | Always end with atheris.Fuzz() |
| Treating every traceback as a bug | Many tracebacks are spec-compliant (raising ValueError on invalid input is correct) | Use assert for invariants; let spec-defined exceptions through |
| Native extension without ASan | C bugs silent (segfault crashes Python interpreter) | Build CPython + extension with ASan for native fuzzing |