libfuzzer-cpp

Author and run LLVM libFuzzer for C/C++ - in-process coverage-guided fuzzing. Covers harness authoring (LLVMFuzzerTestOneInput entry point), build with -fsanitize=fuzzer,address,undefined, runtime flags (-max_total_time, -runs, -dict, -fork, -workers), corpus + crash-artefact handling, and CI integration. Use for libraries / parsers / decoders in C/C++ where in-process fuzzing of a function is the right scope. Compose with ASan + UBSan from sanitiser-integration-reference and corpus discipline from corpus-management-reference.

Overview

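This skill wraps LLVM's libFuzzer (per llvm.org/docs/LibFuzzer.html) for C/C++ targets. It composes with sanitiser-integration-reference (ASan + UBSan) and corpus-management-reference (corpus discipline).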

When to use

  • Fuzzing a C / C++ library function (parser, decoder, validator).
  • Targeting a specific function - in-process fuzzing is faster than out-of-process AFL.
  • Pairing with ASan + UBSan for memory-safety + UB detection.

Authoring

The fuzz target

Define the entry point LLVMFuzzerTestOneInput:

#include <cstddef>
#include <cstdint>
#include "your_library.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    your_parser(Data, Size);
    return 0;
}

Per LLVM docs, the function signature is fixed: it takes a const byte buffer plus its size and returns int. Return 0 for normal execution; returning -1 tells libFuzzer not to add the input to the corpus, and all other values are reserved.

The fuzzer calls this function repeatedly with mutated Data. The target's job is to drive the library code under test and let sanitisers + asserts catch bugs.
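To make the "sanitisers + asserts" point concrete, here is a minimal sketch of a target that checks a round-trip invariant. HexEncode and HexDecode are hypothetical stand-ins for a real encoder/decoder pair, not part of any library mentioned here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical library under test: hex-encode bytes and decode them back.
static std::string HexEncode(const uint8_t *data, size_t size) {
    static const char kDigits[] = "0123456789abcdef";
    std::string out;
    out.reserve(size * 2);
    for (size_t i = 0; i < size; ++i) {
        out.push_back(kDigits[data[i] >> 4]);
        out.push_back(kDigits[data[i] & 0x0f]);
    }
    return out;
}

static int HexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

static bool HexDecode(const std::string &hex, std::string *out) {
    if (hex.size() % 2 != 0) return false;
    out->clear();
    for (size_t i = 0; i < hex.size(); i += 2) {
        const int hi = HexValue(hex[i]);
        const int lo = HexValue(hex[i + 1]);
        if (hi < 0 || lo < 0) return false;
        out->push_back(static_cast<char>((hi << 4) | lo));
    }
    return true;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    const std::string encoded = HexEncode(Data, Size);
    std::string decoded;
    const bool ok = HexDecode(encoded, &decoded);
    // Invariants: decoding our own output succeeds and reproduces the input.
    // A failed assert aborts the process, which libFuzzer records as a crash.
    assert(ok);
    assert(decoded.size() == Size);
    assert(Size == 0 || std::memcmp(decoded.data(), Data, Size) == 0);
    (void)ok;  // Silences unused-variable warnings if asserts are compiled out.
    return 0;
}
```

Build fuzz targets like this without -DNDEBUG, otherwise the asserts compile away and only the sanitisers remain.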

Initialisation

Optional one-time setup:

extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
    your_library_init();
    return 0;
}
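
As a fuller sketch of what one-time setup might look like, the example below builds a lookup table once in LLVMFuzzerInitialize and only reads it from the target. The CRC-32 routine and the g_crc_table global are illustrative assumptions standing in for your_library_init() and the code it prepares.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical expensive setup: a CRC-32 lookup table built once per process.
static std::vector<uint32_t> g_crc_table;

extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
    (void)argc;  // Unused in this sketch.
    (void)argv;
    g_crc_table.resize(256);
    for (uint32_t i = 0; i < 256; ++i) {
        uint32_t c = i;
        for (int k = 0; k < 8; ++k) {
            c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
        }
        g_crc_table[i] = c;
    }
    return 0;
}

// Stand-in for the library function under test; it only reads the table.
static uint32_t Crc32(const uint8_t *data, size_t size) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < size; ++i) {
        crc = g_crc_table[(crc ^ data[i]) & 0xFFu] ^ (crc >> 8);
    }
    return crc ^ 0xFFFFFFFFu;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    assert(g_crc_table.size() == 256);  // Initialisation must have run first.
    (void)Crc32(Data, Size);
    return 0;
}
```

Because the table is written once and never modified afterwards, it does not count as cross-input state.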

Build

Standard build command (clang++ because the sources are C++):

clang++ -g -O1 \
  -fsanitize=fuzzer,address,undefined \
  -fno-sanitize-recover=all \
  -fno-omit-frame-pointer \
  fuzz_target.cc your_library.cc -o fuzz_target

Per sanitiser-integration-reference: ASan + UBSan is the default pair; add MSan in a separate binary if needed, since MSan cannot be combined with ASan.

Tips for an effective target

Tip | Why
Keep the target small | Faster iteration; clearer coverage
Avoid global state between runs | Cross-input contamination defeats coverage guidance
Use the full input | Don't bail out early (e.g. if (Size < 100) return 0;) unless the library requires a minimum length
Avoid expensive I/O / network in the target | Slows iterations
Use FuzzedDataProvider for structured inputs | Splits Data into typed sub-values

FuzzedDataProvider (from LLVM's compiler-rt/include/fuzzer/FuzzedDataProvider.h):

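#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

#include <fuzzer/FuzzedDataProvider.h>
#include "your_library.h"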

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    FuzzedDataProvider fdp(Data, Size);
    int port = fdp.ConsumeIntegralInRange(1, 65535);
    std::string host = fdp.ConsumeRandomLengthString(64);
    std::vector<uint8_t> body = fdp.ConsumeRemainingBytes<uint8_t>();
    your_request_handler(host, port, body);
    return 0;
}
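
The "avoid global state between runs" tip can be sketched as follows: construct any stateful object inside the call so each input starts from a clean slate. ByteHistogram is a hypothetical stateful component used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stateful component: counts how often each byte value was seen.
class ByteHistogram {
public:
    void Feed(const uint8_t *data, size_t size) {
        for (size_t i = 0; i < size; ++i) ++counts_[data[i]];
    }
    size_t DistinctBytes() const { return counts_.size(); }

private:
    std::map<uint8_t, size_t> counts_;
};

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    // Fresh object per call: nothing carries over from earlier inputs.
    // A `static ByteHistogram` here would make results depend on input history.
    ByteHistogram histogram;
    histogram.Feed(Data, Size);
    // With fresh state, distinct byte values can never exceed the input length.
    assert(histogram.DistinctBytes() <= Size);
    return 0;
}
```

With a static histogram the invariant would fail as soon as a short input followed a longer one, and the resulting crash would not reproduce from its artefact alone.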

Running

Basic run

mkdir corpus/ seeds/
# Populate seeds/ with hand-curated inputs
./fuzz_target -max_total_time=3600 corpus/ seeds/

libFuzzer writes new inputs only to the first directory (the evolved corpus); any further directories are read as seeds and left unmodified (per corpus-management-reference).

Common flags

Per llvm.org/docs/LibFuzzer.html:

Flag | Effect
-max_total_time=N | Stop after N seconds
-runs=N | Stop after N executions (-1 = infinite)
-dict=path | Use dictionary file
-seed=N | Random seed (0, the default, picks one automatically)
-fork=N | Experimental fork mode: fuzz in up to N concurrent child processes
-workers=N | Number of simultaneous worker processes used to run the jobs
-jobs=N | Total number of fuzzing jobs to run to completion across workers
-merge=1 | Merge mode: copy only coverage-adding inputs into the first corpus directory (corpus minimisation)
-print_final_stats=1 | Print stats summary on exit
-rss_limit_mb=N | RSS memory limit in MB (default 2048)
-timeout=N | Per-input timeout in seconds (default 1200)
-only_ascii=1 | Generate only ASCII inputs

Parallel fuzzing

./fuzz_target -fork=8 -max_total_time=3600 corpus/ seeds/

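-fork=N is an experimental mode in which the top-level process does no fuzzing itself: it spawns up to N concurrent child processes, hands each a small random subset of the corpus, and merges whatever each child finds back into the main corpus when it exits. Run -merge=1 periodically to minimise the accumulated corpus.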

Reproducing a crash

./fuzz_target crash-<sha1>
# Sanitiser report prints to stderr; same as the original crash
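
When the libFuzzer runtime is not available (for example under a debugger build made with another compiler), an artefact can still be replayed by reading the file and calling the target directly. The following is a minimal sketch in the spirit of the standalone driver that ships with libFuzzer's sources; ReplayArtefact is a hypothetical helper name and the target shown is a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Placeholder target so the sketch is self-contained; link your real one instead.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
    assert(Data != nullptr || Size == 0);
    return 0;
}

// Reads one artefact file and passes its bytes to the fuzz target.
// Returns false if the file cannot be opened.
bool ReplayArtefact(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    const std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
                                  std::istreambuf_iterator<char>());
    LLVMFuzzerTestOneInput(reinterpret_cast<const uint8_t *>(bytes.data()),
                           bytes.size());
    return true;
}
```

Calling ReplayArtefact("crash-<sha1>") from a small main() then reproduces the failure inside whatever process hosts the target.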

Minimise the crash input:

./fuzz_target -minimize_crash=1 -runs=10000 crash-<sha1>
# Writes the smallest reproducer found as minimized-from-<sha1>

Dictionary file

For structured formats:

# fuzz.dict
"{"
"}"
"["
"]"
"true"
"false"
"null"
"\":\""

Invoke with: ./fuzz_target -dict=fuzz.dict corpus/
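
Dictionary entries are quoted strings in which backslash and double quote are backslash-escaped and arbitrary bytes can be written as \xNN, which is why the last entry above reads "\":\"". When tokens come from a grammar or test fixtures, a small generator avoids escaping mistakes. This is a sketch; DictEntry is a hypothetical helper, not a libFuzzer API.

```cpp
#include <cstdio>
#include <string>

// Formats one token as a quoted dictionary line.
// Backslash and double quote are backslash-escaped; bytes outside printable
// ASCII are written as \xNN.
std::string DictEntry(const std::string &token) {
    std::string line = "\"";
    for (unsigned char c : token) {
        if (c == '\\' || c == '"') {
            line += '\\';
            line += static_cast<char>(c);
        } else if (c < 0x20 || c > 0x7e) {
            char buf[5];  // "\xNN" plus terminating NUL.
            std::snprintf(buf, sizeof(buf), "\\x%02X", static_cast<unsigned>(c));
            line += buf;
        } else {
            line += static_cast<char>(c);
        }
    }
    line += '"';
    return line;
}
```

Writing one DictEntry(...) result per line produces a file that can be passed to -dict=.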

Parsing results

libFuzzer crash artefacts are saved as:

  • crash-<sha1> - segfault / sanitiser-detected error
  • leak-<sha1> - memory leak detected by LSan
  • timeout-<sha1> - exceeded -timeout
  • oom-<sha1> - RSS exceeded -rss_limit_mb

Each file's contents are the input bytes that triggered the crash. Pair with the sanitiser report (stderr) for stack + allocation site.
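
For triage tooling, the artefact prefixes listed above are enough to route a file to the right handling path. A minimal sketch, with ArtefactKind and ClassifyArtefact as hypothetical names:

```cpp
#include <string>

enum class ArtefactKind { kCrash, kLeak, kTimeout, kOom, kUnknown };

// Maps an artefact file name (without its directory part) to a kind by prefix.
ArtefactKind ClassifyArtefact(const std::string &file_name) {
    struct Prefix {
        const char *text;
        ArtefactKind kind;
    };
    static const Prefix kPrefixes[] = {
        {"crash-", ArtefactKind::kCrash},
        {"leak-", ArtefactKind::kLeak},
        {"timeout-", ArtefactKind::kTimeout},
        {"oom-", ArtefactKind::kOom},
    };
    for (const Prefix &p : kPrefixes) {
        // rfind(text, 0) == 0 is true only when file_name starts with text.
        if (file_name.rfind(p.text, 0) == 0) return p.kind;
    }
    return ArtefactKind::kUnknown;
}
```

A CI script could use the result to, say, fail the build on crashes and leaks while only warning on timeouts.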

For automated triage (e.g., filing a bug), feed the sanitiser report to bug-report-from-failure:

./fuzz_target crash-<sha1> 2> sanitiser-report.txt
python scripts/file-bug-from-asan.py sanitiser-report.txt crash-<sha1>

CI integration

Short smoke fuzz on every PR:

jobs:
  fuzz:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - name: Install clang
        run: sudo apt-get update && sudo apt-get install -y clang lld
      - name: Build fuzz target
        run: |
          clang++ -g -O1 \
            -fsanitize=fuzzer,address,undefined \
            -fno-sanitize-recover=all \
            -fno-omit-frame-pointer \
            fuzz/fuzz_target.cc lib/parser.cc -o fuzz_target
      - uses: actions/cache@v4
        with:
          path: fuzz/corpus
          key: fuzz-corpus-${{ github.sha }}
          restore-keys: fuzz-corpus-
      - name: Smoke fuzz (5 min)
        run: ./fuzz_target -max_total_time=300 fuzz/corpus fuzz/seeds
      - name: Upload crashes
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: crashes
          path: |
            crash-*
            leak-*
            timeout-*
            oom-*

For long-running campaigns, ossfuzz-integration is the canonical infrastructure.

Anti-patterns

Anti-pattern | Why it fails | Fix
LLVMFuzzerTestOneInput with global state mutation | Cross-input contamination breaks coverage signal | Reset state per call, or move one-time setup into LLVMFuzzerInitialize
Fuzz target without ASan + UBSan | Catches only hard crashes; memory errors and undefined behaviour that don't crash go unnoticed | Always compose with sanitisers
No corpus minimisation ever | Corpus grows unbounded; cycle time degrades | Run -merge=1 regularly (e.g. weekly)
Crash committed without minimisation | Large bug-report attachments | Run -minimize_crash=1 before filing
Single huge fuzz target | Slow iterations; coverage attribution opaque | Split into multiple targets per function
Ignoring -rss_limit_mb OOMs | OOM artefacts pile up as a misleading crash class | Set the limit explicitly, or exclude allocation-heavy paths from the target
No dictionary for structured formats | Fuzzer slowly rediscovers grammar | Always supply -dict= for JSON / XML / SQL / proto

Limitations

  • In-process only. Doesn't fuzz inter-process boundaries; for network protocols or binary-only targets, see out-of-process fuzzers such as AFL++ (its -Q QEMU mode covers binary-only targets) or specialised tools.
  • C / C++, Rust, and Swift only. Other languages have their own fuzzers (Atheris for Python, Jazzer for the JVM, Go's native fuzzing).
  • Coverage instrumentation overhead. Hot inner loops slow significantly under -fsanitize=fuzzer.
  • Crash uniqueness heuristic. libFuzzer dedup is sha1-based on the input; the same bug from two inputs creates two artefacts - pair with crash-stack-deduplication tooling.
  • No coverage report by default. Pass -print_coverage=1 for a text summary at exit, or build with -fprofile-instr-generate -fcoverage-mapping and use llvm-profdata + llvm-cov for line-level coverage.
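
For the crash-uniqueness limitation above, one common approach is to bucket artefacts by the top few stack frames of the sanitiser report rather than by input hash. The sketch below assumes symbolised frames shaped like "#0 0x4f5a2b in ParseHeader /src/lib/parser.cc:42:13"; StackKey is a hypothetical helper, and real reports may need more robust parsing.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Builds a bucketing key from the first `max_frames` frames of a sanitiser
// report. Keeps only function names, dropping addresses, argument lists,
// and source locations, so the same bug reached via different inputs
// yields the same key.
std::string StackKey(const std::string &report, size_t max_frames) {
    std::istringstream lines(report);
    std::string line;
    std::string key;
    size_t frames = 0;
    while (frames < max_frames && std::getline(lines, line)) {
        const size_t first = line.find_first_not_of(" \t");
        if (first == std::string::npos || line[first] != '#') continue;
        const size_t in_pos = line.find(" in ", first);
        if (in_pos == std::string::npos) continue;
        const size_t start = in_pos + 4;
        // The function name ends at the argument list or the next space.
        const size_t end = line.find_first_of(" (", start);
        if (!key.empty()) key += '|';
        key += line.substr(start, end == std::string::npos ? end : end - start);
        ++frames;
    }
    return key;
}
```

Keep max_frames small (three to five is a typical choice) so the key is taken from the crashing stack and not from the allocation or deallocation stacks that follow it in an ASan report.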

References