Glossary

Seed Corpus

The initial set of valid, well-structured inputs provided to a fuzzer before it begins mutation.

A seed corpus is the set of human-curated or programmatically collected inputs used to bootstrap a fuzzer. For a fuzzer targeting a PNG decoder, good seeds are valid PNG files that cover a variety of color modes, compression levels, and chunk types. For a TLS implementation, seeds might be recorded handshake transcripts. A strong seed corpus can substantially reduce the time a fuzzer needs to reach deep code paths, because it starts from inputs the parser already accepts rather than spending millions of iterations constructing a parseable prefix from random bytes. Seeds should be deduplicated before use: duplicate inputs waste corpus space and skew mutation probability toward the duplicated content. Tools like `afl-cmin` and libFuzzer's `-merge=1` mode go further than byte-level deduplication, pruning seeds that add no new coverage while keeping a subset that preserves the coverage of the full set.
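As a rough illustration of the deduplication step, the sketch below drops byte-identical seeds by hashing file contents. It is only a minimal pre-filter under stated assumptions: the function name `dedupe_seeds` and the directory layout are hypothetical, and it does not measure coverage, so it is not a substitute for `afl-cmin` or `-merge=1`.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def dedupe_seeds(seed_dir: Path, out_dir: Path) -> list[Path]:
    """Copy one file per distinct byte content from seed_dir into out_dir.

    Hypothetical helper for illustration: it removes exact duplicates only
    and knows nothing about the coverage each seed produces.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    seen: set[str] = set()
    kept: list[Path] = []

    # Sort the listing so the result is deterministic across runs.
    for path in sorted(p for p in seed_dir.iterdir() if p.is_file()):
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            continue  # byte-identical to a seed that was already kept
        seen.add(digest)

        # Name each kept seed after its hash so names cannot collide.
        dest = out_dir / digest
        dest.write_bytes(data)
        kept.append(dest)

    return kept
```

The deduplicated directory could then be passed to a coverage-based minimizer, for example `afl-cmin -i unique -o minimized -- ./target @@` or `./fuzz_target -merge=1 minimized unique`, where `./target`, `./fuzz_target`, and the directory names are placeholders rather than names from any particular project.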