Compression · C

How to fuzz xz-utils

Since the 2024 backdoor incident, xz-utils has been among the most closely scrutinised compression libraries.

CVE-2024-3094, a backdoor shipped in the 5.6.0 and 5.6.1 release tarballs, showed that xz-utils, which is present in essentially every Linux distribution, is a high-value supply-chain target. Beyond backdoors, its LZMA2 stream parser involves complex block-header arithmetic and filter-chain composition that benefit from continuous automated testing.

Common bug classes

  • Heap buffer overflow in LZMA2 chunk header length arithmetic
  • Integer overflow in multi-block index record parsing
  • Out-of-bounds read in BCJ filter branch offset computation
  • Use-after-free in lzma_stream cleanup on partial-init failure
  • Divide-by-zero in dictionary size validation for tiny presets
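
To make the first bug class concrete, here is a minimal sketch of the length arithmetic in an LZMA2 chunk header. It follows the chunk layout as it is commonly documented (a control byte, then big-endian size fields stored as the value minus one). The chunk_header struct and parse_chunk_header function are hypothetical names used for illustration only; this is not liblzma code or a liblzma API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct for illustration; not a liblzma type. */
typedef struct {
  uint32_t uncompressed_size; /* bytes this chunk decodes to */
  uint32_t compressed_size;   /* payload bytes that follow the header */
  size_t   header_size;       /* bytes occupied by the header itself */
  int      is_lzma;           /* 1 = LZMA chunk, 0 = stored (uncompressed) */
} chunk_header;

/* Returns 1 on success, 0 on the end-of-stream marker,
   -1 on an invalid control byte or a truncated header. */
int parse_chunk_header(const uint8_t *buf, size_t len, chunk_header *out) {
  if (len < 1) return -1;
  uint8_t control = buf[0];

  if (control == 0x00) return 0;        /* end marker */

  if (control < 0x80) {                 /* stored chunk: only 0x01 and 0x02 */
    if (control > 0x02) return -1;
    if (len < 3) return -1;             /* check length BEFORE reading */
    uint32_t size = (((uint32_t)buf[1] << 8) | buf[2]) + 1;
    out->uncompressed_size = size;
    out->compressed_size   = size;
    out->header_size       = 3;
    out->is_lzma           = 0;
    return 1;
  }

  /* LZMA chunk: control values >= 0xC0 carry an extra properties byte. */
  size_t need = (control >= 0xC0) ? 6 : 5;
  if (len < need) return -1;

  /* Low 5 bits of the control byte are bits 16..20 of (size - 1). */
  out->uncompressed_size = (((uint32_t)(control & 0x1F) << 16) |
                            ((uint32_t)buf[1] << 8) | buf[2]) + 1;
  out->compressed_size   = (((uint32_t)buf[3] << 8) | buf[4]) + 1;
  out->header_size       = need;
  out->is_lzma           = 1;
  return 1;
}
```

Every size field is stored minus one, so a parser has to add one after assembling the bytes and has to check the remaining length before each read. Skipping either step is the kind of mistake a fuzzer tends to surface quickly.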

Recommended setup

Fuzzers

  • AFL++
  • libFuzzer
  • Honggfuzz

Sanitizers

  • ASan
  • UBSan
  • MSan

Harness scaffold

#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>
#include <lzma.h>

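int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
  lzma_stream strm = LZMA_STREAM_INIT;
  /* Auto-detect .xz or .lzma input; cap decoder memory at 64 MiB. */
  if (lzma_auto_decoder(&strm, 64 * 1024 * 1024, 0) != LZMA_OK) return 0;
  strm.next_in  = data;
  strm.avail_in = size;
  uint8_t out[65536];
  lzma_ret ret;
  do {
    /* Reuse the output buffer; the decoded bytes are discarded. */
    strm.next_out  = out;
    strm.avail_out = sizeof(out);
    ret = lzma_code(&strm, LZMA_FINISH);
    /* Keep draining after the input is consumed: the decoder can still
       hold pending output. Truncated input ends the loop with
       LZMA_BUF_ERROR once no further progress is possible. */
  } while (ret == LZMA_OK);
  lzma_end(&strm);
  return 0;
}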

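Save this as fuzz_target.c, build it with your compiler, fuzzer, and sanitizer flags, link against liblzma, and you have a working starting point. If you compile it as C++ instead, declare LLVMFuzzerTestOneInput as extern "C" so the fuzzer runtime can find it.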

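Notable CVEs

  • CVE-2024-3094 (malicious backdoor in the 5.6.0 and 5.6.1 releases; discovered through manual investigation of slow SSH logins, not by fuzzing)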