XZ Backdoor (CVE-2024-3094)

Background

In March 2024, a Microsoft engineer noticed SSH logins taking 500 milliseconds longer than usual. That observation unraveled one of the most sophisticated supply chain attacks in open-source history.

Supply chain attack — an attack that compromises software by targeting its dependencies or build process, rather than the software itself. The attacker poisons a library that many other projects rely on.

A malicious maintainer had spent two years building trust in the XZ Utils project, whose compression library liblzma is loaded by the SSH daemon on many Linux distributions (indirectly, through libsystemd). The backdoored liblzma, shipped in XZ Utils 5.6.0 and 5.6.1, hijacked OpenSSH’s RSA signature verification to execute arbitrary commands on targeted systems. The malicious functions were hand-written to blend in with liblzma’s legitimate compression code. No obvious symbols, no suspicious strings, no telltale exports. This is the hardest class of reverse engineering problem: a real-world implant buried inside a large, legitimate codebase.

The challenge

Analyzing the stripped liblzma.so binary means:

- No symbols: every function is FUN_XXXXXXXX
- 558 functions enumerated, 396 queued for analysis
- Malicious code is ~1% of the binary: a needle in a haystack
- Deliberately camouflaged: the backdoor mimics legitimate LZMA patterns

A skilled reverse engineer would typically spend days working through this binary manually. Finding the implant through static analysis — without symbols, without source, without knowing what to look for — is the kind of task that defines expert-level RE work.

Kong’s results

Kong analyzed the binary in 15 minutes for $6.63.

| Metric | Value |
| --- | --- |
| Functions analyzed | 355 / 396 |
| High confidence (80%+) | 308 (87%) |
| Medium confidence (60-79%) | 33 (9%) |
| Low confidence (under 60%) | 14 (4%) |
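
The percentages in the table are shares of the 355 analyzed functions, not of the 396 queued. The short check below reproduces them, assuming rounding to the nearest whole percent (the rounding rule is an assumption; only the counts come from the table).

```python
# Counts taken from the results table.
analyzed = 355
queued = 396
buckets = {"high": 308, "medium": 33, "low": 14}


def share(count: int, total: int) -> int:
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# The three confidence buckets should account for every analyzed function.
assert sum(buckets.values()) == analyzed

shares = {name: share(count, analyzed) for name, count in buckets.items()}
coverage = share(analyzed, queued)  # fraction of the queue that was analyzed

print(shares)
print(f"coverage of queued functions: {coverage}%")
```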

Kill chain — fully reconstructed

Kong independently identified all five core backdoor functions and reconstructed the attack chain with no prior knowledge of CVE-2024-3094:

| Function | Confidence | Role |
| --- | --- | --- |
| `init_rsa_public_decrypt` | 95% | Parses ELF dynamic symbols at load time to locate `RSA_public_decrypt` |
| `function_hook_replace` | 90% | Overwrites the GOT entry: changes memory protection, swaps pointer, restores permissions |
| `rsa_public_decrypt_wrapper` | 95% | The hook: intercepts RSA verification, checks for root + magic value, decrypts payload |
| `initialize_cipher_context` | 92% | Sets up ChaCha20 state with 256-bit key and 96-bit nonce |
| `chacha20_encrypt` | 95% | Decrypts shellcode embedded in RSA signature data |
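
The last two rows can look odd at first: a function named `chacha20_encrypt` is described as decrypting. ChaCha20 is a stream cipher, so encryption and decryption are the same XOR against a keystream. The sketch below follows the RFC 8439 layout (4 constant words, 8 key words, a 32-bit counter, 3 nonce words), which matches the 256-bit key and 96-bit nonce in the table. It is an illustration of the standard algorithm, not code recovered from the implant; the function names are made up here, and whether the implant uses exactly this counter layout is an assumption.

```python
import struct

MASK = 0xFFFFFFFF
# ASCII "expand 32-byte k" read as four little-endian 32-bit words (RFC 8439).
CONSTANTS = (0x61707865, 0x3320646E, 0x79622D32, 0x6B206574)


def init_state(key: bytes, nonce: bytes, counter: int = 0) -> list:
    """Build the 16-word state: 4 constants, 8 key words, 1 counter, 3 nonce words."""
    if len(key) != 32 or len(nonce) != 12:
        raise ValueError("ChaCha20 needs a 32-byte key and a 12-byte nonce")
    return [*CONSTANTS, *struct.unpack("<8I", key),
            counter & MASK, *struct.unpack("<3I", nonce)]


def _rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def _quarter_round(s: list, a: int, b: int, c: int, d: int) -> None:
    """Apply the ChaCha quarter round to four words of the state, in place."""
    s[a] = (s[a] + s[b]) & MASK
    s[d] = _rotl(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = _rotl(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK
    s[d] = _rotl(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK
    s[b] = _rotl(s[b] ^ s[c], 7)


def keystream_block(state: list) -> bytes:
    """Run 20 rounds (10 column/diagonal pairs) and add the input state back in."""
    w = list(state)
    for _ in range(10):
        for a, b, c, d in ((0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
                           (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)):
            _quarter_round(w, a, b, c, d)
    return struct.pack("<16I", *((w[i] + state[i]) & MASK for i in range(16)))


def chacha20_xor(key: bytes, nonce: bytes, data: bytes, counter: int = 0) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 64):
        block = keystream_block(init_state(key, nonce, counter + offset // 64))
        out.extend(b ^ k for b, k in zip(data[offset:offset + 64], block))
    return bytes(out)
```

Calling `chacha20_xor` twice with the same key, nonce, and counter returns the original bytes, which is why a single routine can serve both directions.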

GOT hijacking — the Global Offset Table (GOT) stores addresses of dynamically linked functions. By overwriting a GOT entry, an attacker can redirect calls to a legitimate function (like RSA_public_decrypt) through their own code first.
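
A toy model can make the mechanism concrete. The sketch below represents the GOT as a Python dictionary and page protection as a boolean flag. It mirrors the three steps attributed to `function_hook_replace` (change protection, swap pointer, restore permissions), but everything in it is hypothetical: the class, the `MAGIC` prefix, and the return strings are illustration only, and no real memory is modified.

```python
from typing import Callable, Dict

Handler = Callable[[bytes], str]
MAGIC = b"\xde\xad"  # hypothetical trigger prefix, not the real backdoor's value


def rsa_public_decrypt(data: bytes) -> str:
    """Stand-in for the legitimate library function."""
    return f"verified {len(data)} bytes"


class ToyGot:
    """Toy GOT: symbol name -> callable, plus a flag standing in for page protection."""

    def __init__(self) -> None:
        self._slots: Dict[str, Handler] = {}
        self.writable = True  # the loader fills the table before locking it

    def read(self, symbol: str) -> Handler:
        return self._slots[symbol]

    def write(self, symbol: str, target: Handler) -> None:
        if not self.writable:
            raise PermissionError("GOT page is read-only")
        self._slots[symbol] = target

    def call(self, symbol: str, data: bytes) -> str:
        # Every call through the table uses whatever pointer is stored right now.
        return self._slots[symbol](data)


def install_hook(got: ToyGot, symbol: str) -> None:
    """Replace a GOT slot with a wrapper that still forwards ordinary calls."""
    original = got.read(symbol)

    def hook(data: bytes) -> str:
        if data.startswith(MAGIC):
            return "intercepted"  # attacker-controlled path
        return original(data)     # everyone else sees normal behavior

    got.writable = True      # 1. make the page writable
    got.write(symbol, hook)  # 2. swap the pointer
    got.writable = False     # 3. restore the original protection


got = ToyGot()
got.write("RSA_public_decrypt", rsa_public_decrypt)
got.writable = False  # loader is done; table is now read-only
install_hook(got, "RSA_public_decrypt")

print(got.call("RSA_public_decrypt", b"abc"))
print(got.call("RSA_public_decrypt", MAGIC + b"payload"))
```

Because the hook forwards every call that does not carry the trigger, callers see no change in behavior, which is part of what makes this kind of redirection hard to notice.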

Kong’s analysis of the hook function:

“XZ backdoor: intercepts RSA_public_decrypt. When running as root and magic matches, decrypts and executes shellcode via ChaCha20.”

Supporting infrastructure

Beyond the five core functions, Kong correctly identified the backdoor’s supporting infrastructure:

- ELF dynamic section parsing: reading the binary’s own symbol table at runtime
- /proc/self/maps reads: checking memory permissions before modifying the GOT
- dladdr1-based symbol resolution: finding function addresses by name at runtime

All of these were correctly classified as part of the implant’s runtime hooking mechanism, not legitimate liblzma functionality.
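
To illustrate the permission check, here is a minimal sketch of how the protection bits for an address can be read from text in the `/proc/self/maps` format. The sample addresses and library path are invented for the example, and the helper names are hypothetical; this is not a reconstruction of the implant's own parser.

```python
from typing import NamedTuple, Optional


class Mapping(NamedTuple):
    start: int   # inclusive start address
    end: int     # exclusive end address
    perms: str   # e.g. "r-xp" or "rw-p"
    path: str    # backing file, or "" for anonymous mappings


def parse_maps_line(line: str) -> Mapping:
    """Parse one line: 'start-end perms offset dev inode [path]'."""
    fields = line.split(maxsplit=5)
    start_hex, end_hex = fields[0].split("-")
    path = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(int(start_hex, 16), int(end_hex, 16), fields[1], path)


def find_mapping(maps_text: str, address: int) -> Optional[Mapping]:
    """Return the mapping that contains the address, or None if unmapped."""
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        mapping = parse_maps_line(line)
        if mapping.start <= address < mapping.end:
            return mapping
    return None


def is_writable(maps_text: str, address: int) -> bool:
    """True if the page holding the address is mapped with write permission."""
    mapping = find_mapping(maps_text, address)
    return mapping is not None and "w" in mapping.perms


# Invented sample in the /proc/<pid>/maps format.
SAMPLE = """\
7f1c2a000000-7f1c2a021000 r--p 00000000 08:01 1311 /usr/lib/liblzma.so.5
7f1c2a021000-7f1c2a022000 rw-p 00021000 08:01 1311 /usr/lib/liblzma.so.5
7f1c2a022000-7f1c2a024000 rw-p 00000000 00:00 0
"""

print(is_writable(SAMPLE, 0x7F1C2A000010))
print(is_writable(SAMPLE, 0x7F1C2A021000))
```

On a Linux system the same functions can be pointed at the live file by passing `open("/proc/self/maps").read()` as `maps_text`.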

Legitimate code

Kong also correctly recovered the full breadth of liblzma’s real functionality:

- LZMA/LZMA2 encoders and decoders
- Match finders and range coders
- Streaming state machines
- CRC32/CRC64 (including CLMUL-accelerated variants)
- SHA-256
- XZ container format handling
- Branch-call-jump filters for x86, ARM64, and RISC-V

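Five functions were flagged for potential control flow flattening, an obfuscation technique that routes every basic block through a central dispatcher. Kong’s reasoning correctly dismissed all five as false positives: they are legitimate resumable state machines, with between 7 and 23 states each, that liblzma’s streaming API requires.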
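
The resemblance is easy to see in a small example. A streaming decoder has to stop wherever its input chunk ends and pick up again on the next call, so it keeps an explicit state variable and dispatches on it inside a loop. That is the same shape a flattening obfuscator produces. The decoder below is a hypothetical two-state toy for length-prefixed records, written for illustration; liblzma's real decoders are written in C and have many more states.

```python
from enum import Enum, auto


class State(Enum):
    LENGTH = auto()   # next byte is a record length
    PAYLOAD = auto()  # reading the bytes of the current record


class RecordDecoder:
    """Decode 1-byte-length-prefixed records from arbitrarily split chunks."""

    def __init__(self) -> None:
        self.state = State.LENGTH
        self.remaining = 0
        self.current = bytearray()
        self.records = []

    def feed(self, chunk: bytes) -> None:
        pos = 0
        # Dispatcher loop: each pass looks at the saved state and runs one step.
        while pos < len(chunk):
            if self.state is State.LENGTH:
                self.remaining = chunk[pos]
                pos += 1
                self.current = bytearray()
                if self.remaining == 0:
                    self.records.append(b"")  # empty record, stay in LENGTH
                else:
                    self.state = State.PAYLOAD
            elif self.state is State.PAYLOAD:
                take = min(self.remaining, len(chunk) - pos)
                self.current += chunk[pos:pos + take]
                pos += take
                self.remaining -= take
                if self.remaining == 0:
                    self.records.append(bytes(self.current))
                    self.state = State.LENGTH


decoder = RecordDecoder()
for piece in (b"\x03ab", b"c\x02h", b"i"):  # records split across chunk boundaries
    decoder.feed(piece)
print(decoder.records)
```

The state variable here carries real meaning (where in the record the decoder stopped), whereas a flattened function uses its dispatcher value only to hide the original control flow. Telling the two apart requires reasoning about what the states represent, not just spotting the loop-and-switch pattern.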

Why this matters

The XZ backdoor was discovered because a human noticed a timing anomaly. Locating the same implant by static analysis of the stripped binary, with no symbols, no source, and no hint of what to look for, traditionally takes an experienced reverse engineer days of manual work. Kong reconstructed the full kill chain autonomously in 15 minutes. This suggests a path toward automated triage of suspected supply chain compromises: point Kong at a suspicious binary and get a structured assessment of what it does, including code that shouldn’t be there.
