What is Kong?

The Problem: Stripped Binaries

When developers compile source code into a binary, the compiler throws away almost everything that made the code readable.

Reverse engineering is the process of analyzing a compiled binary to understand what it does — without having access to the original source code. Security researchers, malware analysts, and CTF players do this regularly.

In the original source, a function might be called parse_http_header. It has descriptive parameter names like request and buffer_size. The structs have meaningful field names. There are comments explaining edge cases.

A stripped binary is an executable from which the compiler (or a post-processing tool like strip) has removed the symbol and debugging information: function names, variable names, type information, and struct layouts. What remains is raw machine code, which analysis tools such as Ghidra display with auto-generated labels like FUN_00401a30.

After stripping, that same function becomes FUN_00401a30. Its parameters are param_1 and param_2. The structs are flattened into raw pointer offsets. Every function in the binary looks like this: hundreds or thousands of them, with no indication of what any of them do.

Recovering that context is the bulk of the work in most reverse engineering tasks, and it is tedious. An experienced analyst might spend hours renaming functions, tracing data flow, and mentally reconstructing types. For a binary with a few hundred functions, that work can stretch into days.
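To get a feel for the scale of the problem, here is a minimal Python sketch that measures how much of a function listing still carries placeholder names. The FUN_ pattern follows the label format described above; the helper names (`is_auto_label`, `unnamed_fraction`) and the sample listing are hypothetical and are not part of Kong.

```python
import re

# Assumed pattern for auto-generated labels such as FUN_00401a30.
AUTO_LABEL = re.compile(r"^FUN_[0-9a-fA-F]+$")


def is_auto_label(name: str) -> bool:
    """Return True if the name looks like an auto-generated function label."""
    return AUTO_LABEL.match(name) is not None


def unnamed_fraction(names: list[str]) -> float:
    """Fraction of function names that are still placeholders."""
    if not names:
        return 0.0
    return sum(is_auto_label(n) for n in names) / len(names)


# Hypothetical listing from a partly analyzed binary.
functions = ["FUN_00401a30", "FUN_00401b10", "parse_http_header", "FUN_00402000"]
print(unnamed_fraction(functions))  # 0.75
```

In a fully stripped binary this fraction starts close to 1.0, which is the gap that manual renaming has to close.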

How Kong Solves It

A decompiler is a tool that converts machine code back into a higher-level representation (like C code). It is not perfect — the output is often messy, with meaningless variable names and lost type information — but it gives analysts something to read instead of raw assembly.

Kong combines Ghidra (the NSA’s reverse engineering framework) with large language models (Claude and GPT-4o) to automate symbol recovery. A single command runs the full pipeline:

1. Triage: enumerate every function, classify by complexity, build the call graph, and match known library signatures
2. Analysis: process functions bottom-up from the call graph, building rich context windows from Ghidra’s program database before sending each function to the LLM
3. Cleanup: normalize and deduplicate results
4. Synthesis: unify naming conventions across the entire binary and synthesize struct definitions
5. Export: write everything back to Ghidra and produce analysis.json
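The bottom-up ordering in the Analysis stage can be illustrated with a depth-first post-order walk over the call graph. This is a minimal sketch under stated assumptions, not Kong's actual implementation: the `bottom_up_order` function and the dictionary-based call graph are hypothetical.

```python
def bottom_up_order(call_graph: dict[str, list[str]]) -> list[str]:
    """Return functions so that callees come before their callers.

    Uses a depth-first post-order walk. Recursive call cycles are tolerated:
    a function that has already been visited is never walked again.
    """
    order: list[str] = []
    visited: set[str] = set()

    def visit(fn: str) -> None:
        if fn in visited:
            return
        visited.add(fn)
        for callee in call_graph.get(fn, []):
            visit(callee)
        # Appended only after all reachable callees have been appended.
        order.append(fn)

    for fn in call_graph:
        visit(fn)
    return order


# Hypothetical call graph: caller -> list of callees.
call_graph = {
    "FUN_00401000": ["FUN_00401a30", "FUN_00401b10"],
    "FUN_00401a30": ["FUN_00401b10"],
    "FUN_00401b10": [],
}
print(bottom_up_order(call_graph))
# ['FUN_00401b10', 'FUN_00401a30', 'FUN_00401000']
```

Leaf functions come first, so by the time a caller is analyzed, the functions it calls can already carry recovered names. A production version would likely use an iterative walk to avoid Python's recursion limit on deep call chains.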

The key insight: LLMs are good at pattern matching — recognizing standard library functions, inferring types from usage, propagating names through call graphs. But pointing an LLM at raw decompiler output in isolation gives mediocre results. Kong solves this by building rich context windows (cross-references, string references, caller/callee signatures, data flow) and analyzing functions in dependency order so each function benefits from its callees already being named.
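A context window of this kind might be assembled roughly as follows. The sketch is illustrative only: `FunctionInfo`, `build_context`, the field layout, and the recovered name `find_header_end` are all assumptions made for the example, not Kong's real data model or prompt format.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionInfo:
    """Hypothetical record of what the program database knows about a function."""
    label: str
    decompiled: str
    strings: list[str] = field(default_factory=list)
    callees: list[str] = field(default_factory=list)
    callers: list[str] = field(default_factory=list)


def build_context(fn: FunctionInfo, recovered: dict[str, str]) -> str:
    """Assemble a text context for one function.

    `recovered` maps auto-generated labels to names chosen earlier, so
    callees that were analyzed first appear under their readable names.
    """
    def show(label: str) -> str:
        return recovered.get(label, label)

    parts = [
        f"Function: {fn.label}",
        "Callees: " + ", ".join(show(c) for c in fn.callees),
        "Callers: " + ", ".join(show(c) for c in fn.callers),
        "Strings: " + ", ".join(repr(s) for s in fn.strings),
        "Decompiled code:",
        fn.decompiled,
    ]
    return "\n".join(parts)


# Hypothetical example data.
fn = FunctionInfo(
    label="FUN_00401a30",
    decompiled="int FUN_00401a30(char *param_1, int param_2) { ... }",
    strings=["Content-Length"],
    callees=["FUN_00401b10"],
    callers=["FUN_00401000"],
)
recovered = {"FUN_00401b10": "find_header_end"}
print(build_context(fn, recovered))
```

Because the callee was named in an earlier pass, the model sees `find_header_end` next to the string `Content-Length` rather than a second opaque label, which is the benefit of analyzing in dependency order.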

The Results

When Kong analyzed the XZ Utils backdoor — a real-world supply chain attack that made international news — it recovered function names, types, and structures from the fully stripped malicious binary in about 15 minutes for roughly $6.63 in API costs. That same analysis would take an experienced reverse engineer days of manual work. Kong transforms FUN_00401a30 into parse_http_header, recovers struct layouts, identifies cryptographic routines by signature, and writes everything back into Ghidra’s program database so you can continue your analysis with real names instead of auto-generated labels.

What’s Next?

Ready to try it? The Quickstart gets you from zero to your first analysis in about five minutes. Or if you want to understand the architecture first, start with the Pipeline Overview.
