Kong — the world's first AI reverse engineer

The Problem: Stripped Binaries

When developers compile source code into a binary, the compiler throws away almost everything that made the code readable.
Reverse engineering is the process of analyzing a compiled binary to understand what it does — without having access to the original source code. Security researchers, malware analysts, and CTF players do this regularly.
In the original source, a function might be called parse_http_header. It has descriptive parameter names like request and buffer_size. The structs have meaningful field names. There are comments explaining edge cases.
A stripped binary is an executable from which the symbol table and debugging information have been removed, either by the build configuration or by a post-processing tool like strip. Function names, variable names, type information, and struct layouts are gone. What remains is raw machine code, which tools like Ghidra display with auto-generated labels like FUN_00401a30.
After stripping, that same function becomes FUN_00401a30. Its parameters are param_1 and param_2. The structs are flattened into raw pointer offsets. Every function in the binary looks like this, hundreds or thousands of them, with no indication of what any of them do.
Recovering that context is the bulk of the work in most reverse engineering tasks, and it is tedious. An experienced analyst might spend hours renaming functions, tracing data flow, and mentally reconstructing types. For a binary with a few hundred functions, that work can stretch into days.
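To make the placeholder labels concrete, here is a minimal Python sketch that recognizes names of the FUN_<hex address> and param_<n> form and measures how much of a function list is still unnamed. The helper names (is_auto_generated, unnamed_ratio) and the sample list are illustrative assumptions, not part of Kong.

```python
import re

# Placeholder patterns as they appear in the examples above:
# FUN_<hex address> for functions, param_<n> for parameters.
AUTO_FUNCTION = re.compile(r"^FUN_[0-9a-fA-F]+$")
AUTO_PARAM = re.compile(r"^param_\d+$")


def is_auto_generated(name: str) -> bool:
    """Return True if the name looks like a decompiler placeholder."""
    return bool(AUTO_FUNCTION.match(name) or AUTO_PARAM.match(name))


def unnamed_ratio(function_names: list[str]) -> float:
    """Fraction of functions that still carry a placeholder label."""
    if not function_names:
        return 0.0
    unnamed = sum(1 for name in function_names if AUTO_FUNCTION.match(name))
    return unnamed / len(function_names)


# Hypothetical function list from a stripped binary.
names = ["FUN_00401a30", "FUN_00401b7c", "entry", "FUN_00402000"]
print(unnamed_ratio(names))  # 0.75
```

A ratio close to 1.0 is what a fully stripped binary typically looks like before any symbol recovery.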

How Kong Solves It

A decompiler is a tool that converts machine code back into a higher-level representation (like C code). It is not perfect — the output is often messy, with meaningless variable names and lost type information — but it gives analysts something to read instead of raw assembly.
Kong combines Ghidra (the NSA’s reverse engineering framework) with large language models (Claude and GPT-4o) to automate symbol recovery. A single command runs the full pipeline:
  1. Triage — enumerate every function, classify by complexity, build the call graph, and match known library signatures
  2. Analysis — process functions bottom-up from the call graph, building rich context windows from Ghidra’s program database before sending each function to the LLM
  3. Cleanup — normalize and deduplicate results
  4. Synthesis — unify naming conventions across the entire binary and synthesize struct definitions
  5. Export — write everything back to Ghidra and produce analysis.json
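The bottom-up ordering in the Analysis step can be sketched as a post-order traversal of the call graph, so that every callee is visited before its callers. This is a minimal illustration under stated assumptions: the dictionary representation of the call graph and the function name bottom_up_order are hypothetical, and Kong's actual scheduling may differ.

```python
def bottom_up_order(call_graph: dict[str, list[str]]) -> list[str]:
    """Return functions ordered so that callees come before their callers.

    call_graph maps each function label to the labels it calls.
    Cycles (direct or mutual recursion) are broken by never revisiting
    a function that has already been entered.
    """
    order: list[str] = []
    visited: set[str] = set()

    def visit(fn: str) -> None:
        if fn in visited:
            return
        visited.add(fn)
        for callee in call_graph.get(fn, []):
            visit(callee)
        order.append(fn)  # appended only after all of its callees

    # Sorted iteration keeps the result deterministic.
    for fn in sorted(call_graph):
        visit(fn)
    return order


# Hypothetical call graph, shown with readable names for clarity.
graph = {
    "main": ["parse_request", "log"],
    "parse_request": ["read_line", "log"],
    "read_line": [],
    "log": [],
}
print(bottom_up_order(graph))
# ['log', 'read_line', 'parse_request', 'main']
```

The hand-written traversal is used here instead of graphlib.TopologicalSorter because the latter raises CycleError on recursive call graphs, which are common in real binaries. For very deep call chains an iterative traversal would avoid Python's recursion limit.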
The key insight: LLMs are good at pattern matching — recognizing standard library functions, inferring types from usage, propagating names through call graphs. But pointing an LLM at raw decompiler output in isolation gives mediocre results. Kong solves this by building rich context windows (cross-references, string references, caller/callee signatures, data flow) and analyzing functions in dependency order so each function benefits from its callees already being named.
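As a rough illustration of why dependency order matters, the sketch below assembles a context string for one function and substitutes any callee names that were already recovered. The FunctionInfo fields, the build_context helper, and the output layout are all assumptions made for this example; they do not describe Kong's actual prompt format.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FunctionInfo:
    """Hypothetical per-function record gathered from the program database."""
    name: str                                          # current label
    decompiled: str                                    # decompiler output
    strings: list[str] = field(default_factory=list)   # referenced string literals
    callees: list[str] = field(default_factory=list)   # labels this function calls
    callers: list[str] = field(default_factory=list)   # labels that call it


def build_context(fn: FunctionInfo, resolved: dict[str, str]) -> str:
    """Assemble a context string, substituting names recovered so far."""

    def display(label: str) -> str:
        return resolved.get(label, label)

    code = fn.decompiled
    for label, name in resolved.items():
        # Word boundaries avoid rewriting a label that is a prefix of another.
        code = re.sub(rf"\b{re.escape(label)}\b", name, code)

    return "\n".join([
        f"Function: {fn.name}",
        "Calls: " + (", ".join(display(c) for c in fn.callees) or "(none)"),
        "Called by: " + (", ".join(display(c) for c in fn.callers) or "(none)"),
        "Strings: " + (", ".join(repr(s) for s in fn.strings) or "(none)"),
        "Decompiled code:",
        code,
    ])


# Hypothetical input: one callee has already been named in an earlier pass.
fn = FunctionInfo(
    name="FUN_00401a30",
    decompiled='int FUN_00401a30(char *param_1) { return FUN_00401b7c(param_1, "Content-Length"); }',
    strings=["Content-Length"],
    callees=["FUN_00401b7c"],
    callers=["FUN_00402000"],
)
resolved = {"FUN_00401b7c": "find_header_value"}
context = build_context(fn, resolved)
print(context)
```

With the callee already shown as find_header_value and the string "Content-Length" in view, a model has far more to go on than the bare decompiler output would give it.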

The Results

When Kong analyzed the XZ Utils backdoor, a real-world supply chain attack that made international news, it recovered function names, types, and structures from the fully stripped malicious binary in about 15 minutes for roughly $6.63 in API costs. That same analysis would take an experienced reverse engineer days of manual work.
Kong turns labels like FUN_00401a30 into names like parse_http_header, recovers struct layouts, identifies cryptographic routines by signature, and writes everything back into Ghidra’s program database so you can continue your analysis with real names instead of auto-generated labels.

What’s Next?

Ready to try it? The Quickstart gets you from zero to your first analysis in about five minutes. Or if you want to understand the architecture first, start with the Pipeline Overview.
Last modified on March 20, 2026