The single biggest factor in Kong’s analysis quality is not which LLM you use — it is what you put in the prompt. Kong builds a structured context window for every function before sending it to the model. This page explains what that window contains and why each piece matters.
The Naive Approach (and Why It Fails)
The simplest possible approach to LLM-assisted reverse engineering is to paste a function’s decompiled output directly into a prompt and ask “what does this do?” The model sees only raw decompilation with generic Ghidra names. It has no way to know that param_1 is a request buffer, that param_2 is a max length, or that FUN_00401b50 parses an HTTP header.
Kong fixes this by assembling rich context from Ghidra’s program database before the prompt is built.
What Goes Into a Context Window
Every context window includes up to six categories of information, assembled by the Analyzer._build_context method.
1. Normalized Decompilation
The target function’s decompiled C code, run through Kong’s syntactic normalizer to strip Ghidra artifacts and produce cleaner output.
2. Cross-References (Callers and Callees)
Kong fetches decompilation snippets (the first 10 lines) for both callers and callees of the target function:
- Callees (up to 5) — functions called by the target. If a callee has already been analyzed, its recovered name appears in the snippet.
- Callers (up to 3) — functions that call the target. These provide usage context.
For example, when analyzing FUN_00401a30, the callee FUN_00401b50 may already have been renamed to parse_http_header. Instead of an opaque call to FUN_00401b50(param_1, param_2), the model now knows the target function is calling an HTTP header parser. That single piece of context can determine the entire analysis.
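This snippet mechanism can be sketched in a few lines. The following is a hypothetical illustration, not Kong’s actual code: the snippet function, the renamed map, and the constants are names invented here to mirror the behavior described above (10-line truncation, recovered-name substitution).

```python
# Hypothetical sketch of callee/caller snippet gathering. Kong's real logic
# lives in Analyzer._build_context and may differ; these names are illustrative.

MAX_CALLEES = 5     # up to 5 callee snippets per prompt
MAX_CALLERS = 3     # up to 3 caller snippets per prompt
SNIPPET_LINES = 10  # only the first 10 lines of each decompilation

def snippet(decompiled: str, ghidra_name: str, renamed: dict[str, str]) -> str:
    """Truncate a decompilation to its first lines and substitute any
    recovered name for the default FUN_... identifier, so the LLM sees
    meaningful names instead of opaque addresses."""
    text = "\n".join(decompiled.splitlines()[:SNIPPET_LINES])
    if ghidra_name in renamed:
        text = text.replace(ghidra_name, renamed[ghidra_name])
    return text
```

With `renamed = {"FUN_00401b50": "parse_http_header"}`, a callee snippet beginning `int FUN_00401b50(char *buf, int n)` is rewritten to start with `int parse_http_header(...)` before it is placed in the prompt.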
3. String References
Kong resolves data cross-references from the function against Ghidra’s string table. A function that references "AES", "encrypt", and "key" is almost certainly cryptographic; a function that references "malloc failed" is doing memory allocation with error handling.
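Conceptually, this resolution is a lookup of each data cross-reference target in an address-keyed string table. The sketch below is illustrative only; Kong’s real implementation uses Ghidra’s program database rather than plain dicts.

```python
# Illustrative sketch (not Kong's actual code): resolve a function's data
# cross-references against a string table keyed by address.

def referenced_strings(xref_targets: list[int],
                       string_table: dict[int, str]) -> list[str]:
    """Keep only xref targets that land on a defined string; other data
    references (globals, jump tables, etc.) are skipped."""
    return [string_table[addr] for addr in xref_targets if addr in string_table]
```

A target that references both "AES" and "malloc failed" would contribute both strings to the prompt, and the model weighs them together.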
4. Already-Identified Functions
A map of all functions that Kong has already named in earlier analysis passes. A binary whose identified functions include init_connection, parse_request_line, and parse_http_header tells the model this is an HTTP server, which in turn influences how it interprets ambiguous functions.
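One plausible way to render that map as a prompt section is one line per function. The "old -> new" formatting here is an assumption for illustration, not Kong’s exact output.

```python
# Hypothetical rendering of the identified-function map as prompt text;
# the line format is an assumption, not Kong's actual output.

def format_identified(identified: dict[str, str]) -> str:
    """One line per function, sorted by the original Ghidra name so the
    section is stable across runs."""
    return "\n".join(f"{old} -> {new}" for old, new in sorted(identified.items()))
```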
5. Known Struct Types
Struct definitions recovered from earlier function analyses.
6. Binary Metadata
Architecture, format, and compiler information for the binary. A MIPS binary compiled with a different toolchain will have different idioms than an x86_64 ELF.
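A minimal sketch of such a metadata header follows. The field names (arch, fmt, compiler) are assumptions based on the categories listed above, not Kong’s actual schema.

```python
# Sketch of a binary-metadata prompt section; field names are assumptions.

def format_metadata(meta: dict[str, str]) -> str:
    """Render architecture, format, and compiler as labeled lines."""
    return (f"Architecture: {meta['arch']}\n"
            f"Format: {meta['fmt']}\n"
            f"Compiler: {meta['compiler']}")
```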
How the Prompt Is Assembled
The Analyzer._build_prompt method concatenates these sections in order:
- Binary metadata (arch, format, compiler)
- Target function header (name, address, size)
- Normalized decompilation
- Referenced strings
- Called functions (callee snippets)
- Calling functions (caller snippets)
- Already-identified function list
- Known struct types with field layouts
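The assembly order above can be sketched as a simple ordered concatenation. This is a minimal illustration, not Kong’s real Analyzer._build_prompt: the section headers and joining style are assumptions.

```python
# Minimal sketch of prompt assembly in the documented order; headers and
# separators are assumptions, not Kong's actual formatting.

SECTION_ORDER = [
    "Binary metadata",
    "Target function",
    "Decompilation",
    "Referenced strings",
    "Called functions",
    "Calling functions",
    "Identified functions",
    "Known struct types",
]

def build_prompt(sections: dict[str, str]) -> str:
    """Concatenate non-empty sections in the documented order, skipping
    any category with nothing to report."""
    parts = [f"## {name}\n{sections[name].strip()}"
             for name in SECTION_ORDER
             if sections.get(name, "").strip()]
    return "\n\n".join(parts)
```

Keeping a fixed order matters: the model always finds metadata first and cross-reference context near the decompilation it explains.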
Why This Matters
The difference between naive and context-enriched prompting is dramatic:

| Approach | What the LLM sees for a callee | Naming accuracy |
|---|---|---|
| Naive | FUN_00401b50(param_1, param_2) | Low — the model guesses based on structure alone |
| Kong | parse_http_header(request_buffer, max_length) | High — the model recognizes the calling pattern |

