The Problem: Lost Type Information
Stripped binaries have no type information. When the original source had a struct like this:
Phase 1: Struct Proposals
During the main analysis phase, each function is sent to the LLM with a rich context window. When the LLM sees a function accessing a pointer parameter at multiple offsets, it proposes a struct definition. A single function might produce a proposal like:
StructAccumulator collects these proposals as functions are analyzed.
Multiple functions often access the same struct. A send_response function might produce its own proposal with different fields at different offsets — but for the same underlying type. These overlapping proposals are the raw material for the merge phase.
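As a rough illustration of overlapping proposals, the sketch below models two functions proposing fields for the same struct. Only `StructAccumulator` and `used_by_param` appear in the text above; the `FieldProposal` and `StructProposal` classes, their remaining attributes, the `add` method, and the address `0x00401b10` are hypothetical names chosen for this example, not Kong's actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FieldProposal:
    offset: int     # byte offset from the start of the struct
    name: str       # field name suggested by the LLM
    type_name: str  # e.g. "int", "char *", "undefined8"


@dataclass
class StructProposal:
    name: str                    # struct name suggested by the LLM
    size: int                    # proposed total size in bytes
    fields: List[FieldProposal]
    source_function: str         # address of the function that produced it
    used_by_param: str           # parameter the struct pointer arrived through


@dataclass
class StructAccumulator:
    """Hypothetical stand-in: only stores proposals as they arrive."""
    proposals: List[StructProposal] = field(default_factory=list)

    def add(self, proposal: StructProposal) -> None:
        self.proposals.append(proposal)


acc = StructAccumulator()
# One function sees only the field at offset 0.
acc.add(StructProposal("ConnectionState", 16,
                       [FieldProposal(0, "socket_fd", "int")],
                       "0x00401a30", "param_1"))
# Another function touches different offsets of the same underlying type.
acc.add(StructProposal("ConnectionState", 24,
                       [FieldProposal(8, "buffer", "char *"),
                        FieldProposal(16, "field_0x10", "undefined8")],
                       "0x00401b10", "param_1"))
```

Neither proposal is complete on its own; together they cover offsets 0, 8, and 16.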
Phase 2: Merging Proposals
After all functions have been analyzed, the accumulator’s unify() method merges proposals that describe the same struct.
Grouping
Proposals are grouped by name. When the LLM independently names a struct ConnectionState in two different functions, those proposals land in the same group.
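Grouping by name can be sketched with a dictionary of lists. This is a minimal illustration that assumes exact string matching on the name; the dictionary keys, the `RequestHeader` name, and the second address are hypothetical, and Kong's own grouping may be more forgiving.

```python
from collections import defaultdict


def group_by_name(proposals):
    """Bucket proposals by exact struct name, preserving arrival order."""
    groups = defaultdict(list)
    for proposal in proposals:
        groups[proposal["name"]].append(proposal)
    return dict(groups)


proposals = [
    {"name": "ConnectionState", "function": "0x00401a30"},
    {"name": "RequestHeader", "function": "0x00401a30"},
    {"name": "ConnectionState", "function": "0x00401b10"},
]
groups = group_by_name(proposals)
```

Here the two `ConnectionState` proposals end up in one group even though they came from different functions.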
Merging Fields
Within a group, fields are merged by offset. When multiple proposals define a field at the same offset, the merge picks the best candidate using a scoring system:
- Type specificity wins. A field typed char * beats one typed undefined8. The generic types undefined, undefined4, and undefined8 score lowest.
- Name descriptiveness wins. A field named socket_fd beats one named field_0x0. Generic names like field, unk, undefined, and pad score lowest.
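The two rules above can be sketched as a small scoring function. This is an illustration under stated assumptions, not Kong's actual scoring: it uses a binary score per rule, treats type specificity as the primary key and name descriptiveness as the tie-breaker, and detects generic names by prefix. The function names and dictionary layout are hypothetical.

```python
GENERIC_TYPES = {"undefined", "undefined4", "undefined8"}
GENERIC_NAME_PREFIXES = ("field", "unk", "undefined", "pad")


def score(candidate):
    """Return (type_score, name_score); higher tuples win."""
    type_score = 0 if candidate["type"] in GENERIC_TYPES else 1
    # Prefix matching is a crude assumption: it also flags names like "padding".
    is_generic = candidate["name"].lower().startswith(GENERIC_NAME_PREFIXES)
    name_score = 0 if is_generic else 1
    return (type_score, name_score)


def merge_fields(field_lists):
    """Keep the best-scoring candidate at each offset, sorted by offset."""
    best = {}
    for fields in field_lists:
        for candidate in fields:
            current = best.get(candidate["offset"])
            # Strict comparison: on a tie, the earlier proposal is kept.
            if current is None or score(candidate) > score(current):
                best[candidate["offset"]] = candidate
    return [best[offset] for offset in sorted(best)]


first = [
    {"offset": 0, "name": "field_0x0", "type": "undefined4"},
    {"offset": 8, "name": "buffer", "type": "char *"},
]
second = [
    {"offset": 0, "name": "socket_fd", "type": "int"},
    {"offset": 8, "name": "field_0x8", "type": "undefined8"},
    {"offset": 16, "name": "length", "type": "uint"},
]
merged = merge_fields([first, second])
```

In this example the merged layout takes `socket_fd` from the second proposal, keeps `buffer` from the first, and adds `length`, which only the second proposal saw.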
Name Selection
The struct name that appears most frequently across proposals wins. If three proposals call it ConnectionState and one calls it ConnState, the merged struct is named ConnectionState.
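A frequency vote like this is a one-liner with `collections.Counter`. The helper name `select_name` is hypothetical, and the text does not say how ties are broken; in this sketch a tie goes to the name seen first, which is how `Counter.most_common` orders equal counts.

```python
from collections import Counter


def select_name(names):
    """Return the most frequent name; ties go to the earliest-seen name."""
    return Counter(names).most_common(1)[0][0]


winner = select_name(
    ["ConnectionState", "ConnState", "ConnectionState", "ConnectionState"]
)
```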
Size Calculation
The total struct size is the maximum of all proposed sizes and the end offset of the last merged field, ensuring no fields are lost to truncation.
Phase 3: Ghidra Type Creation
Once structs are unified, Kong creates them in Ghidra’s type system using create_struct. Each UnifiedStruct becomes a real Ghidra data type that Ghidra’s decompiler can use to improve its output.
Kong also applies struct types to function parameters. If a proposal was tagged with used_by_param: "param_1" from function 0x00401a30, Kong resolves which parameter ordinal param_1 corresponds to and sets its type to a pointer to the new struct.
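One simple way to resolve a name like `param_1` to an ordinal is to parse the numeric suffix. The sketch below assumes Ghidra's default `param_N` naming with zero-based ordinals; the helper `param_ordinal` is hypothetical, and Kong may instead look the name up in the function's actual parameter list, which would also handle renamed parameters.

```python
import re

# Matches Ghidra-style default parameter names such as "param_1".
_PARAM_RE = re.compile(r"^param_(\d+)$")


def param_ordinal(param_name):
    """Map "param_N" to a zero-based ordinal, or None if it does not match."""
    match = _PARAM_RE.match(param_name)
    if match is None:
        return None
    number = int(match.group(1))
    # "param_0" is not a valid default name, so reject it.
    return number - 1 if number >= 1 else None


ordinal = param_ordinal("param_1")
```

Returning `None` for unrecognized names lets the caller log and skip that parameter rather than retype the wrong one.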
Error Handling
Type creation in Ghidra can fail on name collisions, invalid sizes, and other edge cases. Kong handles these gracefully: if a struct fails to create, it logs a warning and continues with the remaining structs. If a parameter type application fails, it logs and moves on. No single type failure blocks the rest of the pipeline.
What Re-Analysis Gets You
The apply_unified_structs function returns a list of function addresses whose parameters were retyped. These functions are candidates for re-analysis: with the struct types now applied, Ghidra’s decompiler produces cleaner output with named field accesses instead of raw offsets, which in turn gives the LLM better input for a second pass.
Before type application:
Related
- Semantic Synthesis — the global unification pass that can synthesize additional structs from cross-function patterns
- Context Windows — how per-function context is built before LLM analysis
- Pipeline Overview — where type recovery fits in the overall analysis pipeline

