## Usage

`kong eval` compares Kong’s output against source code with known function names and signatures. It’s useful for benchmarking Kong’s accuracy on binaries where you have the original source.
## Example output
## Scoring indicators
| Indicator | Meaning |
|---|---|
| OK | Symbol accuracy ≥ 0.8 — good match |
| ~~ | Symbol accuracy > 0 but < 0.8 — partial match |
| NO | No match found |
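The thresholds above can be sketched as a small helper. This is illustrative only; the function name `indicator` is our own, not part of Kong:

```python
def indicator(score: float) -> str:
    """Map a symbol-accuracy score to the eval table indicator."""
    if score >= 0.8:
        return "OK"  # good match
    if score > 0.0:
        return "~~"  # partial match
    return "NO"      # no match found
```

Note that a score of exactly 0.8 (a superset match) still counts as `OK`.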
## Symbol accuracy

Kong uses recall-weighted word matching with synonym expansion to score function names. Names are split into word tokens (by underscores and camelCase), then scored:

| Score | Condition |
|---|---|
| 1.0 | Exact string match |
| 0.9 | Same word set, different order |
| 0.8 | All ground-truth words present (superset match, including synonyms) |
| recall × 0.7 | Partial overlap, scaled by how many truth words are covered |
| 0.0 | No overlap, even after synonym expansion |
Synonym groups include search/find/lookup, create/make/new/alloc, delete/remove, buffer/buf, and others, so `find_node` matched against `search_element` would get synonym credit.
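Putting the rules together, a minimal sketch of the name scorer under these assumptions: the synonym list is abridged, and Kong’s exact tokenization and tie-breaking may differ.

```python
import re

# Abridged synonym groups; Kong's real list is longer.
SYNONYMS = [
    {"search", "find", "lookup"},
    {"create", "make", "new", "alloc"},
    {"delete", "remove"},
    {"buffer", "buf"},
]

def words(name: str) -> list[str]:
    """Split a function name on underscores and camelCase boundaries."""
    parts = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower().split("_")
    return [p for p in parts if p]

def canon(word: str) -> str:
    """Collapse a word to a canonical representative of its synonym group."""
    for group in SYNONYMS:
        if word in group:
            return min(group)
    return word

def symbol_accuracy(predicted: str, truth: str) -> float:
    if predicted == truth:
        return 1.0  # exact string match
    pset = {canon(w) for w in words(predicted)}
    tset = {canon(w) for w in words(truth)}
    if not tset:
        return 0.0
    if pset == tset:
        return 0.9  # same word set, different order
    if tset <= pset:
        return 0.8  # all ground-truth words present (superset)
    recall = len(pset & tset) / len(tset)
    return 0.7 * recall  # partial overlap, scaled by truth-word coverage
```

For example, `find_node` against `search_element` shares only the canonicalized word `find` with the truth’s two words, so it scores 0.7 × 0.5 = 0.35 rather than 0.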
## Type accuracy

Type signatures are scored by weighted component matching:

| Component | Weight | How it’s scored |
|---|---|---|
| Return type | 40% | Exact match after normalizing Ghidra aliases (`undefined4` → `int`) |
| Parameter count | 30% | Exact match on number of parameters |
| Parameter types | 30% | Per-parameter type match, averaged |
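A sketch of the weighted combination, assuming a small alias table. The function name `type_accuracy` and the alias entries beyond `undefined4` → `int` are illustrative, not Kong’s actual code:

```python
# Abridged Ghidra alias table; only undefined4 -> int comes from the docs above.
GHIDRA_ALIASES = {"undefined4": "int", "undefined8": "long", "undefined1": "char"}

def norm(t: str) -> str:
    """Normalize a Ghidra type alias to its C equivalent."""
    return GHIDRA_ALIASES.get(t, t)

def type_accuracy(pred_ret: str, pred_params: list[str],
                  true_ret: str, true_params: list[str]) -> float:
    ret = 1.0 if norm(pred_ret) == norm(true_ret) else 0.0        # 40%
    count = 1.0 if len(pred_params) == len(true_params) else 0.0  # 30%
    if true_params:
        matches = sum(norm(a) == norm(b)
                      for a, b in zip(pred_params, true_params))
        params = matches / len(true_params)                        # 30%, averaged
    else:
        params = 1.0 if not pred_params else 0.0
    return 0.4 * ret + 0.3 * count + 0.3 * params
```

So a signature that gets the return type and parameter types right but predicts one extra parameter loses only the 30% count component.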
## Further reading

- CLI Reference: `kong eval` — full flag reference
- Interpreting Kong Output — understanding analysis results

