Usage

kong eval ./analysis.json ./source.c
kong eval compares Kong’s output against source code with known function names and signatures. It’s useful for benchmarking Kong’s accuracy on binaries where you have the original source.

Example output

Binary: binary
Functions: 156 analyzed / 200 in source
Symbol Accuracy: 87.3%
Type Accuracy: 72.1%

Per-Function Scores:
  OK parse_http_header             -> parse_http_header  sym=1.00  type=0.95
  OK handle_connection             -> handle_connection  sym=0.95  type=0.89
  ~~ process_data                  -> process_request    sym=0.60  type=0.45
  NO FUN_00401000                  -> (no match)         sym=0.00  type=0.00

Scoring indicators

| Indicator | Meaning |
|---|---|
| `OK` | Symbol accuracy ≥ 0.8 (good match) |
| `~~` | Symbol accuracy > 0 but < 0.8 (partial match) |
| `NO` | No match found |
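A minimal sketch of how these thresholds could map to the indicator column; the function name is hypothetical, but the cutoffs come from the table above.

```python
def indicator(sym_accuracy: float) -> str:
    """Map a symbol-accuracy score to a per-function indicator."""
    if sym_accuracy >= 0.8:
        return "OK"   # good match
    if sym_accuracy > 0.0:
        return "~~"   # partial match
    return "NO"       # no match found
```

For example, the `process_data` row above (sym=0.60) falls in the partial-match band and is printed with `~~`.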

Symbol accuracy

Kong uses recall-weighted word matching with synonym expansion to score function names. Names are split into word tokens (by underscores and camelCase), then scored:
| Score | Condition |
|---|---|
| 1.0 | Exact string match |
| 0.9 | Same word set, different order |
| 0.8 | All ground-truth words present (superset match, including synonyms) |
| recall × 0.7 | Partial overlap, scaled by how many truth words are covered |
| 0.0 | No overlap, even after synonym expansion |
Synonyms: Kong recognizes equivalent terms — search/find/lookup, create/make/new/alloc, delete/remove, buffer/buf, and others. So find_node matching against search_element would get synonym credit.
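A hedged sketch of these scoring rules. The synonym groups are the ones listed above; the helper names and the exact tokenization regex are assumptions, not Kong's internals.

```python
import re

# Synonym groups from the docs (abbreviated; Kong recognizes others).
SYNONYMS = [
    {"search", "find", "lookup"},
    {"create", "make", "new", "alloc"},
    {"delete", "remove"},
    {"buffer", "buf"},
]

def tokens(name: str) -> set:
    # Split on underscores and camelCase boundaries, lowercase the parts.
    parts = re.split(r"_|(?<=[a-z0-9])(?=[A-Z])", name)
    return {p.lower() for p in parts if p}

def expand(words: set) -> set:
    # Add every synonym of every word present.
    out = set(words)
    for group in SYNONYMS:
        if words & group:
            out |= group
    return out

def symbol_score(predicted: str, truth: str) -> float:
    if predicted == truth:
        return 1.0                    # exact string match
    pred, tru = tokens(predicted), tokens(truth)
    if pred == tru:
        return 0.9                    # same word set, different order
    expanded = expand(pred)
    if tru <= expanded:
        return 0.8                    # all truth words covered (superset)
    recall = len(tru & expanded) / len(tru)
    return recall * 0.7               # partial overlap (0.0 if no overlap)
```

Under this sketch, `find_node` against `search_element` gets synonym credit for `find`/`search` (recall 0.5, score 0.35) but nothing for `node`/`element`.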

Type accuracy

Type signatures are scored by weighted component matching:
| Component | Weight | How it's scored |
|---|---|---|
| Return type | 40% | Exact match after normalizing Ghidra aliases (`undefined4` → `int`) |
| Parameter count | 30% | Exact match on number of parameters |
| Parameter types | 30% | Per-parameter type match, averaged |
Parameter names are not compared — only types. This reflects that parameter naming is subjective, while types are structural.
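A sketch of the weighted combination under stated assumptions: the alias table is abbreviated to the one example given above, the helper names are invented, and averaging over the ground-truth parameter list is a guess at the exact denominator.

```python
# Abbreviated Ghidra alias table (only the example from the docs).
ALIASES = {"undefined4": "int"}

def normalize(type_name: str) -> str:
    return ALIASES.get(type_name, type_name)

def type_score(pred_ret, truth_ret, pred_params, truth_params) -> float:
    score = 0.0
    # Return type: 40%, exact match after alias normalization.
    if normalize(pred_ret) == normalize(truth_ret):
        score += 0.4
    # Parameter count: 30%, exact match on arity.
    if len(pred_params) == len(truth_params):
        score += 0.3
    # Parameter types: 30%, per-parameter match averaged over truth params.
    if truth_params:
        matched = sum(
            normalize(p) == normalize(t)
            for p, t in zip(pred_params, truth_params)
        )
        score += 0.3 * matched / len(truth_params)
    elif not pred_params:
        score += 0.3          # both signatures take no parameters
    return score
```

So a function recovered as `undefined4 f(char *)` against ground truth `int f(char *)` would score 1.0 after normalization, while a wrong return type alone caps the score at 0.6.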

Last modified on March 20, 2026