Kong supports three LLM providers out of the box:
| Provider | Default Model | Setup |
|---|---|---|
| Anthropic | claude-opus-4-6 | Set ANTHROPIC_API_KEY env var |
| OpenAI | gpt-4o | Set OPENAI_API_KEY env var |
| Custom | User-specified | Configure via kong setup or --base-url flag |

Anthropic (Claude)

Default model: claude-opus-4-6

Setup:

export ANTHROPIC_API_KEY="sk-ant-..."

Get your key at console.anthropic.com.

Available models:

| Model | Best for |
|---|---|
| claude-opus-4-6 | Highest quality analysis (default) |
| claude-sonnet-4-6 | Balanced quality and cost |
| claude-haiku-4-5-20251001 | Fastest, lowest cost |
Pricing (per 1M tokens):

| Model | Input | Output | Cache Write | Cache Read |
|---|---|---|---|---|
| claude-opus-4-6 | $5.00 | $25.00 | $6.25 | $0.50 |
| claude-sonnet-4-6 | $3.00 | $15.00 | $3.75 | $0.30 |
| claude-haiku-4-5 | $1.00 | $5.00 | $1.25 | $0.10 |
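
As a worked example of the arithmetic behind the pricing table, a short sketch (rates are the claude-sonnet-4-6 numbers above; the token counts are hypothetical):

```python
# Estimate the cost of one request at the claude-sonnet-4-6 rates above.
# Prices are USD per 1M tokens; token counts below are illustrative only.
INPUT_RATE, OUTPUT_RATE = 3.00, 15.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a single uncached request."""
    return (input_tokens / 1_000_000) * INPUT_RATE + \
           (output_tokens / 1_000_000) * OUTPUT_RATE

# e.g. 200k input tokens and 8k output tokens:
print(f"${estimate_cost(200_000, 8_000):.2f}")  # → $0.72
```

Cache writes and reads are billed at their own per-token rates, so a real estimate would add those terms in the same way.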

OpenAI (GPT-4o)

Default model: gpt-4o

Setup:

export OPENAI_API_KEY="sk-..."

Get your key at platform.openai.com.

Available models:

| Model | Best for |
|---|---|
| gpt-4o | High quality analysis (default) |
| gpt-4o-mini | Fast, low cost |
| o1 | Complex reasoning (slower, higher cost) |
| o3-mini | Advanced reasoning (balanced) |
Pricing (per 1M tokens):

| Model | Input | Output | Cached Input |
|---|---|---|---|
| gpt-4o | $2.50 | $10.00 | $1.25 |
| gpt-4o-mini | $0.15 | $0.60 | $0.075 |
| o1 | $15.00 | $60.00 | $7.50 |
| o3-mini | $1.10 | $4.40 | $0.55 |
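
Cached input is billed at half the fresh-input rate for gpt-4o, which changes the arithmetic on repeated prompts. A sketch using the rates above (the cache-hit fraction is hypothetical):

```python
# Compare gpt-4o input cost with and without prompt caching.
# Rates are USD per 1M input tokens, from the pricing table above.
INPUT_RATE, CACHED_RATE = 2.50, 1.25

def input_cost(tokens: int, cached_fraction: float = 0.0) -> float:
    """Input-token cost when `cached_fraction` of tokens hit the cache."""
    cached = tokens * cached_fraction
    fresh = tokens - cached
    return (fresh * INPUT_RATE + cached * CACHED_RATE) / 1_000_000

print(input_cost(1_000_000))       # no cache hits → 2.5
print(input_cost(1_000_000, 0.8))  # 80% of tokens cached → 1.5
```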

Smart routing

If you’ve configured multiple providers via kong setup, Kong uses your default provider. Override at runtime:
kong analyze ./binary --provider openai --model gpt-4o-mini
If your default provider’s API key isn’t set, Kong falls back to any other configured provider that has a valid key.

Rate limiting

Kong uses automatic token-bucket rate limiting to stay within API rate limits. No configuration needed — it proactively sleeps between requests when approaching the limit. This prevents 429 Too Many Requests errors during large analyses.
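
For readers unfamiliar with the technique, a minimal token-bucket limiter looks like this. It is an illustrative sketch of the general algorithm, not Kong's internals (which, as noted, need no configuration); the rate and capacity values are arbitrary:

```python
import time

class TokenBucket:
    """Minimal token bucket: refills at `rate` tokens/sec, bursts up to
    `capacity`. Callers sleep proactively instead of triggering 429s."""

    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def acquire(self, cost: float = 1.0) -> None:
        """Block until `cost` tokens are available, then spend them."""
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= cost:
                self.tokens -= cost
                return
            # Sleep just long enough for the deficit to refill.
            time.sleep((cost - self.tokens) / self.rate)

bucket = TokenBucket(rate=50, capacity=5)
for i in range(8):
    bucket.acquire()  # first 5 pass immediately, the rest are paced
```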

Last modified on March 20, 2026