Documentation Index
Fetch the complete documentation index at: https://docs.kong.fyi/llms.txt
Use this file to discover all available pages before exploring further.
Kong supports three LLM providers out of the box:
| Provider | Default Model | Setup |
|---|---|---|
| Anthropic | claude-opus-4-6 | Set ANTHROPIC_API_KEY env var |
| OpenAI | gpt-4o | Set OPENAI_API_KEY env var |
| Custom | User-specified | Configure via kong setup or --base-url flag |
Anthropic (Claude)
Default model: claude-opus-4-6
Setup:
```sh
export ANTHROPIC_API_KEY="sk-ant-..."
```
Get your key at console.anthropic.com.
Available models:
| Model | Best for |
|---|---|
| claude-opus-4-6 | Highest quality analysis (default) |
| claude-sonnet-4-6 | Balanced quality and cost |
| claude-haiku-4-5-20251001 | Fastest, lowest cost |
Pricing (per 1M tokens):
| Model | Input | Output | Cache Write | Cache Read |
|---|---|---|---|---|
| claude-opus-4-6 | $5.00 | $25.00 | $6.25 | $0.50 |
| claude-sonnet-4-6 | $3.00 | $15.00 | $3.75 | $0.30 |
| claude-haiku-4-5 | $1.00 | $5.00 | $1.25 | $0.10 |
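As a sanity check, the cost of a single request can be estimated from the rates above. A minimal sketch (the prices are copied from the table; the function itself is illustrative and not part of Kong):

```python
# Illustrative cost estimator using the Anthropic rates above (USD per 1M tokens).
# Not part of Kong's API; purely for back-of-the-envelope estimates.
ANTHROPIC_PRICING = {
    # model: (input, output, cache_write, cache_read)
    "claude-opus-4-6": (5.00, 25.00, 6.25, 0.50),
    "claude-sonnet-4-6": (3.00, 15.00, 3.75, 0.30),
    "claude-haiku-4-5": (1.00, 5.00, 1.25, 0.10),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cache_write_tokens: int = 0, cache_read_tokens: int = 0) -> float:
    """Return the estimated request cost in USD."""
    inp, out, cw, cr = ANTHROPIC_PRICING[model]
    return (input_tokens * inp + output_tokens * out
            + cache_write_tokens * cw + cache_read_tokens * cr) / 1_000_000
```

For example, a claude-sonnet-4-6 request with 100k input tokens and 10k output tokens comes to $0.30 + $0.15 = $0.45.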
OpenAI (GPT-4o)
Default model: gpt-4o
Setup:
```sh
export OPENAI_API_KEY="sk-..."
```
Get your key at platform.openai.com.
Available models:
| Model | Best for |
|---|---|
| gpt-4o | High quality analysis (default) |
| gpt-4o-mini | Fast, low cost |
| o1 | Complex reasoning (slower, higher cost) |
| o3-mini | Advanced reasoning (balanced) |
Pricing (per 1M tokens):
| Model | Input | Output | Cached Input |
|---|---|---|---|
| gpt-4o | $2.50 | $10.00 | $1.25 |
| gpt-4o-mini | $0.15 | $0.60 | $0.075 |
| o1 | $15.00 | $60.00 | $7.50 |
| o3-mini | $1.10 | $4.40 | $0.55 |
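Note that for all four models, cached input is billed at half the normal input rate, so repeated prompts cut prompt cost substantially. A minimal sketch of the comparison (prices copied from the table; the helper is illustrative, not part of Kong):

```python
# Illustrative: compare normal vs. cached input cost using the OpenAI rates
# above (USD per 1M tokens). Not part of Kong itself.
OPENAI_PRICING = {
    # model: (input, output, cached_input)
    "gpt-4o": (2.50, 10.00, 1.25),
    "gpt-4o-mini": (0.15, 0.60, 0.075),
    "o1": (15.00, 60.00, 7.50),
    "o3-mini": (1.10, 4.40, 0.55),
}

def input_cost(model: str, tokens: int, cached: bool = False) -> float:
    """USD cost of the prompt tokens, at the cached rate when `cached` is True."""
    inp, _out, cached_inp = OPENAI_PRICING[model]
    return tokens * (cached_inp if cached else inp) / 1_000_000
```

For example, 1M input tokens to gpt-4o cost $2.50 uncached but $1.25 when served from the prompt cache.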
Smart routing
If you’ve configured multiple providers via `kong setup`, Kong uses your default provider. Override at runtime:

```sh
kong analyze ./binary --provider openai --model gpt-4o-mini
```
If your default provider’s API key isn’t set, Kong falls back to any other configured provider that has a valid key.
Rate limiting
Kong applies automatic token-bucket rate limiting to stay within each provider’s API limits. No configuration is needed: Kong proactively sleeps between requests as it approaches the limit, preventing 429 Too Many Requests errors during large analyses.
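The token-bucket technique works like this: a bucket refills at a fixed rate up to a capacity, and each request spends tokens, sleeping if the bucket is empty. A minimal self-contained sketch (this is the general algorithm, not Kong's internal implementation):

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter: `rate` tokens refill per second, up to
    `capacity`. acquire() blocks until the requested tokens are available."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity          # start with a full bucket
        self.last = time.monotonic()

    def acquire(self, tokens: float = 1.0) -> None:
        while True:
            now = time.monotonic()
            # Refill in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= tokens:
                self.tokens -= tokens
                return
            # Proactively sleep until enough tokens have refilled.
            time.sleep((tokens - self.tokens) / self.rate)
```

Calling `acquire()` before every API request makes bursts draw down the bucket first and then throttles to the steady refill rate, which is what keeps a large analysis under the provider's limit without any 429 retries.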
Further reading