| Provider | Default Model | Setup |
|---|---|---|
| Anthropic | claude-opus-4-6 | Set ANTHROPIC_API_KEY env var |
| OpenAI | gpt-4o | Set OPENAI_API_KEY env var |
| Custom | User-specified | Configure via kong setup or --base-url flag |
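
Before running an analysis, it can help to confirm which API keys are visible to the process. The snippet below is a minimal preflight sketch, not part of Kong: the environment variable names come from the table above, while `PROVIDER_ENV_VARS` and `configured_providers` are hypothetical names chosen for illustration.

```python
import os

# Variable names are taken from the provider table; everything else here
# is a hypothetical helper and not part of Kong itself.
PROVIDER_ENV_VARS = {
    "Anthropic": "ANTHROPIC_API_KEY",
    "OpenAI": "OPENAI_API_KEY",
}


def configured_providers(env=None):
    """Return the providers whose API key variable is set and non-empty."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    found = configured_providers()
    print("Configured providers:", ", ".join(found) or "none")
```

The check only looks at environment variables, so a provider configured through the setup wizard or a custom base URL would not show up here.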

Anthropic (Claude)

Default model: claude-opus-4-6

Setup: set the ANTHROPIC_API_KEY environment variable.

| Model | Best for |
|---|---|
| claude-opus-4-6 | Highest quality analysis (default) |
| claude-sonnet-4-6 | Balanced quality and cost |
| claude-haiku-4-5-20251001 | Fastest, lowest cost |

Pricing, in USD per million tokens:

| Model | Input | Output | Cache Write | Cache Read |
|---|---|---|---|---|
| claude-opus-4-6 | $5.00 | $25.00 | $6.25 | $0.50 |
| claude-sonnet-4-6 | $3.00 | $15.00 | $3.75 | $0.30 |
| claude-haiku-4-5 | $1.00 | $5.00 | $1.25 | $0.10 |
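
To make the table concrete, here is a rough cost estimate worked through in code. It assumes the rates are USD per million tokens, which is how these rates are normally published. The `estimate_anthropic_cost` helper and the token counts in the example are illustrative and not something Kong exposes.

```python
# Rates copied from the pricing table; assumed to be USD per million tokens.
ANTHROPIC_PRICES = {
    "claude-opus-4-6": {"input": 5.00, "output": 25.00, "cache_write": 6.25, "cache_read": 0.50},
    "claude-sonnet-4-6": {"input": 3.00, "output": 15.00, "cache_write": 3.75, "cache_read": 0.30},
    "claude-haiku-4-5": {"input": 1.00, "output": 5.00, "cache_write": 1.25, "cache_read": 0.10},
}


def estimate_anthropic_cost(model, input_tokens=0, output_tokens=0,
                            cache_write_tokens=0, cache_read_tokens=0):
    """Estimate the USD cost of a run from its token counts (hypothetical helper)."""
    price = ANTHROPIC_PRICES[model]
    total = (
        input_tokens * price["input"]
        + output_tokens * price["output"]
        + cache_write_tokens * price["cache_write"]
        + cache_read_tokens * price["cache_read"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 200k uncached input, 50k output, 100k tokens served from cache.
    cost = estimate_anthropic_cost(
        "claude-opus-4-6",
        input_tokens=200_000,
        output_tokens=50_000,
        cache_read_tokens=100_000,
    )
    print(f"${cost:.2f}")  # $2.30
```

In this example the 100k cached tokens add $0.05, where the same tokens billed as regular input would add $0.50.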

OpenAI (GPT-4o)

Default model: gpt-4o

Setup: set the OPENAI_API_KEY environment variable.

| Model | Best for |
|---|---|
| gpt-4o | High quality analysis (default) |
| gpt-4o-mini | Fast, low cost |
| o1 | Complex reasoning (slower, higher cost) |
| o3-mini | Advanced reasoning (balanced) |

Pricing, in USD per million tokens:

| Model | Input | Output | Cached Input |
|---|---|---|---|
| gpt-4o | $2.50 | $10.00 | $1.25 |
| gpt-4o-mini | $0.15 | $0.60 | $0.075 |
| o1 | $15.00 | $60.00 | $7.50 |
| o3-mini | $1.10 | $4.40 | $0.55 |
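
The effect of the cached-input rate is easiest to see with a worked example. The sketch below assumes the rates are USD per million tokens and that some fraction of the input is billed at the cached rate. `estimate_openai_cost` and its `cached_fraction` parameter are hypothetical names for illustration, not Kong options.

```python
# Rates copied from the pricing table; assumed to be USD per million tokens.
OPENAI_PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00, "cached_input": 1.25},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60, "cached_input": 0.075},
    "o1": {"input": 15.00, "output": 60.00, "cached_input": 7.50},
    "o3-mini": {"input": 1.10, "output": 4.40, "cached_input": 0.55},
}


def estimate_openai_cost(model, input_tokens, output_tokens, cached_fraction=0.0):
    """Estimate USD cost when `cached_fraction` of the input hits the cache."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    price = OPENAI_PRICES[model]
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    total = (
        fresh * price["input"]
        + cached * price["cached_input"]
        + output_tokens * price["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    for fraction in (0.0, 0.5):
        cost = estimate_openai_cost("gpt-4o", 1_000_000, 100_000, cached_fraction=fraction)
        print(f"cached={fraction:.0%}: ${cost:.3f}")
```

For this workload the estimate drops from $3.50 with no cache hits to $2.875 when half the input is cached.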

Smart routing

If you’ve configured multiple providers via kong setup, Kong uses your default provider. Override at runtime:

Rate limiting

Kong applies automatic token-bucket rate limiting to stay within API rate limits. No configuration is needed: it proactively sleeps between requests when approaching the limit, which prevents 429 Too Many Requests errors during large analyses.
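
For readers unfamiliar with the technique, the following is a minimal sketch of a generic token bucket that sleeps when the bucket runs dry. It illustrates the general algorithm only; Kong's actual implementation and limits are not shown in this document, and the `TokenBucket` class, its `rate` and `capacity` parameters, and the example values are all assumptions.

```python
import time


class TokenBucket:
    """Generic token bucket: holds up to `capacity` tokens, refilled at `rate` per second."""

    def __init__(self, rate, capacity, clock=time.monotonic, sleep=time.sleep):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start with a full bucket
        self._clock = clock
        self._sleep = sleep
        self._last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last refill, capped at capacity.
        now = self._clock()
        self.tokens = min(self.capacity, self.tokens + (now - self._last) * self.rate)
        self._last = now

    def acquire(self, cost=1.0):
        """Block until `cost` tokens are available, spend them, and return seconds slept."""
        if cost > self.capacity:
            raise ValueError("cost exceeds bucket capacity")
        waited = 0.0
        self._refill()
        while self.tokens < cost:
            # Sleep just long enough for the missing tokens to accumulate.
            delay = (cost - self.tokens) / self.rate
            self._sleep(delay)
            waited += delay
            self._refill()
        self.tokens -= cost
        return waited


if __name__ == "__main__":
    bucket = TokenBucket(rate=2, capacity=2)  # hypothetical limit: 2 requests/second
    for i in range(4):
        slept = bucket.acquire()
        print(f"request {i}: slept {slept:.2f}s")
```

Sleeping before a request is sent, rather than retrying after a rejection, is what keeps the client under the limit in the first place. The injectable `clock` and `sleep` arguments exist only to make the sketch testable without real delays.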

Further reading

- Custom Endpoints — use local or third-party models
- LLM Models & Pricing — full pricing and limits reference
- Setup Wizard — configure providers interactively

