LLM config (LiteLLM)
KubeClaw deploys a LiteLLM proxy by default and injects OPENAI_API_BASE into the Gateway container so all LLM SDK calls route through the proxy. This gives you per-agent virtual keys, budget caps, model fallback routing, and semantic caching behind a single endpoint.
Core settings
| Key | Default | Description |
|---|---|---|
| `litellm.enabled` | `true` | Deploy LiteLLM proxy |
| `litellm.image.tag` | `main-v1.61.0` | LiteLLM container image tag |
| `litellm.masterkey` | `""` | Required. Must start with `sk-` |
| `litellm.masterkeySecretName` | `""` | Reference an existing Secret instead |
| `litellm.masterkeySecretKey` | `masterkey` | Key within the referenced Secret |
| `litellm.replicaCount` | `1` | Number of proxy replicas |
| `litellm.environmentSecrets` | `[kubeclaw-litellm-env]` | Secrets mounted as env vars on the proxy pod |
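To keep the master key out of the values file, the two Secret-related keys from the table can be combined. The fragment below is a minimal sketch, assuming a Secret named `litellm-master` (a hypothetical name, not something the chart creates) already exists in the release namespace and stores the key under `masterkey`:

```yaml
litellm:
  enabled: true
  # Hypothetical Secret name; create it yourself before installing the chart.
  masterkeySecretName: "litellm-master"
  # Key inside that Secret that holds the master key (chart default).
  masterkeySecretKey: "masterkey"
```

The value stored in the Secret presumably has to follow the same rule as `litellm.masterkey` and start with `sk-`.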
Proxy config
The litellm.proxy_config value maps directly to LiteLLM's config.yaml. This is where you define models, routing, and general settings.
Default proxy_config:
```yaml
litellm:
  proxy_config:
    model_list:
      - model_name: "gpt-4o"
        litellm_params:
          model: "gpt-4o"
          api_key: "os.environ/OPENAI_API_KEY"
    litellm_settings:
      drop_params: true
    router_settings:
      routing_strategy: "simple-shuffle"
      num_retries: 2
      timeout: 120
    general_settings:
      master_key: "os.environ/PROXY_MASTER_KEY"
```
Adding models
Add more models by extending model_list. Provider API keys from secret.data are forwarded to the proxy pod via environmentSecrets.
```yaml
litellm:
  proxy_config:
    model_list:
      - model_name: "gpt-4o"
        litellm_params:
          model: "gpt-4o"
          api_key: "os.environ/OPENAI_API_KEY"
      - model_name: "claude-sonnet"
        litellm_params:
          model: "anthropic/claude-sonnet-4-20250514"
          api_key: "os.environ/ANTHROPIC_API_KEY"
```
Then add the key to your secret:
```yaml
secret:
  data:
    ANTHROPIC_API_KEY: "sk-ant-..."
```
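Once two models are defined, a fallback from one to the other could be expressed in the same `proxy_config`. This is a sketch only: it assumes the chart passes `litellm_settings` through to LiteLLM unchanged and relies on LiteLLM's `fallbacks` setting, so check the placement against the LiteLLM version you deploy.

```yaml
litellm:
  proxy_config:
    litellm_settings:
      drop_params: true
      # If a request to gpt-4o fails, retry it on claude-sonnet.
      # Both names must match model_name entries in model_list.
      fallbacks:
        - gpt-4o: ["claude-sonnet"]
```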
Routing strategies
LiteLLM supports several routing strategies via router_settings.routing_strategy:
- `simple-shuffle` (default): random selection across healthy endpoints
- `least-busy`: routes to the model instance with the fewest in-flight requests
- `latency-based-routing`: picks the fastest responding endpoint
- `cost-based-routing`: picks the cheapest available model
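A routing strategy only has an effect when several deployments share the same `model_name`. The fragment below sketches that setup with `least-busy`; the second environment variable, `OPENAI_API_KEY_SECONDARY`, is a hypothetical name used for illustration and would have to be added to `secret.data` yourself.

```yaml
litellm:
  proxy_config:
    model_list:
      # Two deployments under one model_name form the pool the router picks from.
      - model_name: "gpt-4o"
        litellm_params:
          model: "gpt-4o"
          api_key: "os.environ/OPENAI_API_KEY"
      - model_name: "gpt-4o"
        litellm_params:
          model: "gpt-4o"
          # Hypothetical second key, not defined by the chart.
          api_key: "os.environ/OPENAI_API_KEY_SECONDARY"
    router_settings:
      # Send each request to the deployment with the fewest in-flight requests.
      routing_strategy: "least-busy"
      num_retries: 2
      timeout: 120
```

Clients keep requesting `gpt-4o`; the choice between the two deployments happens inside the proxy.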
Subcharts
| Key | Default | Description |
|---|---|---|
| `litellm.db.deployStandalone` | `false` | Deploy PostgreSQL for virtual keys / budget tracking |
| `litellm.redis.enabled` | `true` | Deploy Redis for semantic caching |
| `litellm.migrationJob.enabled` | `false` | Database migration job (requires PostgreSQL) |
Enable PostgreSQL when you need virtual keys, spend tracking, or team-based access controls. Redis is enabled by default for response caching.
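As an illustration, a values fragment that turns on the bundled PostgreSQL might look like the following. It uses only the keys from the table above; enabling the migration job alongside the database is an assumption about a typical setup, not a documented requirement.

```yaml
litellm:
  db:
    # PostgreSQL for virtual keys, spend tracking, and team-based access controls.
    deployStandalone: true
  migrationJob:
    # Requires PostgreSQL; assumed here to be wanted whenever the database is deployed.
    enabled: true
  redis:
    # Chart default, repeated here for completeness.
    enabled: true
```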