## Configuration commands

```sh
term-llm config
term-llm config edit
term-llm config path
term-llm config get default_provider
term-llm config set default_provider zen
term-llm config reset
```
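
For example, switching the global default and reading it back (the printed output is illustrative):

```sh
term-llm config set default_provider zen
term-llm config get default_provider
# zen
```
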
The main config file lives at:

```
~/.config/term-llm/config.yaml
```

## Configuration shape

A typical config has a few major parts:

- `default_provider` for the global LLM default
- `providers` for model-specific credentials and routing
- per-command blocks such as `exec`, `ask`, and `edit`
- feature-specific blocks such as `image`, `audio`, `music`, `embed`, `search`, `sessions`, `tools`, and `skills`

### Example

```yaml
default_provider: anthropic

providers:
  anthropic:
    model: claude-sonnet-4-6
  openai:
    model: gpt-5.2
    credentials: codex
    # WebSocket transport is enabled by default for built-in OpenAI.
    # Set false to force HTTP/SSE.
    use_websocket: true
  xai:
    model: grok-4-1-fast
  claude-bin:
    model: opus
    env:
      IS_SANDBOX: "1"
  openrouter:
    model: x-ai/grok-code-fast-1
    app_url: https://github.com/samsaffron/term-llm
    app_title: term-llm

exec:
  suggestions: 3
  instructions: |
    I use Arch Linux with zsh.
    I prefer ripgrep over grep and fd over find.

ask:
  max_turns: 50
  instructions: |
    Be concise. I'm an experienced developer.

chat:
  max_turns: 200

edit:
  model: gpt-5.2-codex
  diff_format: auto

search:
  provider: duckduckgo

tools:
  max_tool_output_chars: 20000
```

## Per-command overrides

Each command can override provider and model independently of the global default.

```yaml
default_provider: anthropic

providers:
  anthropic:
    model: claude-sonnet-4-6
  openai:
    model: gpt-5.2
  zen:
    model: glm-4.7-free

exec:
  provider: zen
  model: glm-4.7-free

ask:
  model: claude-opus-4

edit:
  provider: openai
  model: gpt-5.2-codex
```

Precedence, from highest to lowest:

1. CLI flags such as `--provider openai:gpt-5.2`
2. Per-command config such as `exec.provider` or `ask.model`
3. Global provider selection via `default_provider` and `providers.<name>.model`
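
For example, a one-off override from the command line (the prompt text is illustrative):

```sh
# Wins over ask.model and the global default for this invocation
term-llm ask --provider openai:gpt-5.2 "explain this stack trace"
```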

## Agentic turn limits

Agentic commands can make multiple provider calls while they execute tools and feed results back to the model. `max_turns` caps that loop.

Defaults:

- `ask.max_turns`: 50
- `exec` CLI flag default: 50
- `chat.max_turns`: 200
- Agent YAML `max_turns` overrides command/config defaults when an agent is selected.
- A CLI `--max-turns N` flag overrides both config and agent YAML.

```yaml
ask:
  max_turns: 50

chat:
  max_turns: 200
```
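
A one-off CLI override takes precedence over both config and agent YAML (the prompt is illustrative):

```sh
term-llm ask --max-turns 10 "summarize the failing tests"
```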

## Parallel tool execution

Models may request many independent tool calls in a single turn, such as several `read_file`, `grep`, or `glob` calls. term-llm executes independent tool calls concurrently when parallel tool calls are enabled by the provider/request, but caps one model turn at 20 concurrently running tool calls. Additional tool calls from the same turn are queued and run as earlier calls finish.

This is a built-in safety limit rather than a config option today. It preserves useful batching while preventing a single response from spawning an unbounded number of shells, greps, reads, or subagents at once.

## Sessions config

```yaml
sessions:
  enabled: true
  max_age_days: 0
  max_count: 0
  path: ""
```

Use this to control whether sessions are persisted, how long they are kept, and where the SQLite database lives.
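
For example, a retention policy might look like this (the values and path are illustrative):

```yaml
sessions:
  enabled: true
  max_age_days: 30          # prune sessions older than 30 days
  max_count: 500            # keep at most 500 sessions
  path: ~/data/term-llm.db  # custom SQLite database location
```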

## Search config

```yaml
search:
  provider: perplexity
  force_external: false
  perplexity:
    api_key: ${PERPLEXITY_API_KEY}
  exa:
    api_key: ${EXA_API_KEY}
  brave:
    api_key: ${BRAVE_API_KEY}
```

Search is large enough to deserve its own page; see Search.

## Image, audio, music, transcription, and embedding config

```yaml
image:
  provider: gemini
  output_dir: ~/Pictures/term-llm

audio:
  provider: venice
  output_dir: ~/Music/term-llm
  venice:
    api_key: ${VENICE_API_KEY}
    model: tts-kokoro
    voice: af_sky
    format: mp3

music:
  provider: venice
  output_dir: ~/Music/term-llm
  venice:
    api_key: ${VENICE_API_KEY}
    model: elevenlabs-sound-effects-v2
    format: mp3
  elevenlabs:
    api_key: ${ELEVENLABS_API_KEY}
    model: music_v1
    format: mp3_44100_128

transcription:
  provider: venice
  venice:
    api_key: ${VENICE_API_KEY}
    model: nvidia/parakeet-tdt-0.6b-v3
  elevenlabs:
    api_key: ${ELEVENLABS_API_KEY}
    model: scribe_v2

embed:
  provider: gemini
```

Each feature block can hold provider-specific credentials and defaults. The image, audio, music, transcription, and embedding providers are independent of the main text provider.

## Provider-specific environment overrides

Providers that shell out to local CLIs can accept extra subprocess environment variables via `providers.<name>.env`.

For `claude-bin`, term-llm also disables Claude Code hooks by default so user-level Claude automation does not leak into inference sessions. Set `providers.claude-bin.enable_hooks: true` if you explicitly want Claude Code hooks to run.

Example for Claude Code when term-llm runs inside a trusted sandboxed container:

```yaml
providers:
  claude-bin:
    model: opus
    env:
      IS_SANDBOX: "1"
      # Generate a long-lived token with: claude setup-token
      # Useful in CI or headless environments where interactive login isn't possible
      CLAUDE_CODE_OAUTH_TOKEN: "your-oauth-token-here"
    # Optional: re-enable Claude Code hooks for this provider
    # enable_hooks: true
```

`providers.<name>.env` values support the same resolution rules as other deferred config values:

- `file://path` → trimmed file contents
- `file://path#json.path` → JSON field extracted from the file
- `op://...` → 1Password secret lookup
- `$()` → command output
- `${VAR}` / `$VAR` → environment variable expansion

These values are passed only to the provider subprocess; they do not mutate your parent shell's environment.
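
For example, pulling values from local files (the variable names and paths here are hypothetical):

```yaml
providers:
  claude-bin:
    env:
      # Trimmed contents of a file
      MY_TOKEN: "file://~/.secrets/token"
      # A JSON field extracted from a credentials file
      MY_KEY: "file://~/.secrets/creds.json#api.key"
```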

## Provider WebSocket transport

Built-in `openai` and `chatgpt` text providers use the Responses WebSocket transport by default for lower-latency agent/tool loops. The WebSocket path keeps a persistent connection and, when safe, continues turns with `previous_response_id` plus only the new user/tool input. If the WebSocket connect/write step fails, term-llm falls back to HTTP/SSE; if a WebSocket continuation is rejected because the prior response state is unavailable, it retries that turn once with full state.

Disable it per provider if you need to force HTTP/SSE:

```yaml
providers:
  openai:
    use_websocket: false
  chatgpt:
    use_websocket: false
```

OpenAI-compatible providers (`type: openai_compatible`, including local/self-hosted endpoints and OpenRouter-style compatible APIs) do not enable WebSockets by default. They continue to use HTTP/SSE unless a given provider explicitly supports and wires up WebSocket transport.

## Dynamic secrets and endpoints

term-llm supports dynamic resolution for some config values:

- `op://...` for 1Password secret references
- `srv://...` for DNS SRV-based endpoint discovery
- `$()` for command-based resolution

Example:

```yaml
providers:
  production-llm:
    type: openai_compatible
    model: Qwen/Qwen3-30B-A3B
    url: "srv://_vllm._tcp.ml.company.com/v1/chat/completions"
    api_key: "op://Infrastructure/vLLM Cluster/credential?account=company.1password.com"
```

These values are resolved lazily when term-llm actually needs them.
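
Command-based resolution follows the same lazy pattern (the `pass` entry here is illustrative):

```yaml
providers:
  anthropic:
    api_key: "$(pass show llm/anthropic)"  # command output, resolved when needed
```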

## WebRTC direct routing config

```yaml
serve:
  webrtc:
    enabled: true
    signaling_url: https://signal.example.com/webrtc
    token: your-signaling-token
    stun_urls:
      - stun:stun.l.google.com:19302
    max_conns: 10
```

These values match the `--webrtc-*` CLI flags. See the WebRTC direct routing guide for full details.

## Skills config

```yaml
skills:
  enabled: true
  auto_invoke: true
  metadata_budget_tokens: 8000
  max_visible_skills: 50
  include_project_skills: true
  include_ecosystem_paths: true
  always_enabled: [git, code-review]
  never_auto: [expensive-api-skill]
```

Controls the skills system: portable instruction bundles that inject task-specific context into the system prompt. Skills are disabled by default; set `enabled: true` to allow auto-invocation, or use `--skills` on any command for one-off activation. See Skills for the full guide.

## Diagnostics

```yaml
diagnostics:
  enabled: true
```

When edit retries fail, diagnostics can capture prompts, partial responses, and failure context for inspection.