Configuration

On this page

Configuration commands

term-llm config
term-llm config edit
term-llm config path
term-llm config get default_provider
term-llm config set default_provider zen
term-llm config reset

The main config file lives at:

~/.config/term-llm/config.yaml

Configuration shape

A typical config has a few major parts:

default_provider for the global LLM default
providers for model-specific credentials and routing
per-command blocks such as exec, ask, and edit
feature-specific blocks such as image, audio, music, embed, search, sessions, tools, and skills

Example

default_provider: anthropic

providers:
  anthropic:
    model: claude-sonnet-4-6

  openai:
    model: gpt-5.2
    credentials: codex
    # WebSocket transport is enabled by default for built-in OpenAI.
    # Set false to force HTTP/SSE.
    use_websocket: true

  xai:
    model: grok-4-1-fast

  claude-bin:
    model: opus
    env:
      IS_SANDBOX: "1"

  openrouter:
    model: x-ai/grok-code-fast-1
    app_url: https://github.com/samsaffron/term-llm
    app_title: term-llm

exec:
  suggestions: 3
  instructions: |
    I use Arch Linux with zsh.
    I prefer ripgrep over grep and fd over find.

ask:
  max_turns: 50
  instructions: |
    Be concise. I'm an experienced developer.

chat:
  max_turns: 200

edit:
  model: gpt-5.2-codex
  diff_format: auto

search:
  provider: duckduckgo

tools:
  max_tool_output_chars: 20000

Per-command overrides

Each command can override provider and model independently of the global default.

default_provider: anthropic

providers:
  anthropic:
    model: claude-sonnet-4-6
  openai:
    model: gpt-5.2
  zen:
    model: glm-4.7-free

exec:
  provider: zen
  model: glm-4.7-free

ask:
  model: claude-opus-4

edit:
  provider: openai
  model: gpt-5.2-codex

Precedence is:

CLI flag such as --provider openai:gpt-5.2
per-command config such as exec.provider or ask.model
global provider selection via default_provider and providers.<name>.model

Agentic turn limits

Agentic commands can make multiple provider calls while they execute tools and feed results back to the model. max_turns caps that loop.

Defaults:

ask.max_turns: 50
exec CLI flag default: 50
chat.max_turns: 200
Agent YAML max_turns overrides command/config defaults when an agent is selected.
A CLI --max-turns N flag overrides both config and agent YAML.

ask:
  max_turns: 50

chat:
  max_turns: 200

Parallel tool execution

Models may request many independent tool calls in a single turn, such as several read_file, grep, or glob calls. term-llm executes independent tool calls concurrently when parallel tool calls are enabled by the provider/request, but caps one model turn at 20 concurrently running tool calls. Additional tool calls from the same turn are queued and run as earlier calls finish.

This is a built-in safety limit rather than a config option today. It preserves useful batching while preventing a single response from spawning an unbounded number of shells, greps, reads, or subagents at once.

Sessions config

sessions:
  enabled: true
  max_age_days: 0
  max_count: 0
  path: ""

Use this to control whether sessions are persisted, how long they are kept, and where the SQLite database lives.

Search config

search:
  provider: perplexity
  force_external: false

  perplexity:
    api_key: ${PERPLEXITY_API_KEY}

  exa:
    api_key: ${EXA_API_KEY}

  brave:
    api_key: ${BRAVE_API_KEY}

Search is large enough to deserve its own page; see Search.

Image, audio, music, transcription, and embedding config

image:
  provider: gemini
  output_dir: ~/Pictures/term-llm

audio:
  provider: venice
  output_dir: ~/Music/term-llm
  venice:
    api_key: ${VENICE_API_KEY}
    model: tts-kokoro
    voice: af_sky
    format: mp3

music:
  provider: venice
  output_dir: ~/Music/term-llm
  venice:
    api_key: ${VENICE_API_KEY}
    model: elevenlabs-sound-effects-v2
    format: mp3
  elevenlabs:
    api_key: ${ELEVENLABS_API_KEY}
    model: music_v1
    format: mp3_44100_128

transcription:
  provider: venice
  venice:
    api_key: ${VENICE_API_KEY}
    model: nvidia/parakeet-tdt-0.6b-v3
  elevenlabs:
    api_key: ${ELEVENLABS_API_KEY}
    model: scribe_v2

embed:
  provider: gemini

Each feature block can hold provider-specific credentials and defaults. The image, audio, music, transcription, and embedding providers are independent of the main text provider.

Provider-specific environment overrides

Providers that shell out to local CLIs can accept extra subprocess environment variables via providers.<name>.env.

For claude-bin, term-llm also disables Claude Code hooks by default so user-level Claude automation does not leak into inference sessions. Set providers.claude-bin.enable_hooks: true if you explicitly want Claude Code hooks to run.

Example for Claude Code when term-llm runs inside a trusted sandboxed container:

providers:
  claude-bin:
    model: opus
    env:
      IS_SANDBOX: "1"
      # Generate a long-lived token with: claude setup-token
      # Useful in CI or headless environments where interactive login isn't possible
      CLAUDE_CODE_OAUTH_TOKEN: "your-oauth-token-here"
    # Optional: re-enable Claude Code hooks for this provider
    # enable_hooks: true

providers.<name>.env values support the same resolution rules as other deferred config values:

file://path → trimmed file contents
file://path#json.path → JSON field extracted from the file
op://... → 1Password secret lookup
$() → command output
${VAR} / $VAR → environment variable expansion

This is passed only to the provider subprocess. It does not mutate your parent shell environment.

Provider WebSocket transport

Built-in openai and chatgpt text providers use the Responses WebSocket transport by default for lower-latency agent/tool loops. The WebSocket path keeps a persistent connection and, when safe, continues turns with previous_response_id plus only the new user/tool input. If the WebSocket connect/write step fails, term-llm falls back to HTTP/SSE; if a WebSocket continuation is rejected because the prior response state is unavailable, it retries that turn once with full state.

Disable it per provider if you need to force HTTP/SSE:

providers:
  openai:
    use_websocket: false
  chatgpt:
    use_websocket: false

OpenAI-compatible providers (type: openai_compatible, including local/self-hosted endpoints and OpenRouter-style compatible APIs) do not enable WebSockets by default. They continue to use HTTP/SSE unless explicitly supported and wired by that provider.

Dynamic secrets and endpoints

term-llm supports dynamic resolution for some config values:

op://... for 1Password secret references
srv://... for DNS SRV-based endpoint discovery
$() for command-based resolution

Example:

providers:
  production-llm:
    type: openai_compatible
    model: Qwen/Qwen3-30B-A3B
    url: "srv://_vllm._tcp.ml.company.com/v1/chat/completions"
    api_key: "op://Infrastructure/vLLM Cluster/credential?account=company.1password.com"

These values are resolved lazily when term-llm actually needs them.

WebRTC direct routing config

serve:
  webrtc:
    enabled: true
    signaling_url: https://signal.example.com/webrtc
    token: your-signaling-token
    stun_urls:
      - stun:stun.l.google.com:19302
    max_conns: 10

These values match the --webrtc-* CLI flags. See the WebRTC direct routing guide for full details.

Skills config

skills:
  enabled: true
  auto_invoke: true
  metadata_budget_tokens: 8000
  max_visible_skills: 50
  include_project_skills: true
  include_ecosystem_paths: true
  always_enabled: [git, code-review]
  never_auto: [expensive-api-skill]

Controls the skills system: portable instruction bundles that inject task-specific context into the system prompt. Skills are disabled by default; set enabled: true to allow auto-invocation, or use --skills on any command for one-off activation. See Skills for the full guide.

Diagnostics

diagnostics:
  enabled: true

When edit retries fail, diagnostics can capture prompts, partial responses, and failure context for inspection.