On first run, term-llm will prompt you to choose a provider (Anthropic, AWS Bedrock, OpenAI, ChatGPT, GitHub Copilot, xAI, Venice, OpenRouter, Gemini, Gemini CLI, Zen, Claude Code (claude-bin), Ollama, or LM Studio).
## Option 1: Try it free with Zen
OpenCode Zen provides free access to multiple models. No API key required:
```bash
term-llm exec --provider zen "list files"
term-llm ask --provider zen "explain git rebase"
term-llm ask --provider zen:gpt-5-nano "quick question"  # use specific model
```
Available free models: glm-4.7-free, minimax-m2.1-free, grok-code, big-pickle, gpt-5-nano
Or configure as default:
```yaml
# In ~/.config/term-llm/config.yaml
default_provider: zen
```
## Option 2: Use API key
Set your API key as an environment variable:
```bash
# For Anthropic
export ANTHROPIC_API_KEY=your-key

# For OpenAI
export OPENAI_API_KEY=your-key

# For xAI (Grok)
export XAI_API_KEY=your-key

# For Venice
export VENICE_API_KEY=your-key

# For OpenRouter
export OPENROUTER_API_KEY=your-key

# For Gemini
export GEMINI_API_KEY=your-key

# For Perplexity search (used by search.provider: perplexity)
export PERPLEXITY_API_KEY=your-key
```
## Option 3: Use ChatGPT (Plus/Pro subscription)
If you have a ChatGPT Plus or Pro subscription, you can use the chatgpt provider with native OAuth authentication for both text and image workflows:
```bash
term-llm ask --provider chatgpt "explain this code"
term-llm ask --provider chatgpt:gpt-5.2-codex "code question"
term-llm image --provider chatgpt:gpt-5.4 "storybook fox in the snow"
```
On first use, you’ll be prompted to authenticate via browser. Credentials are stored locally and refreshed automatically.
```yaml
# In ~/.config/term-llm/config.yaml
default_provider: chatgpt
providers:
  chatgpt:
    model: gpt-5.2-codex
    # Enabled by default for ChatGPT text requests; set false to force HTTP/SSE.
    use_websocket: true
```
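If a proxy or firewall blocks WebSocket connections, the same option can force the fallback transport (a minimal sketch based on the `use_websocket` setting above):

```yaml
# In ~/.config/term-llm/config.yaml
providers:
  chatgpt:
    use_websocket: false  # force HTTP/SSE streaming instead of WebSockets
```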
## Option 4: Use xAI (Grok)
xAI provides access to Grok models with native web search and X (Twitter) search capabilities.
```yaml
# In ~/.config/term-llm/config.yaml
default_provider: xai
providers:
  xai:
    model: grok-4-1-fast  # default model
```
Available models:
| Model | Context | Description |
|---|---|---|
| `grok-4-1-fast` | 2M | Latest, best for tool calling (default) |
| `grok-4-1-fast-reasoning` | 2M | With chain-of-thought reasoning |
| `grok-4-1-fast-non-reasoning` | 2M | Faster, no reasoning overhead |
| `grok-4` | 256K | Base Grok 4 model |
| `grok-3` / `grok-3-fast` | 131K | Previous generation |
| `grok-3-mini` / `grok-3-mini-fast` | 131K | Smaller, faster |
| `grok-code-fast-1` | 256K | Optimized for coding tasks |
Or use the --provider flag:
```bash
term-llm ask --provider xai "explain quantum computing"
term-llm ask --provider xai -s "latest xAI news"  # uses native web + X search
term-llm ask --provider xai:grok-4-1-fast-reasoning "solve this step by step"
term-llm ask --provider xai:grok-code-fast-1 "review this code"
```
## Option 5: Use Venice
Venice exposes a wide mix of hosted text models behind an OpenAI-compatible API, including Venice’s own uncensored models plus Claude, Gemini, Grok, Qwen, GLM, Kimi, DeepSeek, and more. term-llm also enables Venice native web search when you use -s / --search.
```yaml
# In ~/.config/term-llm/config.yaml
default_provider: venice
providers:
  venice:
    model: venice-uncensored  # default model
    fast_model: llama-3.2-3b  # lightweight control-plane model
```
Or use the --provider flag directly:
```bash
term-llm ask --provider venice "explain quantum computing"
term-llm ask --provider venice:grok-4-20-beta -s "latest xAI news"
term-llm ask --provider venice:qwen3-coder-480b-a35b-instruct "review this code"
term-llm models --provider venice
```
## Option 6: Use OpenRouter
OpenRouter provides a unified OpenAI-compatible API across many models. term-llm sends attribution headers by default.
```yaml
# In ~/.config/term-llm/config.yaml
default_provider: openrouter
providers:
  openrouter:
    model: x-ai/grok-code-fast-1
    app_url: https://github.com/samsaffron/term-llm
    app_title: term-llm
```
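As with the other providers, OpenRouter can also be selected per invocation with `--provider`, optionally naming a model after a colon (a sketch reusing the model from the config above):

```bash
term-llm ask --provider openrouter "explain git rebase"
term-llm ask --provider openrouter:x-ai/grok-code-fast-1 "review this code"
```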
## Model Discovery
List available models from any supported provider:
```bash
term-llm models --provider anthropic   # List Anthropic models
term-llm models --provider openrouter  # List OpenRouter models
term-llm models --provider ollama      # List local Ollama models
term-llm models --provider lmstudio    # List local LM Studio models
term-llm models --json                 # Output as JSON
```
## Provider Discovery
List all available LLM providers and their configuration status:
```bash
term-llm providers               # List all providers
term-llm providers --configured  # Only show configured providers
term-llm providers --builtin     # Only show built-in providers
term-llm providers anthropic     # Show details for specific provider
term-llm providers --json        # JSON output
```
## Option 7: Use AWS Bedrock
AWS Bedrock provides access to Anthropic Claude models through your AWS account. This is useful for organizations that route AI usage through AWS billing, need VPC/PrivateLink access, or use application inference profiles for rate/cost management.
```bash
term-llm ask --provider bedrock "explain this code"
term-llm ask --provider bedrock:claude-opus-4-6-thinking "complex question"
```
Authentication uses the standard AWS credential chain. Configure credentials via any method the AWS SDK supports:
```bash
# Environment variables
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-west-2
```
Or configure fully in ~/.config/term-llm/config.yaml:
```yaml
default_provider: bedrock
providers:
  bedrock:
    region: us-west-2
    model: claude-sonnet-4-6-thinking
```
Explicit credentials (with 1Password, vaults, or $() command resolution):
```yaml
providers:
  bedrock:
    region: us-west-2
    access_key_id: $(op-cache read "op://Private/AWS Bedrock/AWS_ACCESS_KEY_ID")
    secret_access_key: $(op-cache read "op://Private/AWS Bedrock/AWS_SECRET_ACCESS_KEY")
    model: claude-sonnet-4-6-thinking
```
Application inference profiles: use `model_map` to alias friendly model names to Bedrock ARNs or model IDs. The `-thinking` and `-1m` suffixes work with mapped names:
```yaml
providers:
  bedrock:
    region: us-west-2
    access_key_id: $(op-cache read "op://Private/AWS Bedrock/AWS_ACCESS_KEY_ID")
    secret_access_key: $(op-cache read "op://Private/AWS Bedrock/AWS_SECRET_ACCESS_KEY")
    model: claude-sonnet-4-6-thinking
    model_map:
      claude-sonnet-4-6: arn:aws:bedrock:us-west-2:123456789:application-inference-profile/abc123
      claude-opus-4-6: arn:aws:bedrock:us-west-2:123456789:application-inference-profile/def456
      claude-haiku-4-5: arn:aws:bedrock:us-west-2:123456789:application-inference-profile/ghi789
```
With this config, `--provider bedrock:claude-opus-4-6-1m-thinking` strips the suffixes, resolves `claude-opus-4-6` through `model_map` to the ARN, and enables adaptive thinking + 1M context.
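For example, the suffix-stripping behavior described above can be exercised directly from the command line (the account ID and profile ARNs in the config are placeholders):

```bash
# Strips -1m-thinking, maps claude-opus-4-6 to its inference-profile ARN,
# then enables adaptive thinking and the 1M-token context window
term-llm ask --provider bedrock:claude-opus-4-6-1m-thinking "complex question"
```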
Available models (same as direct Anthropic, translated to Bedrock IDs automatically):
| Model | Suffixes | Description |
|---|---|---|
| `claude-sonnet-4-6` | `-thinking`, `-1m` | Latest Sonnet |
| `claude-opus-4-6` | `-thinking`, `-1m` | Latest Opus |
| `claude-haiku-4-5` | `-thinking` | Fast, lightweight |
The geographic prefix (`us.`, `eu.`, `ap.`) is derived from your configured region; for example, `eu-west-1` produces `eu.anthropic.*` IDs. This ensures data residency matches your region.
You can also pass raw Bedrock model IDs directly (e.g., `us.anthropic.claude-sonnet-4-6` or full ARNs). These bypass the translation layer.
Features: full parity with the direct Anthropic provider. Streaming, tool use, extended thinking, 1M context, images, prompt caching, and web search/fetch all work through Bedrock.
| Config field | Description |
|---|---|
| `region` | AWS region. Falls back to `AWS_REGION` / `AWS_DEFAULT_REGION` env vars, then `us-east-1`. |
| `profile` | AWS profile name from `~/.aws/credentials`. |
| `access_key_id` | Explicit AWS access key. Supports `$()`, `op://`, `${ENV}` resolution. |
| `secret_access_key` | Explicit AWS secret key. Same resolution support. |
| `session_token` | Optional session token for temporary/assumed-role credentials. |
| `model_map` | Map of friendly names to Bedrock model IDs or ARNs. |
| `model` | Default model (friendly name, Bedrock ID, or ARN). |
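For instance, the `profile` field lets Bedrock reuse a named AWS profile instead of explicit keys (a sketch; the profile name here is a placeholder):

```yaml
providers:
  bedrock:
    profile: my-sso-profile   # named profile from ~/.aws/credentials or ~/.aws/config
    region: eu-west-1         # produces eu.anthropic.* model IDs
    model: claude-sonnet-4-6
```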
## Option 8: Use local LLMs (Ollama, LM Studio)
Run models locally with Ollama or LM Studio:
```bash
# List available models from your local server
term-llm models --provider ollama
term-llm models --provider lmstudio
```

```yaml
# Configure in ~/.config/term-llm/config.yaml
default_provider: ollama
providers:
  ollama:
    type: openai_compatible
    base_url: http://localhost:11434/v1
    model: llama3.2:latest
  lmstudio:
    type: openai_compatible
    base_url: http://localhost:1234/v1
    model: deepseek-coder-v2
```
For other OpenAI-compatible servers (vLLM, text-generation-inference, etc.):
```yaml
providers:
  my-server:
    type: openai_compatible
    base_url: http://your-server:8080/v1
    model: mixtral-8x7b
    models:  # optional: list models for shell autocomplete
      - mixtral-8x7b
      - llama-3-70b
```
The `models` list enables tab completion for `--provider my-server:<TAB>`. The configured model is always included in completions.
Built-in `openai` and `chatgpt` text providers use Responses WebSockets by default for faster tool-heavy conversations. `openai_compatible` providers do not: local/self-hosted compatible APIs stay on HTTP/SSE by default.
If your server rejects stream_options (causing errors on connect), disable it:
```yaml
providers:
  my-server:
    type: openai_compatible
    base_url: http://your-server:8080/v1
    model: my-model
    no_stream_options: true
```
For custom models not in the built-in token limit tables, set context_window and max_output_tokens explicitly:
```yaml
providers:
  my-server:
    type: openai_compatible
    base_url: http://your-server:8080/v1
    model: my-model
    context_window: 32768
    max_output_tokens: 8192
```
See Providers and models for the full list of OpenAI-compatible provider options.
## Option 9: Use Claude Code (claude-bin)
If you have Claude Code installed and logged in, you can use the claude-bin provider to run completions via the Claude Agent SDK. This requires no API key - it uses Claude Code’s existing authentication.
```bash
# Use directly via --provider flag (no config needed)
term-llm ask --provider claude-bin "explain this code"
term-llm ask --provider claude-bin:haiku "quick question"  # use haiku model
term-llm exec --provider claude-bin "list files"           # command suggestions
term-llm ask --provider claude-bin -s "latest news"        # with web search
```

```yaml
# Or configure as default
# In ~/.config/term-llm/config.yaml
default_provider: claude-bin
providers:
  claude-bin:
    model: sonnet  # opus, sonnet, or haiku
    env:
      IS_SANDBOX: "1"  # useful in trusted/sandboxed containers
      # Generate a long-lived token with: claude setup-token
      # Useful in CI or headless environments where interactive login isn't possible
      CLAUDE_CODE_OAUTH_TOKEN: "your-oauth-token-here"
    # Optional: Claude Code hooks are disabled by default; set to true to opt back in
    # enable_hooks: true
```
Features:
- No API key required - uses Claude Code’s existing authentication
- Full tool support via MCP (exec, search, edit all work)
- Model selection: `opus`, `sonnet` (default), `haiku`
- Claude Code hooks are disabled by default to keep user hook automation out of term-llm inference sessions
- Optional `providers.claude-bin.enable_hooks: true` to opt back into Claude Code hooks
- Optional `providers.claude-bin.env` passthrough for Claude subprocess settings (for example `IS_SANDBOX=1` in trusted root-run containers)
- `providers.<name>.env` values support the same deferred resolution as other config values, including `file://...#json.path`, `op://...`, and `$()`
- Works immediately if Claude Code is installed and logged in
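For instance, an `env` value can defer to a secret manager instead of embedding the token in the file (a sketch; the `op://` path is a placeholder):

```yaml
providers:
  claude-bin:
    env:
      # Resolved at config load time via 1Password-style op:// resolution
      CLAUDE_CODE_OAUTH_TOKEN: op://Private/Claude Code/OAUTH_TOKEN
```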
## Option 10: Use existing CLI credentials (gemini-cli)
If you have gemini-cli installed and logged in, term-llm can use those credentials directly:
```bash
# Use gemini-cli credentials (no config needed)
term-llm ask --provider gemini-cli "explain this code"
```
Or configure as default:
```yaml
# In ~/.config/term-llm/config.yaml
default_provider: gemini-cli  # uses ~/.gemini/oauth_creds.json
```
OpenAI-compatible providers support two URL options:
- `base_url`: Base URL (e.g., `https://api.cerebras.ai/v1`) - `/chat/completions` is appended automatically
- `url`: Full URL (e.g., `https://api.cerebras.ai/v1/chat/completions`) - used as-is without appending

Use `url` when your endpoint doesn’t follow the standard `/chat/completions` path, or to paste URLs directly from API documentation.
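For a standard endpoint, the two forms should be equivalent (a sketch reusing the Cerebras URL from above; the model name is a placeholder):

```yaml
providers:
  cerebras:
    type: openai_compatible
    base_url: https://api.cerebras.ai/v1   # /chat/completions appended automatically
    model: my-model

  # Equivalent, passing the full endpoint as-is:
  cerebras-full:
    type: openai_compatible
    url: https://api.cerebras.ai/v1/chat/completions
    model: my-model
```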
## Option 11: Use GitHub Copilot
If you have GitHub Copilot (free, Individual, or Business), you can use the copilot provider with OAuth device flow authentication:
```bash
term-llm ask --provider copilot "explain this code"
term-llm ask --provider copilot:claude-opus-4.5 "complex question"
```
On first use, you’ll be prompted to authenticate via GitHub device flow. Credentials are stored locally and refreshed automatically.
```yaml
# In ~/.config/term-llm/config.yaml
default_provider: copilot
providers:
  copilot:
    model: gpt-4.1  # free tier, or gpt-5.2-codex for paid
```
Available models:
| Model | Description |
|---|---|
| `gpt-4.1` | Default, works on free tier |
| `gpt-5.2-codex` | Advanced coding model (paid) |
| `gpt-5.1` | GPT-5.1 (paid) |
| `claude-opus-4.5` | Claude Opus 4.5 via Copilot (paid) |
| `gemini-3-pro` | Gemini 3 Pro via Copilot (paid) |
| `grok-code-fast-1` | Grok coding model via Copilot (paid) |
Features:
- Free tier with GPT-4.1 (no Copilot subscription required, just GitHub account)
- OAuth device flow authentication (no API key needed)
- Full tool support via MCP