# Configuration
All settings are stored in the ~/.hermes/ directory for easy access.
## Directory Structure

```
~/.hermes/
├── config.yaml   # Settings (model, terminal, TTS, compression, etc.)
├── .env          # API keys and secrets
├── auth.json     # OAuth provider credentials (Nous Portal, etc.)
├── SOUL.md       # Optional: global persona (agent embodies this personality)
├── memories/     # Persistent memory (MEMORY.md, USER.md)
├── skills/       # Agent-created skills (managed via skill_manage tool)
├── cron/         # Scheduled jobs
├── sessions/     # Gateway sessions
└── logs/         # Logs (errors.log, gateway.log — secrets auto-redacted)
```
## Managing Configuration

```bash
hermes config              # View current configuration
hermes config edit         # Open config.yaml in your editor
hermes config set KEY VAL  # Set a specific value
hermes config check        # Check for missing options (after updates)
hermes config migrate      # Interactively add missing options

# Examples:
hermes config set model anthropic/claude-opus-4
hermes config set terminal.backend docker
hermes config set OPENROUTER_API_KEY sk-or-...  # Saves to .env
```
The `hermes config set` command automatically routes values to the right file — API keys are saved to `.env`, everything else to `config.yaml`.
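As a sketch, that routing amounts to a key-shape check; the suffix list below is an illustrative assumption, not Hermes's actual rule:

```python
# Illustrative sketch: which file would `hermes config set KEY VAL` write to?
# The suffix list is an assumption for demonstration, not Hermes internals.
SECRET_SUFFIXES = ("_API_KEY", "_KEY", "_TOKEN", "_SECRET", "_PASSWORD")

def target_file(key: str) -> str:
    """Secrets (UPPER_CASE env-style keys) go to .env; the rest to config.yaml."""
    if key.isupper() and key.endswith(SECRET_SUFFIXES):
        return "~/.hermes/.env"
    return "~/.hermes/config.yaml"

print(target_file("OPENROUTER_API_KEY"))  # ~/.hermes/.env
print(target_file("terminal.backend"))    # ~/.hermes/config.yaml
```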
## Configuration Precedence
Settings are resolved in this order (highest priority first):
1. CLI arguments — e.g., `hermes chat --model anthropic/claude-sonnet-4` (per-invocation override)
2. `~/.hermes/config.yaml` — the primary config file for all non-secret settings
3. `~/.hermes/.env` — fallback for env vars; required for secrets (API keys, tokens, passwords)
4. Built-in defaults — hardcoded safe defaults when nothing else is set
Secrets (API keys, bot tokens, passwords) go in `.env`. Everything else (model, terminal backend, compression settings, memory limits, toolsets) goes in `config.yaml`. When both are set, `config.yaml` wins for non-secret settings.
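The precedence chain can be sketched as a first-match lookup across layers (function and layer names are illustrative, not Hermes internals):

```python
# First-match resolution across the precedence layers described above.
# Names are illustrative, not actual Hermes code.
def resolve(key, cli_args, config_yaml, dotenv, defaults):
    for layer in (cli_args, config_yaml, dotenv, defaults):
        if key in layer:
            return layer[key]
    raise KeyError(key)

model = resolve(
    "model",
    cli_args={"model": "anthropic/claude-sonnet-4"},   # --model flag
    config_yaml={"model": "anthropic/claude-opus-4"},
    dotenv={},
    defaults={"model": "builtin-default"},
)
print(model)  # anthropic/claude-sonnet-4 (CLI wins)
```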
## Inference Providers

You need at least one way to connect to an LLM. Use `hermes model` to switch providers and models interactively, or configure directly:
| Provider | Setup |
|---|---|
| Nous Portal | `hermes model` (OAuth, subscription-based) |
| OpenAI Codex | `hermes model` (ChatGPT OAuth, uses Codex models) |
| OpenRouter | `OPENROUTER_API_KEY` in `~/.hermes/.env` |
| Custom Endpoint | `OPENAI_BASE_URL` + `OPENAI_API_KEY` in `~/.hermes/.env` |
The OpenAI Codex provider authenticates via device code (open a URL, enter a code). Credentials are stored at `~/.codex/auth.json` and auto-refresh. No Codex CLI installation required.
Even when using Nous Portal, Codex, or a custom endpoint, some tools (vision, web summarization, MoA) use OpenRouter independently. An `OPENROUTER_API_KEY` enables these tools.
## Optional API Keys
| Feature | Provider | Env Variable |
|---|---|---|
| Web scraping | Firecrawl | `FIRECRAWL_API_KEY` |
| Browser automation | Browserbase | `BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID` |
| Image generation | FAL | `FAL_KEY` |
| Premium TTS voices | ElevenLabs | `ELEVENLABS_API_KEY` |
| OpenAI TTS + voice transcription | OpenAI | `VOICE_TOOLS_OPENAI_KEY` |
| RL Training | Tinker + WandB | `TINKER_API_KEY`, `WANDB_API_KEY` |
| Cross-session user modeling | Honcho | `HONCHO_API_KEY` |
## OpenRouter Provider Routing

When using OpenRouter, you can control how requests are routed across providers. Add a `provider_routing` section to `~/.hermes/config.yaml`:
```yaml
provider_routing:
  sort: "throughput"                # "price" (default), "throughput", or "latency"
  # only: ["anthropic"]             # Only use these providers
  # ignore: ["deepinfra"]           # Skip these providers
  # order: ["anthropic", "google"]  # Try providers in this order
  # require_parameters: true        # Only use providers that support all request params
  # data_collection: "deny"         # Exclude providers that may store/train on data
```
Shortcuts: append `:nitro` to any model name for throughput sorting (e.g., `anthropic/claude-sonnet-4:nitro`), or `:floor` for price sorting.
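The shortcut expansion can be sketched like this (helper name is hypothetical; the suffix-to-sort mapping follows the shortcuts above):

```python
# Expand the :nitro / :floor model-name shortcuts into an explicit
# provider_routing sort value. Helper name is hypothetical.
def expand_shortcut(model: str):
    base, sep, suffix = model.rpartition(":")
    if sep and suffix == "nitro":
        return base, {"sort": "throughput"}
    if sep and suffix == "floor":
        return base, {"sort": "price"}
    return model, {}  # no shortcut suffix: leave the name untouched

print(expand_shortcut("anthropic/claude-sonnet-4:nitro"))
# ('anthropic/claude-sonnet-4', {'sort': 'throughput'})
```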
## Terminal Backend Configuration

Configure which environment the agent uses for terminal commands:

```yaml
terminal:
  backend: local  # or: docker, ssh, singularity, modal
  cwd: "."        # Working directory ("." = current dir)
  timeout: 180    # Command timeout in seconds
```
See Code Execution and the Terminal section of the README for details on each backend.
## Memory Configuration

```yaml
memory:
  memory_enabled: true
  user_profile_enabled: true
  memory_char_limit: 2200  # ~800 tokens
  user_char_limit: 1375    # ~500 tokens
```
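The character limits above imply a ratio of roughly 2.75 characters per token (2200 / 800 = 2.75); a quick planning estimate, not a tokenizer:

```python
# Rough token estimate implied by the limits above (2200 chars ~ 800 tokens).
# A planning heuristic only, not a real tokenizer.
def est_tokens(chars: int, chars_per_token: float = 2.75) -> int:
    return round(chars / chars_per_token)

print(est_tokens(2200))  # 800
print(est_tokens(1375))  # 500
```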
## Context Compression

```yaml
compression:
  enabled: true
  threshold: 0.85  # Compress at 85% of context limit
```
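A minimal sketch of the trigger: compress once usage reaches the threshold fraction of the context window (function name is assumed, not Hermes internals):

```python
# Compression trigger sketch: fire at or above `threshold` of the context limit.
# Function name is illustrative, not Hermes internals.
def should_compress(used_tokens: int, context_limit: int,
                    threshold: float = 0.85) -> bool:
    return used_tokens / context_limit >= threshold

print(should_compress(170_000, 200_000))  # True  (exactly 85%)
print(should_compress(100_000, 200_000))  # False (50%)
```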
## Reasoning Effort

Control how much "thinking" the model does before responding:

```yaml
agent:
  reasoning_effort: ""  # empty = use model default. Options: xhigh (max), high, medium, low, minimal, none
```
When unset (default), the model's own default reasoning level is used. Setting a value overrides it — higher reasoning effort gives better results on complex tasks at the cost of more tokens and latency.
## TTS Configuration

```yaml
tts:
  provider: "edge"  # "edge" | "elevenlabs" | "openai"
  edge:
    voice: "en-US-AriaNeural"  # 322 voices, 74 languages
  elevenlabs:
    voice_id: "pNInz6obpgDQGcFmaJgB"
    model_id: "eleven_multilingual_v2"
  openai:
    model: "gpt-4o-mini-tts"
    voice: "alloy"  # alloy, echo, fable, onyx, nova, shimmer
```
## Display Settings

```yaml
display:
  tool_progress: all     # off | new | all | verbose
  personality: "kawaii"  # Default personality for the CLI
  compact: false         # Compact output mode (less whitespace)
```
| Mode | What you see |
|---|---|
| `off` | Silent — just the final response |
| `new` | Tool indicator only when the tool changes |
| `all` | Every tool call with a short preview (default) |
| `verbose` | Full args, results, and debug logs |
## Speech-to-Text (STT)

```yaml
stt:
  provider: "openai"  # STT provider
```

Requires `VOICE_TOOLS_OPENAI_KEY` in `.env` for OpenAI STT.
## Human Delay

Simulate human-like response pacing in messaging platforms:

```yaml
human_delay:
  mode: "off"   # off | natural | custom
  min_ms: 500   # Minimum delay (custom mode)
  max_ms: 2000  # Maximum delay (custom mode)
```
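In custom mode, the delay is presumably a random value between `min_ms` and `max_ms`; a sketch (the uniform distribution and function name are assumptions):

```python
import random

# Sketch of human-delay sampling; uniform distribution is an assumption.
def sample_delay_ms(mode: str, min_ms: int = 500, max_ms: int = 2000) -> int:
    if mode == "off":
        return 0
    return random.randint(min_ms, max_ms)  # natural/custom: bounded random delay

print(sample_delay_ms("off"))                    # 0
print(500 <= sample_delay_ms("custom") <= 2000)  # True
```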
## Code Execution

Configure the sandboxed Python code execution tool:

```yaml
code_execution:
  timeout: 300        # Max execution time in seconds
  max_tool_calls: 50  # Max tool calls within code execution
```
## Delegation

Configure subagent behavior for the delegate tool:

```yaml
delegation:
  max_iterations: 50  # Max iterations per subagent
  default_toolsets:   # Toolsets available to subagents
    - terminal
    - file
    - web
```
## Clarify

Configure the clarification prompt behavior:

```yaml
clarify:
  timeout: 120  # Seconds to wait for user clarification response
```
## Context Files (SOUL.md, AGENTS.md)
Drop these files in your project directory and the agent automatically picks them up:
| File | Purpose |
|---|---|
| `AGENTS.md` | Project-specific instructions, coding conventions |
| `SOUL.md` | Persona definition — the agent embodies this personality |
| `.cursorrules` | Cursor IDE rules (also detected) |
| `.cursor/rules/*.mdc` | Cursor rule files (also detected) |
- `AGENTS.md` is hierarchical: if subdirectories also have `AGENTS.md`, all are combined.
- `SOUL.md` checks cwd first, then `~/.hermes/SOUL.md` as a global fallback.
- All context files are capped at 20,000 characters with smart truncation.
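One plausible "smart truncation" for the 20,000-character cap keeps the head and tail of an oversized file; this is an illustrative sketch, not Hermes's actual strategy:

```python
# Cap a context file at `cap` characters, keeping head and tail.
# The head/tail split and marker text are assumptions for illustration.
def truncate(text: str, cap: int = 20_000) -> str:
    if len(text) <= cap:
        return text
    marker = "\n[...truncated...]\n"
    keep = cap - len(marker)  # budget left for actual content
    return text[: keep // 2] + marker + text[-(keep - keep // 2):]

print(len(truncate("x" * 30_000)))  # 20000
```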
## Working Directory
| Context | Default |
|---|---|
| CLI (`hermes`) | Current directory where you run the command |
| Messaging gateway | Home directory `~` (override with `MESSAGING_CWD`) |
| Docker / Singularity / Modal / SSH | User's home directory inside the container or remote machine |
Override the working directory:

```bash
# In ~/.hermes/.env or ~/.hermes/config.yaml:
MESSAGING_CWD=/home/myuser/projects  # Gateway sessions
TERMINAL_CWD=/workspace              # All terminal sessions
```