Skip to main content

Features Overview

Hermes Agent includes a rich set of capabilities that extend far beyond basic chat. From persistent memory and file-aware context to browser automation and voice conversations, these features work together to make Hermes a powerful autonomous assistant.

Core

  • Tools & Toolsets — Tools are functions that extend the agent's capabilities. They're organized into logical toolsets that can be enabled or disabled per platform, covering web search, terminal execution, file editing, memory, delegation, and more.
  • Skills System — On-demand knowledge documents the agent can load when needed. Skills follow a progressive disclosure pattern to minimize token usage and are compatible with the agentskills.io open standard.
  • Persistent Memory — Bounded, curated memory that persists across sessions. Hermes remembers your preferences, projects, environment, and things it has learned via MEMORY.md and USER.md.
  • Context Files — Hermes automatically discovers and loads project context files (.hermes.md, AGENTS.md, CLAUDE.md, SOUL.md, .cursorrules) that shape how it behaves in your project.
  • Context References — Type @ followed by a reference to inject files, folders, git diffs, and URLs directly into your messages. Hermes expands the reference inline and appends the content automatically.
  • Checkpoints — Hermes automatically snapshots your working directory before making file changes, giving you a safety net to roll back with /rollback if something goes wrong.

Automation

  • Scheduled Tasks (Cron) — Schedule tasks to run automatically with natural language or cron expressions. Jobs can attach skills, deliver results to any platform, and support pause/resume/edit operations.
  • Subagent Delegation — The delegate_task tool spawns child agent instances with isolated context, restricted toolsets, and their own terminal sessions. Run up to 3 concurrent subagents for parallel workstreams.
  • Code Execution — The execute_code tool lets the agent write Python scripts that call Hermes tools programmatically, collapsing multi-step workflows into a single LLM turn via sandboxed RPC execution.
  • Event Hooks — Run custom code at key lifecycle points. Gateway hooks handle logging, alerts, and webhooks; plugin hooks handle tool interception, metrics, and guardrails.
  • Batch Processing — Run the Hermes agent across hundreds or thousands of prompts in parallel, generating structured ShareGPT-format trajectory data for training data generation or evaluation.

Media & Web

  • Voice Mode — Full voice interaction across CLI and messaging platforms. Talk to the agent using your microphone, hear spoken replies, and have live voice conversations in Discord voice channels.
  • Browser Automation — Full browser automation with multiple backends: Browserbase cloud, Browser Use cloud, local Chrome via CDP, or local Chromium. Navigate websites, fill forms, and extract information.
  • Vision & Image Paste — Multimodal vision support. Paste images from your clipboard into the CLI and ask the agent to analyze, describe, or work with them using any vision-capable model.
  • Image Generation — Generate images from text prompts using FAL.ai's FLUX 2 Pro model with automatic 2x upscaling via the Clarity Upscaler.
  • Voice & TTS — Text-to-speech output and voice message transcription across all messaging platforms, with four provider options: Edge TTS (free), ElevenLabs, OpenAI TTS, and NeuTTS.

Customization

  • Personality & SOUL.md — Fully customizable agent personality. SOUL.md is the primary identity file — the first thing in the system prompt — and you can swap in built-in or custom /personality presets per session.
  • Skins & Themes — Customize the CLI's visual presentation: banner colors, spinner faces and verbs, response-box labels, branding text, and the tool activity prefix.
  • Plugins — Add custom tools, hooks, and integrations without modifying core code. Drop a directory into ~/.hermes/plugins/ with a plugin.yaml and Python code.