# Building a Context Engine Plugin
Context engine plugins replace the built-in `ContextCompressor` with an alternative strategy for managing conversation context: for example, a Lossless Context Management (LCM) engine that builds a knowledge DAG instead of performing lossy summarization.
## How it works

The agent's context management is built on the `ContextEngine` ABC (`agent/context_engine.py`). The built-in `ContextCompressor` is the default implementation. Plugin engines must implement the same interface.
Only one context engine can be active at a time. Selection is config-driven:
```yaml
# config.yaml
context:
  engine: "compressor"  # default built-in
  # engine: "lcm"       # set instead to activate a plugin engine named "lcm"
```
Plugin engines are never auto-activated; the user must explicitly set `context.engine` to the plugin's name.
## Directory structure

Each context engine lives in `plugins/context_engine/<name>/`:
```
plugins/context_engine/lcm/
├── __init__.py   # exports the ContextEngine subclass
├── plugin.yaml   # metadata (name, description, version)
└── ...           # any other modules your engine needs
```
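A minimal `plugin.yaml` covering the metadata above might look like the following sketch; the exact field set is an assumption based on the metadata listed in the tree:

```yaml
# plugins/context_engine/lcm/plugin.yaml (illustrative field names)
name: lcm
description: Lossless Context Management engine backed by a knowledge DAG
version: 0.1.0
```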
## The ContextEngine ABC

Your engine must implement these required methods:
```python
from agent.context_engine import ContextEngine


class LCMEngine(ContextEngine):
    @property
    def name(self) -> str:
        """Short identifier, e.g. 'lcm'. Must match the config.yaml value."""
        return "lcm"

    def update_from_response(self, usage: dict) -> None:
        """Called after every LLM call with the usage dict.

        Update self.last_prompt_tokens, self.last_completion_tokens,
        and self.last_total_tokens from the response.
        """

    def should_compress(self, prompt_tokens: int | None = None) -> bool:
        """Return True if compaction should fire this turn."""

    def compress(self, messages: list, current_tokens: int | None = None,
                 focus_topic: str | None = None) -> list:
        """Compact the message list and return a new (possibly shorter) list.

        The returned list must be a valid OpenAI-format message sequence.
        ``focus_topic`` is an optional topic string from a manual
        ``/compress <focus>``; engines that support guided compression should
        prioritise preserving information related to it, while others may
        ignore it.
        """
```
## Class attributes your engine must maintain

The agent reads these directly for display and logging:
```python
last_prompt_tokens: int = 0
last_completion_tokens: int = 0
last_total_tokens: int = 0
threshold_tokens: int = 0     # token count at which compression triggers
context_length: int = 0       # model's full context window
compression_count: int = 0    # how many times compress() has run
```
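These counters are typically refreshed inside `update_from_response()`. A sketch, assuming an OpenAI-style usage dict with `prompt_tokens` / `completion_tokens` / `total_tokens` keys:

```python
class UsageTracking:
    """Sketch: maintaining the required counters from a usage dict."""
    last_prompt_tokens: int = 0
    last_completion_tokens: int = 0
    last_total_tokens: int = 0

    def update_from_response(self, usage: dict) -> None:
        # OpenAI-format responses report these three token counts.
        self.last_prompt_tokens = usage.get("prompt_tokens", 0)
        self.last_completion_tokens = usage.get("completion_tokens", 0)
        # Derive the total if the provider omits it.
        self.last_total_tokens = usage.get(
            "total_tokens",
            self.last_prompt_tokens + self.last_completion_tokens,
        )
```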
## Optional methods

These have sensible defaults in the ABC. Override as needed:
| Method | Default | Override when |
|---|---|---|
| `on_session_start(session_id, **kwargs)` | No-op | You need to load persisted state (DAG, DB) |
| `on_session_end(session_id, messages)` | No-op | You need to flush state or close connections |
| `on_session_reset()` | Resets token counters | You have per-session state to clear |
| `update_model(model, context_length, ...)` | Updates context_length + threshold | You need to recalculate budgets on model switch |
| `get_tool_schemas()` | Returns `[]` | Your engine provides agent-callable tools (e.g., `lcm_grep`) |
| `handle_tool_call(name, args, **kwargs)` | Returns error JSON | You implement tool handlers |
| `should_compress_preflight(messages)` | Returns `False` | You can do a cheap pre-API-call estimate |
| `get_status()` | Standard token/threshold dict | You have custom metrics to expose |
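As an example of overriding one of these, an engine could expose custom metrics through `get_status()`. The exact keys of the standard dict are an assumption here, and `dag_nodes` is a hypothetical engine-specific metric:

```python
class StatusExample:
    """Sketch: merging a custom metric into the status dict."""
    last_total_tokens: int = 0
    threshold_tokens: int = 0
    compression_count: int = 0
    _dag_node_count: int = 0  # hypothetical internal metric

    def get_status(self) -> dict:
        # Standard token/threshold fields (key names assumed)...
        status = {
            "last_total_tokens": self.last_total_tokens,
            "threshold_tokens": self.threshold_tokens,
            "compression_count": self.compression_count,
        }
        # ...plus whatever this engine wants the agent to display.
        status["dag_nodes"] = self._dag_node_count
        return status
```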
## Engine tools

Context engines can expose tools the agent calls directly. Return schemas from `get_tool_schemas()` and handle calls in `handle_tool_call()`:
```python
import json

def get_tool_schemas(self):
    return [{
        "name": "lcm_grep",
        "description": "Search the context knowledge graph",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"],
        },
    }]

def handle_tool_call(self, name, args, **kwargs):
    if name == "lcm_grep":
        results = self._search_dag(args["query"])
        return json.dumps({"results": results})
    return json.dumps({"error": f"Unknown tool: {name}"})
```
Engine tools are injected into the agent's tool list at startup and dispatched automatically — no registry registration needed.
## Registration

### Via directory (recommended)

Place your engine in `plugins/context_engine/<name>/`. The `__init__.py` must export a `ContextEngine` subclass. The discovery system finds and instantiates it automatically.
### Via general plugin system

A general plugin can also register a context engine:
```python
def register(ctx):
    engine = LCMEngine(context_length=200000)
    ctx.register_context_engine(engine)
```
Only one engine can be registered; a second plugin attempting to register one is rejected with a warning.
## Lifecycle

1. Engine instantiated (plugin load or directory discovery)
2. `on_session_start()` — conversation begins
3. `update_from_response()` — after each API call
4. `should_compress()` — checked each turn
5. `compress()` — called when `should_compress()` returns True
6. `on_session_end()` — session boundary (CLI exit, `/reset`, gateway expiry)

`on_session_reset()` is called on `/new` or `/reset` to clear per-session state without a full shutdown.
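The call order above can be sketched with a dummy engine. The method names follow the lifecycle list; `run_turn` is an illustrative driver, not the agent's actual loop:

```python
class DummyEngine:
    """Minimal stand-in engine to illustrate the call order."""
    def __init__(self):
        self.calls = []
        self.threshold_tokens = 100
        self.last_prompt_tokens = 0

    def on_session_start(self, session_id):
        self.calls.append("start")

    def update_from_response(self, usage):
        self.last_prompt_tokens = usage.get("prompt_tokens", 0)
        self.calls.append("update")

    def should_compress(self):
        self.calls.append("check")
        return self.last_prompt_tokens >= self.threshold_tokens

    def compress(self, messages):
        self.calls.append("compress")
        return messages[-1:]  # trivially keep only the newest message

    def on_session_end(self, session_id, messages):
        self.calls.append("end")


def run_turn(engine, messages, usage):
    # After each API call: record usage, then compact if the check fires.
    engine.update_from_response(usage)
    if engine.should_compress():
        messages = engine.compress(messages)
    return messages
```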
## Configuration

Users select your engine via `hermes plugins` → Provider Plugins → Context Engine, or by editing config.yaml:
```yaml
context:
  engine: "lcm"  # must match your engine's name property
```
The `compression` config block (`compression.threshold`, `compression.protect_last_n`, etc.) is specific to the built-in `ContextCompressor`. Your engine should define its own config format if needed, reading from config.yaml during initialization.
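One approach, assuming the parsed config.yaml is available as a dict at construction time, is to read an engine-named sub-block. The `lcm:` key and its `compress_ratio` field below are hypothetical:

```python
class ConfiguredEngine:
    """Sketch: reading an engine-specific block from the loaded config dict."""

    def __init__(self, config: dict, context_length: int = 200000):
        self.context_length = context_length
        # Read our own (hypothetical) block; fall back to defaults when absent.
        lcm_cfg = config.get("context", {}).get("lcm", {})
        self.compress_ratio = lcm_cfg.get("compress_ratio", 0.8)
        # Derive the trigger point from the model's context window.
        self.threshold_tokens = int(self.context_length * self.compress_ratio)
```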
## Testing
```python
from agent.context_engine import ContextEngine


def test_engine_satisfies_abc():
    engine = YourEngine(context_length=200000)
    assert isinstance(engine, ContextEngine)
    assert engine.name == "your-name"


def test_compress_returns_valid_messages():
    engine = YourEngine(context_length=200000)
    msgs = [{"role": "user", "content": "hello"}]
    result = engine.compress(msgs)
    assert isinstance(result, list)
    assert all("role" in m for m in result)
```
See `tests/agent/test_context_engine.py` for the full ABC contract test suite.
## See also
- Context Compression and Caching — how the built-in compressor works
- Memory Provider Plugins — analogous single-select plugin system for memory
- Plugins — general plugin system overview