
Code Wiki

Generate wiki docs + Mermaid diagrams for any codebase.

Skill metadata

| Field | Value |
|---|---|
| Source | Optional; install with hermes skills install official/software-development/code-wiki |
| Path | optional-skills/software-development/code-wiki |
| Version | 0.1.0 |
| Author | Teknium (teknium1), Hermes Agent |
| License | MIT |
| Platforms | linux, macos, windows |
| Tags | Documentation, Mermaid, Architecture, Diagrams, Wiki, Code-Analysis |
| Related skills | codebase-inspection, github-repo-management |

Reference: full SKILL.md

Info

The following is the complete skill definition that Hermes loads when this skill is triggered. This is what the agent sees as instructions when the skill is active.

Code Wiki Skill

Generate a comprehensive wiki for any codebase — overview, architecture, per-module deep-dives, Mermaid class and sequence diagrams. Inspired by Google CodeWiki, but works on local repos, private repos, and any language. Uses only existing Hermes tools (terminal, read_file, search_files, write_file); no Docker, no external services, no extra dependencies.

This skill produces reference documentation (what/how). It does not produce strategic narrative (why — that's a different skill).

When to Use

  • User says "document this codebase", "generate a wiki", "make architecture diagrams"
  • Onboarding to an unfamiliar repo and wants a structured reference
  • User points at a GitHub URL and asks for documentation
  • Need a stable artifact (markdown + Mermaid) that renders on GitHub

Do NOT use this for:

  • Single-file or single-function documentation — just answer directly
  • API reference for one specific endpoint — use read_file and answer inline
  • Strategic "why does this exist" narrative — different skill, different purpose
  • Codebases the user is actively developing in this session — just answer questions as they come

Prerequisites

  • No env vars required.
  • git on PATH for repo SHA tracking and remote clones.
  • Optional: pygount for language-breakdown stats (see the codebase-inspection skill).

How to Run

Invoke through the terminal tool from the target repo's root, then use read_file / search_files / write_file to produce the wiki. Default output location is ~/.hermes/wikis/<repo-name>/. Only write into the repo (docs/wiki/) when the user explicitly requests it.

Quick Reference

| Step | Action |
|---|---|
| 1 | Resolve target: local cwd, given path, or git clone --depth 50 <url> to a temp dir |
| 2 | Scan structure: ls, find -maxdepth 3, manifest files, README |
| 3 | Pick 8–10 modules to document |
| 4 | Write README.md (overview + module map) |
| 5 | Write architecture.md with Mermaid flowchart |
| 6 | Write per-module docs in modules/ |
| 7 | Write diagrams/class-diagram.md (Mermaid classDiagram) |
| 8 | Write diagrams/sequences.md (Mermaid sequenceDiagram, 2–4 workflows) |
| 9 | Write getting-started.md |
| 10 | Write api.md if applicable, else skip |
| 11 | Write .codewiki-state.json |
| 12 | Report paths to user |

Procedure

1. Resolve the target

For a GitHub URL:

WIKI_TMP=$(mktemp -d)
git clone --depth 50 <url> "$WIKI_TMP/repo"
cd "$WIKI_TMP/repo"
REPO_SHA=$(git rev-parse HEAD)
REPO_NAME=$(basename <url> .git)

For a local path (or cwd if none given):

cd <path>
REPO_SHA=$(git rev-parse HEAD 2>/dev/null || echo "uncommitted")
REPO_NAME=$(basename "$PWD")

Then set the output dir:

OUTPUT_DIR="$HOME/.hermes/wikis/$REPO_NAME"
mkdir -p "$OUTPUT_DIR/modules" "$OUTPUT_DIR/diagrams"
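
As a minimal sketch of what the basename call above yields, the snippet below wraps it in a helper. The name repo_name_from_url and the owner/repo URLs are illustrative assumptions, not part of the skill.

```bash
# Hypothetical helper: derive the wiki directory name from a clone URL.
repo_name_from_url() {
  # basename keeps the last path component and drops an optional ".git" suffix.
  basename "$1" .git
}

repo_name_from_url "https://github.com/owner/repo.git"   # prints: repo
repo_name_from_url "git@github.com:owner/repo"           # prints: repo
```

Both HTTPS and SSH-style URLs resolve to the same name, so the output directory stays stable regardless of how the repo was cloned.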

2. Scan repo structure

Use the terminal tool for the shell work, read_file for manifests:

# Shallow tree first
ls -la

# Deeper tree, noise filtered
# -maxdepth is a global option, so it goes before the tests (GNU find warns otherwise)
find . -maxdepth 3 -type d \
  -not -path '*/\.*' \
  -not -path '*/node_modules*' \
  -not -path '*/venv*' \
  -not -path '*/__pycache__*' \
  -not -path '*/dist*' \
  -not -path '*/build*' \
  -not -path '*/target*' | sort

# Language breakdown (skip if pygount unavailable)
pygount --format=summary \
--folders-to-skip=".git,node_modules,venv,.venv,__pycache__,.cache,dist,build,target" \
. 2>/dev/null || true

Then read_file the relevant manifests (package.json, pyproject.toml, setup.py, Cargo.toml, go.mod, pom.xml, build.gradle) and the project README. Use search_files target='files' to find them rather than guessing names.
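
If search_files is unavailable, a plain find from the terminal tool can locate the same manifests. This is a sketch only: the helper name find_manifests and the depth limit of 2 are assumptions, not part of the skill.

```bash
# Hypothetical fallback: list common manifest files near the repo root.
# Usage: find_manifests [repo-root]   (defaults to the current directory)
find_manifests() {
  find "${1:-.}" -maxdepth 2 -type f \
    \( -name package.json -o -name pyproject.toml -o -name setup.py \
       -o -name Cargo.toml -o -name go.mod -o -name pom.xml -o -name build.gradle \) \
    -not -path '*/node_modules/*' | sort
}
```

Run find_manifests from the repo root, then read_file each path it prints.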

3. Pick modules to document

Cap initial pass at 8–10 modules. Heuristics by language:

  • Python: top-level packages (dirs with __init__.py), plus subsystem dirs
  • JS/TS: src/<subdir>, top-level workspace dirs
  • Rust: each crate in a workspace, or top-level src/<module> dirs
  • Go: each top-level package directory
  • Mixed/unfamiliar: top-level directories that contain source code (not config, not tests)

For very large repos, prioritize by:

  1. Imported-from count (a module imported by many is core)
  2. LOC (bigger modules usually warrant their own doc)
  3. Mentions in README / top-level docs

State the module list to the user before generating per-module docs on big repos — gives them a chance to redirect.
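
The LOC heuristic can be approximated from the shell. The sketch below assumes a hypothetical helper named rank_dirs_by_loc and a single file extension per run; it counts raw lines, not statements, so treat the numbers as a rough ordering only.

```bash
# Hypothetical helper: rank top-level directories by total lines in matching files.
# Usage: rank_dirs_by_loc <repo-root> [extension]   (extension defaults to "py")
rank_dirs_by_loc() {
  local root="$1" ext="${2:-py}" dir lines
  for dir in "$root"/*/; do
    [ -d "$dir" ] || continue
    # Concatenate every matching file and count lines; no matches yields 0.
    lines=$(find "$dir" -type f -name "*.$ext" -exec cat {} + | wc -l | tr -d ' ')
    printf '%s %s\n' "$lines" "$(basename "$dir")"
  done | sort -rn
}
```

Calling rank_dirs_by_loc . py prints one "lines directory" pair per top-level directory, largest first. Hidden directories are skipped because the glob does not match them.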

4. Write README.md

read_file the actual project README plus the top 2–3 entry-point files. Then write_file:

# <Project Name>

<One paragraph: what it is and what it's for. Self-contained; don't assume the
reader has the source README.>

## Key Concepts

- **<Concept 1>**: <one line>
- **<Concept 2>**: <one line>

## Entry Points

- [`path/to/main.py`](<link>): <what runs when you start it>
- [`path/to/cli.py`](<link>): <CLI surface>

## High-Level Architecture

<2-3 sentences. Detail goes in architecture.md.>

See [architecture.md](architecture.md).

## Module Map

| Module | Purpose |
|---|---|
| [`<module>`](modules/<module>.md) | <one-line purpose> |

## Getting Started

See [getting-started.md](getting-started.md).

For link targets in local mode use relative paths. For cloned repos use https://github.com/<owner>/<repo>/blob/<sha>/<path> so links survive future commits.
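
The commit-pinned link format can be sketched as a small helper. The name permalink and the owner, repo, SHA, and path values are placeholders chosen for illustration.

```bash
# Hypothetical helper: build a commit-pinned GitHub link for a file in a cloned repo.
# Usage: permalink <owner> <repo> <sha> <path>
permalink() {
  printf 'https://github.com/%s/%s/blob/%s/%s\n' "$1" "$2" "$3" "$4"
}

permalink owner repo 0123abc src/main.py
# prints: https://github.com/owner/repo/blob/0123abc/src/main.py
```

Pinning to the SHA rather than a branch name means the link keeps pointing at the code the wiki was generated from.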

5. Write architecture.md

# Architecture

<2-3 paragraphs: shape of the system. What talks to what. Where data enters,
where it exits, where state lives.>

## Components

- **<Component>**: <1-2 sentences>. See [`modules/<module>.md`](modules/<module>.md).

## System Diagram

```mermaid
flowchart TD
User([User]) --> Entry[Entry Point]
Entry --> Core[Core Engine]
Core --> StorageA[(Database)]
Core --> ExternalAPI{{External API}}
```

## Data Flow

1. **<Step>**: [`<file>`](<link>)
2. **<Step>**: [`<file>`](<link>)

## Key Design Decisions

- <Anything load-bearing the reader should know>

Mermaid shape semantics:

  • [] = component
  • [()] = database / storage
  • {{}} = external service
  • ([]) = entry point or terminal
  • --> = sync call, -.-> = async/event

Cap at ~20 nodes per diagram. Split into sub-diagrams if larger.

6. Write per-module docs in modules/

For each selected module, inspect its layout with ls, identify 3–5 most important files (by size, by being named core.py / main.py / __init__.py, by being imported a lot), then read_file those files (use offset / limit to read only what you need; prefer search_files for specific symbols).

# Module: `<module>`

<1-2 sentence purpose.>

## Responsibilities

- <bullet>
- <bullet>

## Key Files

- [`<module>/<file>`](<link>): <what it does>

## Public API

<Functions/classes/constants other code uses. Group related items. Show
signatures, not full implementations.>

## Internal Structure

<How the module is organized internally. State management.>

## Dependencies

- **Used by:** <other modules>
- **Uses:** <other modules + external libs>

## Notable Patterns / Gotchas

- <Anything non-obvious>

7. Write diagrams/class-diagram.md

Pick the 5–10 most important classes/types. read_file them, then write:

# Class Diagram

## Core Types

```mermaid
classDiagram
    class Agent {
        +string name
        +list~Tool~ tools
        +chat(message) string
    }
    class Tool {
        <<interface>>
        +string name
        +execute(args) any
    }
    Agent --> Tool : uses
    Tool <|-- TerminalTool
    Tool <|-- WebTool
```

## Notes

<Anything the diagram can't express: lifecycle, threading, etc.>

For languages without classes (Go, C, Rust): use the diagram for struct relationships, or skip class-diagram.md and explain it in prose in architecture.md. Don't force-fit.

8. Write diagrams/sequences.md

Pick 2–4 of the most important workflows. Trace each call path through the code (read entry point, follow function calls), then:

# Sequence Diagrams

## Workflow: <Name>

<1 sentence describing what this does and when it runs.>

```mermaid
sequenceDiagram
participant User
participant CLI
participant Agent
participant LLM
User->>CLI: types message
CLI->>Agent: chat(message)
Agent->>LLM: API call
LLM-->>Agent: response + tool_calls
Agent->>Agent: execute tools
Agent-->>CLI: final response
```

### Walkthrough

1. **User input**: [`cli.py:HermesCLI.run_session`](<link>)
2. **Message dispatch**: [`run_agent.py:AIAgent.chat`](<link>)

Don't invent participants. Every box must correspond to a real component the reader can find in the code.

9. Write getting-started.md

# Getting Started

## Prerequisites

<From manifest files + README. Be specific: give versions if pinned.>

## Installation

```bash
<exact commands>
```

## First Run

```bash
<minimum command to see the system do something useful>
```

## Common Workflows

### <Workflow 1>
<commands>

## Configuration

- `<config-file>`: <what it controls>
- Env var `<VAR>`: <what it controls>

## Where to Go Next

- Architecture: [architecture.md](architecture.md)
- Module reference: [README.md#module-map](README.md#module-map)

10. Write api.md (skip if not applicable)

Only write this if the project is a library or API server. If it is:

  • Find the public API surface (__init__.py exports, OpenAPI specs, route handlers, exported types)
  • Document each public entry with signature, parameters, return type, one-line description
  • Group by category

11. Write the state file

cat > "$OUTPUT_DIR/.codewiki-state.json" <<EOF
{
  "repo_name": "$REPO_NAME",
  "source_path": "$PWD",
  "source_sha": "$REPO_SHA",
  "generated_at": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "generator": "hermes-agent code-wiki skill v0.1.0",
  "modules_documented": []
}
EOF

12. Report to user

State exactly what was generated and where:

Generated wiki at ~/.hermes/wikis/<repo-name>/:
  README.md                  project overview, module map
  architecture.md            system architecture + flowchart
  getting-started.md         setup, first run, workflows
  modules/<N files>          per-module deep-dives
  diagrams/class-diagram.md  Mermaid class diagram
  diagrams/sequences.md      Mermaid sequence diagrams

If you cloned to a temp dir, remind the user it can be removed (rm -rf "$WIKI_TMP") after they've reviewed the wiki.

Scope Control

Generating a full wiki for a 500K-LOC monorepo is wildly token-expensive. Default to bounded scope:

  • Initial scan: max depth 3 directories
  • Per-module docs: cap at 10 modules unless user expands scope
  • Per-file reads: prefer search_files for symbols + read_file with offset/limit over full reads
  • Skip vendored code (vendor/, third_party/, generated code, _pb2.py, .min.js)

If the user says "do the whole thing exhaustively", believe them — but ballpark the cost first: "this repo has ~340 source files, comprehensive coverage will be expensive — confirm?"
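
To ballpark the size before committing to exhaustive coverage, a file count that applies the skip rules above is usually enough. The sketch below assumes a hypothetical helper named count_source_files; the extension list is illustrative and should be adjusted per repo.

```bash
# Hypothetical helper: count candidate source files, skipping vendored and generated code.
# Usage: count_source_files [repo-root]   (defaults to the current directory)
count_source_files() {
  find "${1:-.}" -type f \
    \( -name '*.py' -o -name '*.js' -o -name '*.ts' -o -name '*.go' -o -name '*.rs' \) \
    -not -path '*/.git/*' -not -path '*/node_modules/*' \
    -not -path '*/vendor/*' -not -path '*/third_party/*' \
    -not -name '*_pb2.py' -not -name '*.min.js' \
    | wc -l | tr -d ' '
}
```

The resulting number is what goes into the confirmation question to the user.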

Re-Run / Update

If .codewiki-state.json already exists at the target path:

  • Read it for previous SHA and module list
  • If source SHA matches: ask user if they want to regenerate or skip
  • If SHA differs: offer to regenerate only modules with changed files (git diff --name-only <old-sha> HEAD)

Full incremental-regeneration is a future enhancement — for now, regenerating the whole thing is acceptable.
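
The state-file check can be sketched without extra dependencies. The helper names previous_sha and wiki_status are assumptions for illustration, and the sed pattern assumes the one-key-per-line layout that the state file in Step 11 produces.

```bash
# Hypothetical helper: print the "source_sha" value recorded in a state file.
previous_sha() {
  # Matches a line such as:  "source_sha": "abc123",
  sed -n 's/.*"source_sha"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Hypothetical helper: classify the wiki as fresh, unchanged, or changed.
# Usage: wiki_status <state-file> <current-sha>
wiki_status() {
  local state="$1" current="$2" old
  [ -f "$state" ] || { echo "fresh"; return; }
  old=$(previous_sha "$state")
  if [ "$old" = "$current" ]; then echo "unchanged"; else echo "changed"; fi
}
```

A typical call is wiki_status "$OUTPUT_DIR/.codewiki-state.json" "$(git rev-parse HEAD)". On "changed", the previous SHA can be passed to git diff --name-only to list the files that moved. In a shallow clone the old commit may be missing, in which case a full regeneration is the fallback.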

Pitfalls

  • Fabricating components. Every diagram node and claimed function call must be in the source. read_file before writing. The single biggest failure mode for auto-generated docs is plausible-sounding fabrication.
  • Generic AI prose. "This module is responsible for..." is content-free. Say what the module actually does in domain-specific terms.
  • Restating code as prose. A module doc that says "the process function processes things by calling process_item on each item" is worse than just linking to the function.
  • Mermaid > 50 nodes. They don't render legibly. Split them.
  • Documenting tests, generated code, or vendored deps as if they were product code. Skip them.
  • In-repo output without asking. Default is ~/.hermes/wikis/. Only write into the repo when the user explicitly requests it.
  • Mermaid special chars need quotes: A["Tool / Agent"] not A[Tool / Agent]. <br> for line breaks inside a node.
  • Nested code fences in SKILL.md. When writing a markdown example that contains a Mermaid block, use 4-backtick outer fences so the 3-backtick inner ```mermaid doesn't close the outer. (This SKILL.md does it.)
  • classDiagram generics render as ~T~ (e.g. List~Tool~), not <T>.
  • GitHub Mermaid theme is fixed — don't include %%{init: ...}%% blocks; they're stripped on render.

Verification

After writing, verify:

  1. Mermaid blocks balance — opens equal closes per file:
    for f in "$OUTPUT_DIR"/diagrams/*.md "$OUTPUT_DIR"/architecture.md; do
      opens=$(grep -c '^```mermaid' "$f")
      total=$(grep -c '^```' "$f")
      echo "$f: $opens mermaid blocks, $total total fences (expect total = opens*2)"
    done
  2. All expected files exist
    ls "$OUTPUT_DIR"/{README.md,architecture.md,getting-started.md,.codewiki-state.json} \
    "$OUTPUT_DIR"/modules/ "$OUTPUT_DIR"/diagrams/
  3. Module count matches what you intended: ls "$OUTPUT_DIR/modules" | wc -l should equal the number of modules you committed to in Step 3.
  4. No fabricated paths — sanity-check 2–3 source links resolve to real files.
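
The fence-balance check in item 1 can also be turned into a pass/fail test. The sketch below uses a hypothetical helper named fences_balanced and assumes Mermaid blocks are the only fenced blocks in the file, as in the diagrams/ templates.

```bash
# Hypothetical helper: succeed only if every Mermaid fence in the file is closed.
# Usage: fences_balanced <markdown-file>
fences_balanced() {
  local opens total
  opens=$(grep -c '^```mermaid' "$1")
  total=$(grep -c '^```' "$1")
  # Each Mermaid block contributes one opening and one closing fence.
  [ "$total" -eq $((opens * 2)) ]
}
```

Used as fences_balanced "$f" || echo "unbalanced: $f", it flags only the files that need attention. Files that mix Mermaid with other fenced blocks, such as getting-started.md, would need a looser check.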