# Kingdom Tiny Bot Architecture Report
## Audit Date: 2026-03-05 | Auditor: FORGE_CLAUDE Sonnet Drone

---

## PART 1: Kingdom Audit — What's Failing or Fragile

### The Big Problem: 42 Daemons, Mostly Monoliths

The Kingdom runs 42 launchd agents. That number is not inherently bad, but most of them launch long-running processes with internal state, retry loops, or web servers. When any one of them dies, its state dies with it unless it was written to disk first. The GPU bug that took the Console offline for an entire night (S160) is the canonical example: one process, one crash, one dead UI, no recovery.

Below is a breakdown of the most fragile systems, ordered by blast radius.

---

### 1.1 Overmind Pulse — `pulse.sh` (666 lines)

**Fragility profile:** This is the most critical monolith in the Kingdom. It runs every 10 minutes via launchd and does everything in sequence: acquires a lock, reads the SQLite DB, evaluates mission readiness, builds a prompt, calls aeris-delegate, parses the response, writes state back to DB, sends RAVEN alerts, handles backoff, and cleans up. All in one bash process.

**Single points of failure:**
- The lock is a `/tmp/overmind_pulse.lock` directory. A crash during cleanup leaves a stale lock that blocks the next 10-minute tick. There's stale-lock recovery logic, but it requires `pulse.sh` to re-run and detect the orphan — a second tick lost.
- If aeris-delegate hangs past GEMINI_TIMEOUT=300s, a watcher subprocess fires SIGKILL. Because `set -euo pipefail` is set at the top, any unhandled non-zero exit status then aborts the script mid-mission, leaving partially written state.
- Mission state lives in two places simultaneously: `mission_state/${id}_state.txt` files AND the SQLite DB. These can drift.
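
The lost second tick can be avoided by reclaiming an orphaned lock in the same run that finds it. The sketch below is one possible approach, not the logic currently in `pulse.sh`: the `pid` file inside the lock directory and both function names are assumptions. It also assumes a single user, since `kill -0` cannot probe other users' processes, and it leaves a small race between `mkdir` and the pid write.

```bash
# LOCK_DIR defaults to the path named in this audit; override it for testing.
LOCK_DIR="${LOCK_DIR:-/tmp/overmind_pulse.lock}"

acquire_lock() {
  local owner
  # mkdir is atomic: only one caller can create the directory.
  if mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "$$" > "$LOCK_DIR/pid"
    return 0
  fi
  # The lock exists. If its recorded owner is still running, back off.
  owner="$(cat "$LOCK_DIR/pid" 2>/dev/null || true)"
  if [ -n "$owner" ] && kill -0 "$owner" 2>/dev/null; then
    return 1
  fi
  # Owner is gone (or never wrote a pid): treat the lock as stale and retry once.
  rm -rf "$LOCK_DIR"
  if mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "$$" > "$LOCK_DIR/pid"
    return 0
  fi
  return 1
}

release_lock() { rm -rf "$LOCK_DIR"; }
```

A caller would run `acquire_lock || exit 0` at the top and `release_lock` from its EXIT trap.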

**Tiny bot replacement — mission dispatch decomposition:**

| Bot | Trigger | Single Job | Output |
|-----|---------|-----------|--------|
| `pulse-ready-check` | Cron every 10 min | Read DB, write list of due mission IDs to `/tmp/pulse_ready.json` | File drop |
| `mission-dispatch-{id}` | File drop of `/tmp/pulse_ready.json` | One bot per mission ID; reads prompt, calls aeris-delegate, writes raw output to `/tmp/mission_out_{id}.txt` | File drop |
| `mission-commit` | File drop of output file | Parses output, INSERTs to DB, updates state file | SQLite row + file |
| `pulse-alert` | DB change (new failure) | If failure count crossed threshold, drop RAVEN envelope | File drop to mailbox |

Each bot runs, does one thing, exits. No lock needed — each dispatch is independent. If one mission's bot dies, the others finish fine.
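
The isolation property can be illustrated in a single script before any launchd wiring exists. This is a rough sketch, assuming a hypothetical `run_mission` worker that stands in for "read prompt, call aeris-delegate"; `dispatch_all`, the `.failed` suffix, and `OUT_DIR` are illustrative names as well. In the real design launchd would start each dispatch bot separately.

```bash
OUT_DIR="${OUT_DIR:-/tmp}"

# Hypothetical placeholder: replace with the real per-mission command.
run_mission() {
  echo "output for mission $1"
}

dispatch_all() {
  local id
  for id in "$@"; do
    (
      # Write to a temp name first, then rename, so a watcher on the
      # output file never sees a half-written result.
      tmp="$OUT_DIR/.mission_out_${id}.tmp"
      if run_mission "$id" > "$tmp"; then
        mv "$tmp" "$OUT_DIR/mission_out_${id}.txt"
      else
        mv "$tmp" "$OUT_DIR/mission_out_${id}.failed"
      fi
    ) &
  done
  # Returns once every per-mission subshell has exited.
  wait
}
```

Each mission runs in its own subshell, so a failure in one leaves a `.failed` file behind and does not touch the others.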

---

### 1.2 RAVEN v2 — `raven.py` Daemon

**Fragility profile:** Python daemon with web UI (`localhost:8768`), SQLite (`raven.db`), a filesystem watcher loop over 5 mailbox zones, and a delivery subsystem, all in one process. Two plists for one logical system already signal scope creep.

**Single points of failure:**
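- If `raven.py` crashes during the watcher loop, messages stay in the buffer unprocessed until the daemon restarts.
- `raven.db` takes concurrent writes from the watcher, the web server, and the CLI without WAL mode. In SQLite's default rollback-journal mode a writer blocks readers and other writers, so the realistic risk is `SQLITE_BUSY` ("database is locked") errors and lost writes when a caller does not retry.
- The web server lives inside the same process as the watcher. A Flask crash kills message delivery.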

**Tiny bot decomposition (what RAVEN 3 already achieves for routing — this is the next tier):**

| Bot | Trigger | Single Job | Output |
|-----|---------|-----------|--------|
| `raven-ingest` | launchd `WatchPaths` / `QueueDirectories` | Moves envelope files to staging, validates format | File drop |
| `raven-parser` | File in staging | Parses envelope, INSERTs to `raven.db` (WAL mode) | SQLite INSERT |
| `raven-notify` | DB INSERT trigger via polling | Drops notification to online agent | HTTP POST or RAVEN |
| `raven-web` | Independent LaunchAgent | Serves `localhost:8768` — reads DB only, never writes | HTTP |
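
As an illustration of how small `raven-ingest` could be, here is a minimal sketch. The directory arguments, the `rejected/` folder, and the list of required headers are assumptions; the header shape is inferred from the envelope that the failure trap in the bot template writes (`---`, `TO:`, `FROM:`, `SUBJECT:`).

```bash
# Succeeds if the file looks like a RAVEN envelope.
# Note: this simple check matches header names anywhere in the file.
is_valid_envelope() {
  local file="$1" field
  [ "$(head -n 1 "$file")" = "---" ] || return 1
  for field in TO FROM SUBJECT; do
    grep -q "^${field}: ." "$file" || return 1
  done
}

# Moves every envelope out of the buffer: valid ones to staging,
# malformed ones to a rejected folder for inspection.
ingest_dir() {
  local buffer="$1" staging="$2" rejected="$3" f
  mkdir -p "$staging" "$rejected"
  for f in "$buffer"/*.md; do
    [ -e "$f" ] || continue   # glob matched nothing
    if is_valid_envelope "$f"; then
      mv "$f" "$staging/"
    else
      mv "$f" "$rejected/"
    fi
  done
}
```

Because the bot only moves files, it holds no state of its own and can be killed and restarted at any point.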

---

### 1.3 Goldfish v2 — Vision Witness Chain

**Fragility:** 6-stage pipeline managed across multiple launchd agents. `llava:latest` (4.7GB) is heavyweight and slow. If Herald fails to narrate a capture, that capture goes dark — no replay mechanism.

**Tiny bot fix:** Each stage writes output as a timestamped JSON file. Next stage polls for unprocessed files. Any stage can replay independently. Replace llava with moondream:1.8b (already installed) for fast pass, qwen2.5vl:3b for quality pass.
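
One way to get "poll for unprocessed files" plus independent replay is a marker file per capture. The sketch below assumes a hypothetical `.done` suffix and a handler passed in by name; neither comes from the existing Goldfish code.

```bash
# Runs HANDLER on every *.json in IN_DIR that has no ".done" marker yet.
process_pending() {
  local in_dir="$1" handler="$2" f
  for f in "$in_dir"/*.json; do
    [ -e "$f" ] || continue                      # no captures yet
    if [ -e "$f.done" ]; then continue; fi       # already handled by this stage
    # Mark done only on success, so a failed capture is retried on the next poll.
    if "$handler" "$f"; then
      : > "$f.done"
    fi
  done
}
```

Replaying a capture is then just deleting its marker; the next poll picks it up again without touching any other stage.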

---

### 1.4 AI News Monitor — `ainews.sh`

**Fragility:** One script does Tavily fetch + LLM summarize + write. If Tavily is down, no brief. No retry, no fallback.

**Tiny bot fix:** (1) `news-fetch` → raw JSON drop. (2) `news-summarize` → triggered by drop. (3) `news-deliver` → triggered by summary. Any stage retries independently.
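
Per-stage retry can be a few lines of shell. This is a generic sketch with exponential backoff; the `retry` helper is illustrative and not an existing Kingdom utility.

```bash
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted,
# doubling the delay between tries.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    n=$((n + 1))
  done
}
```

A `news-fetch` bot might call something like `retry 3 5 fetch_news || use_cached_brief`, where both function names are hypothetical placeholders for the Tavily call and a fallback.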

---

### 1.5 Token Sentinel + Kingdom Super API

**Fragility:** Two long-running Flask apps. Sentinel miners are polling daemons that can go stale silently.

**Tiny bot fix:** Miners = cron-fired bots that wake, scrape one source, INSERT one row, exit. Super API = thin read-only HTTP layer over SQLite. Writes go through file drops → ingestion bots.

---

## PART 2: Creative Tiny Bot Ideas (Ordered by Build Time)

| Bot | Trigger | Single Job | Build Time |
|-----|---------|-----------|------------|
| **BOT-01: Dead Silence Bot** | Cron daily 09:00 | Check SCRYER daily-synthesis timestamp. If >36h old, RAVEN FORGE_CLAUDE URGENT. | **15 min** |
| **BOT-02: Spend Spike Bot** | Cron every 30 min | Query Sentinel DB for last 2hr spend. If >$8, POST Super API + RAVEN. | **20 min** |
| **BOT-03: Mission Stuck Bot** | Cron every 6hr | Query overmind.db for missions where last_run >24h AND status=active. RAVEN per stuck mission. | **20 min** |
| **BOT-04: Screenshot Describer** | FSEvents on Goldfish folder | Run moondream:1.8b on each new PNG → TIMESTREAM screen_descriptions.jsonl | **30 min** |
| **BOT-05: Mailbox Health Bot** | Cron hourly | Count buffer/ files >30min old. If count >0, RAVEN alert. | **20 min** |
| **BOT-06: Goldfish Replay Bot** | Cron daily midnight | Find captures with no Herald narration. Re-queue for narration. | **30 min** |
| **BOT-07: Aeris Heartbeat Monitor** | Cron every 5 min | Check AExGO_activity.json timestamp. If >120s stale + 08-22h window + opencode not running → RAVEN. | **25 min** |
| **BOT-08: Token Budget Bot** | Cron daily 08:00 | Compare yesterday spend to 7d avg. If >120%, Console NEWS brief. | **20 min** |
| **BOT-09: Git Drift Bot** | Cron every 6hr | Count untracked+modified in THE_FORGE. If >50, RAVEN with list. | **10 min** |
| **BOT-10: SCRYER Stream Gap Detector** | Cron daily 10:00 | Check all 23 stream timestamps. Any >48h old → Console NEWS gap report. | **20 min** |

**Highest leverage:** BOT-01 + BOT-05 catch the two most common Kingdom failure modes (silent SCRYER death, dead-letter accumulation). Both buildable in 35 minutes total.
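
Both checks reduce to "how old is this file?", which `find -mmin` answers on macOS and Linux alike. The following is a minimal sketch, assuming the SCRYER timestamp can be read from a file's modification time; the function names are illustrative and the real paths are left to the caller.

```bash
# Succeeds if FILE is missing or was last modified more than MINUTES ago.
is_stale() {
  local file="$1" minutes="$2"
  [ -e "$file" ] || return 0
  [ -n "$(find "$file" -mmin +"$minutes" 2>/dev/null)" ]
}

# Prints how many regular files under DIR (recursively) are older than MINUTES.
count_old_files() {
  local dir="$1" minutes="$2"
  find "$dir" -type f -mmin +"$minutes" 2>/dev/null | wc -l | tr -d ' '
}
```

BOT-01 would call `is_stale` with 2160 minutes (36 hours) and BOT-05 would alert when `count_old_files` with 30 minutes prints anything other than 0.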

---

## PART 3: Local Models + Specialty Bots Research

### Currently Installed in the Kingdom

```
moondream:1.8b    — 1.7GB  (INSTALLED — use for fast Goldfish pass)
gemma3:4b         — 3.3GB  (INSTALLED — underutilized, use for SCRYER summaries)
nomic-embed-text  — 274MB  (INSTALLED — embeddings for semantic search)
llava:latest      — 4.7GB  (REPLACE with moondream or qwen2.5vl:3b)
llama3:latest     — 4.7GB  (general — underutilized)
```

### Model Recommendations

| Model | Size | Use Case | Action |
|-------|------|---------|--------|
| `moondream:1.8b` | 1.7GB | Fast Goldfish screen labels (5min cadence) | Already installed |
| `gemma3:4b` | 3.3GB | SCRYER stream summarization, mission output analysis | Already installed — START USING |
| `qwen2.5vl:3b` | ~2.5GB | High-quality Herald narration (30min cadence) | **`ollama pull qwen2.5vl:3b`** |
| `smollm2:360m` | 280MB | RAVEN priority classifier, routing decisions | **`ollama pull smollm2:360m`** |
| `phi4-mini` | ~2.5GB | Long-context stream summarizer (128K window) | **`ollama pull phi4-mini`** |
| `gemma3:1b` | 815MB | Fast yes/no classification, fine-tuning base | Pull when doing fine-tuning |

### Vision Models — Goldfish

**Current:** `llava:latest` — 4.7GB, slow, mediocre
**Immediate fix:** `moondream:1.8b` — already installed, sub-second, good for screen labels
**Quality fix:** `qwen2.5vl:3b` — best accuracy/size ratio for open-source vision in 2026, outperforms LLaVA and Moondream2 on captioning benchmarks

**Two-tier Goldfish pipeline:**
- Fast pass: `moondream:1.8b` every 5 min → TIMESTREAM screen label
- Quality pass: `qwen2.5vl:3b` every 30 min → full Herald narration

### Text Routing — RAVEN Priority Classifier

**Recommendation: `smollm2:360m`** — 280MB, ~500MB RAM, sub-second on Apple Silicon. Pass message SUBJECT + first 200 chars to `smollm2:360m` with classification prompt: `"Classify: URGENT/IMPORTANT/NORMAL/LOW"`. Override/confirm declared PRIORITY field before ingestion.
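
A classifier like this should fail safe: if the model is unavailable or answers off-script, the declared priority is kept. The sketch below assumes a wrapper function `classify_cmd` around `ollama run`, which also makes it possible to swap in a stub; the function names and the exact prompt layout are illustrative.

```bash
# Assumed wrapper around the local model call.
classify_cmd() { ollama run smollm2:360m "$1"; }

# classify_priority SUBJECT BODY DECLARED_PRIORITY
classify_priority() {
  local subject="$1" body="$2" declared="$3" snippet raw label
  # First 200 bytes of the body (equals 200 chars for ASCII text).
  snippet="$(printf '%s' "$body" | head -c 200)"
  raw="$(classify_cmd "Classify: URGENT/IMPORTANT/NORMAL/LOW
SUBJECT: $subject
$snippet" 2>/dev/null || true)"
  # Take the first whole-word label in the reply; small models often add extra text.
  label="$(printf '%s' "$raw" | tr '[:lower:]' '[:upper:]' \
    | grep -owE 'URGENT|IMPORTANT|NORMAL|LOW' | head -n 1 || true)"
  # Fall back to the declared value when no label was recognised.
  printf '%s\n' "${label:-$declared}"
}
```

Matching whole words avoids reading "UNIMPORTANT" as IMPORTANT or "follow" as LOW.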

### Summarization — SCRYER Local

**gemma3:4b already installed.** SCRYER summaries are 500-2000 token inputs → 100-300 token outputs. Local inference: 3-8s per stream. Moving the current 92 Gemini API calls/day to local inference cuts their API cost to $0.

**For long streams:** `phi4-mini` — 128K context, fits entire day's stream data in one call. 2-4x faster than full Phi-4.

### Fine-Tuning via MLX — Kingdom Router

**Target:** Fine-tune `gemma3:1b` (815MB) on ~500 labeled examples: message text → routing decision.
**Tool:** MLX + `mlx_lm.lora`. Runs natively on Apple Silicon.
**Training time:** 1000 LoRA iterations on M2 Pro 16GB ≈ 8-15 minutes.
**GUI option:** M-Courtyard — macOS app, no code required.

```bash
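# Install the MLX LoRA tooling
pip install mlx-lm

# --data must point to a directory containing train.jsonl and valid.jsonl.
# The model repo name is as given in this audit; confirm it exists on
# Hugging Face before running.
python -m mlx_lm.lora \
  --model mlx-community/gemma-3-1b-instruct-4bit \
  --train \
  --data ./kingdom_routing_examples/ \
  --num-layers 4 \
  --iters 1000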
```

---

## PART 4: Standard Bot Pattern — Kingdom Native

### Bot Anatomy (copy-paste template)

```bash
#!/bin/bash
# bot-name.sh — One sentence. One job.
# Trigger: [cron | QueueDirectories | WatchPaths | HTTP]
# Input:   [file | DB query | env]
# Output:  [file drop | SQLite INSERT | HTTP POST | RAVEN envelope]
set -euo pipefail

BOT_NAME="bot-name"
LOG="$HOME/Desktop/THE_FORGE/FORGE_CLAUDE/logs/bots/${BOT_NAME}.log"
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] [$BOT_NAME] $*" >> "$LOG"; }

# Auto-report failures to FORGE. The trap captures the real exit status,
# so exits triggered by `set -e` are reported too. `printf --` is required
# because the format string starts with dashes.
trap 'BOT_RESULT=$?
if [ "$BOT_RESULT" -ne 0 ]; then
  printf -- "---\nTO: FORGE_CLAUDE\nFROM: %s\nSUBJECT: Bot failed (exit %s)\nPRIORITY: IMPORTANT\n---\n\nBot %s exited %s at %s\n" \
    "$BOT_NAME" "$BOT_RESULT" "$BOT_NAME" "$BOT_RESULT" "$(date)" \
    > "$HOME/Desktop/THE_FORGE/@FORGE_CLAUDE_MAILBOX/buffer/${BOT_NAME}_FAIL_$(date +%Y%m%d-%H%M%S).md"
fi' EXIT

# 1. Read input
# 2. One transformation
# 3. Write output
# 4. Log to TIMESTREAM

log "done"
exit 0
```

### Trigger Types

| Trigger | Mechanism | Use When |
|---------|-----------|---------|
| Time schedule | launchd `StartInterval` / `StartCalendarInterval` | Regular polling |
| File queue | launchd `QueueDirectories` | Pipeline stages (job runs while the directory is non-empty) |
| Dir change | launchd `WatchPaths` | RAVEN-style ingest |
| DB change | Polling bot checking timestamp column | Notify, spend spike |
| Manual | `bash bot-name.sh` | Debug, one-off |

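**Note:** launchd's `QueueDirectories` key (plural, an array of paths) is underused in the Kingdom. It starts the job whenever a listed directory is non-empty and keeps relaunching it until the directory is empty again. launchd does not pass a file path as an argument and does not track individual files, so each bot must enumerate the directory itself and move or delete every file it has handled. With that discipline, it is the foundation of the file-drop pipeline.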

### Bot Chain Pattern

```
[Trigger: cron]
      ↓
stage-1.sh  → /tmp/kingdom_bots/stage1_{ts}.json   (QueueDirectories watches)
      ↓
stage-2.sh  → /tmp/kingdom_bots/stage2_{ts}.md     (QueueDirectories watches)
      ↓
stage-3.sh  → final output (RAVEN / Console NEWS / TIMESTREAM)
```

Each arrow is a launchd `QueueDirectories` job. Any stage retries independently. The pipeline is inspectable at `/tmp/kingdom_bots/`.
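
A stage started by a launchd `QueueDirectories` job receives no file path and is relaunched for as long as the watched directory is non-empty, so it has to list the queue itself and leave it empty. The sketch below shows one way to do that; the `done` and `failed` directories and the handler argument are assumptions, and both target directories must live outside the queue.

```bash
# drain_queue QUEUE_DIR DONE_DIR FAILED_DIR HANDLER
# Runs HANDLER on each file in the queue, then moves the file out
# so the queue ends up empty whether or not the handler succeeded.
drain_queue() {
  local queue="$1" done_dir="$2" failed_dir="$3" handler="$4" f
  mkdir -p "$done_dir" "$failed_dir"
  for f in "$queue"/*; do
    [ -f "$f" ] || continue
    if "$handler" "$f"; then
      mv "$f" "$done_dir/"
    else
      # Park failures; dropping the file back into the queue retries it.
      mv "$f" "$failed_dir/"
    fi
  done
}
```

Parking failures instead of leaving them in place prevents launchd from relaunching the stage in a tight loop on a file that can never succeed.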

### Kingdom-Native Checklist

A bot is Kingdom-native when:
1. **Logs to TIMESTREAM** — one INSERT per run
2. **Reports failures to FORGE** — trap handler drops RAVEN dead-letter on non-zero exit
3. **Uses the routing map** — FORGE_CLAUDE=infrastructure, AExGO=missions, AExMUSE=creative
4. **Respects Highlander Protocol** — one output file per domain, no accumulation

---

## Summary: Immediate Actions

| Priority | Action | Time |
|----------|--------|------|
| P0 | `ollama pull smollm2:360m` | 5 min |
| P0 | Build BOT-01 Dead Silence Bot (SCRYER watchdog) | 15 min |
| P0 | Build BOT-05 Mailbox Health Bot | 20 min |
| P1 | `ollama pull qwen2.5vl:3b` (replace llava in Goldfish) | 10 min |
| P1 | Build BOT-02 Spend Spike Bot | 20 min |
| P1 | Add failure trap to all new bots | 5 min/bot |
| P2 | `ollama pull phi4-mini` (local SCRYER summarization) | 10 min |
| P2 | Decompose pulse.sh into per-mission dispatch bots | 1 session |
| P3 | Fine-tune gemma3:1b Kingdom Router via MLX | 1 session |

**The single insight:** The Kingdom has all the data needed to detect every failure it's currently missing. It needs tiny watchdog bots that look at the data that already exists and scream when it goes wrong.

---

*Drone: FORGE_CLAUDE Sonnet | Session: 165 | Soulforge RESEARCH (background)*