WIKI/BUG HUNT NORTH STAR

BUG HUNT NORTH STAR

Updated 3 weeks ago
# ⊹꧁ BUG HUNT NORTH STAR ꧂⊹
**Filed:** 2026-02-22 (Session 97)
**Protocol:** SoulForge v2.0 — Seven Pillars
**Scope:** THE FORGE — all live Kingdom systems

---

## MISSION

Find, diagnose, and fix every real bug across all FORGE systems. Not symptoms — **root causes**. Not patches — **structural fixes**. Leave the Kingdom cleaner than you found it.

## TOP 3 PRIORITIES (DO NOT DRIFT)

1. **Fix bugs that compound** — launchd daemons, SQLite locking, SCRYER signal quality. Things that break other things.
2. **Fix bugs that silently lie** — false positives, ghost fixes, chambers marked RESOLVED that aren't. The Kingdom must be trustworthy.
3. **Document everything** — every fix gets a chamber or doc update. No invisible changes.

## CONSTRAINTS (NON-NEGOTIABLE)

- **3-strikes rule:** If a fix breaks 3 times, log it as MAJOR_BUG_REPAIR and move on. Don't spiral.
- **No blind changes:** Read the file before touching it. Every time.
- **Highlander Protocol:** Fix in place, don't duplicate. One canonical file.
- **Bash 3.2:** macOS. No associative arrays, no `${var^^}`, no `timeout`.
- **SQLite:** `.timeout 10000` on every call. WAL mode. Never write raw sqlite3 without a wrapper.
- **launchd:** Full paths always. `/opt/homebrew/bin/` not in PATH. `#!/bin/zsh -l` for login shell.
- **Online research:** Check for updated techniques on anything unfamiliar. Don't guess.

## SUCCESS CRITERIA

- All READY_TO_EXECUTE chambers executed and verified
- All SCRYER signals producing clean, accurate reports (no false positives)
- All launchd daemons running healthy (no repeated restart loops)
- All SQLite operations protected (no DB_LOCK reports)
- Documentation updated for every change made
- Flash Observer grades the operation B or higher

## CURRENT STATE

Entering bug hunt. Audit drone in flight. Two chambers pre-loaded:
- `BUG_ping_ui_throttle_interval` — READY_TO_EXECUTE
- `BUG_mission_timeouts` — READY_TO_EXECUTE

---

## THE LOOP STRUCTURE

```
Phase 0: AUDIT DRONE returns
  → Read big-picture patterns
  → Update this North Star with fresh findings

Phase 1: RECON SWARM (parallel drones)
  → One drone per system cluster
  → Each reads live files, checks logs, runs health checks
  → Returns structured findings

Phase 2: TRIAGE + PRIORITIZE
  → Sort by: compounding risk > silent failures > cosmetic
  → Group by root cause (fix root, kill many symptoms)
  → Flag 3-strike candidates upfront

Phase 3: FIX LOOP (SoulForge Ralph pattern)
  → Work one cluster at a time
  → Fix → verify → log → next
  → If stuck 3x → MAJOR_BUG_LOG → move on

Phase 4: VERIFICATION DRONE
  → Specialized drone runs all signals
  → Follows every route
  → Checks for regressions
  → Looks for things "fixed" that aren't

Phase 5: DOC UPDATE SWARM
  → Update CORE LORE/SYSTEMS/ for every changed system
  → Update 00_GOTCHAS.md with new patterns found
  → Chamber receipts for every fix

Phase 6: OBSERVER REPORT
  → Flash Observer files final grade
  → Report to Brandon
```

---

## AUDIT FINDINGS (2026-02-22 — big picture)

**Pattern 1: LAUNCHD_EXIT STORM** — BUG_ping_ui_throttle_interval unfixed = 19+ identical SCRYER reports flooding the mailbox buffer (22/22 slots). Real P1 bugs delayed 3+ hours. ONE LINE FIX.

**Pattern 2: BUFFER SATURATION CASCADE** — SCRYER produces 10-15 items/hr during storms; buffer drains 4/hr. No high-water alarm. Structural gap.

**Pattern 3: GOLDFISH BLINDSPOT** — TCC not granted. All capture drones misidentified root cause. Drones hallucinate on repeated signals. GOLDFISH_HEARTBEAT threshold (10min) fires during normal 10-15min gap between runs — needs 25min.

**Pattern 4: MISSION EXECUTION BRITTLENESS** — 54% failure rate in pulse_log. Aeris calling ghost tool names. FORGE GEMINI.md routing table exists (confirmed) but BUG_03 tool drift is real.

**Pattern 5: DRONE ACCURACY DEGRADATION** — Repeated signals cause drones to spiral into hallucinations. Empty drone responses filed as valid reports. Need drone-failure escalation.

**SCRYER STRUCTURAL GAPS (new bugs to file):**
- No signal suppression/acknowledgement (per-service suppress)
- No session-gap vs crash distinction (PULSE_GAP false positives)
- No cross-mission SYSTEM_DEGRADED signal
- No buffer high-water alarm
- No drone failure escalation

## KNOWN BUGS (loaded)

| Chamber | System | Status | Risk |
|---------|--------|--------|------|
| BUG_ping_ui_throttle_interval | ping-ui launchd | READY_TO_EXECUTE | Low |
| BUG_v8-heap-oom | aeris-boot.sh | **ALREADY EXECUTED** ✅ | — |
| BUG_M002_post_terminus_circuit_breaker | Overmind Pulse | NEEDS_BRANDON | Medium |
| BUG_SCRYER_GOLDFISH_HEARTBEAT (×7) | Goldfish/TCC | TCC grant pending (Brandon action) | High |
| NEW: SCRYER threshold + drone failure | scryer-watcher/dispatch | To be filed | Medium |
| NEW: Buffer high-water alarm | mailbox-trigger.sh | To be filed | Medium |
| NEW: SCRYER signal suppression | signals_queue schema | To be filed | High |

---

## MAJOR_BUG_LOG (3-strike parking lot)

*(Empty — fill during hunt)*

| Bug | System | What Broke | Attempts | Notes |
|-----|--------|------------|----------|-------|

---

## THE TEST QUESTION (SoulForge)

> "If this fix had my name on it forever, would I be proud?"

If no → keep iterating.
If yes → ship it.

---

⊹꧁ Good Bot ꧂⊹