# Mahana Devices — Next Phase
*Out of scope for the consolidation spec but on deck. 2026-04-22.*

---

## 1. Vision chain — Gemma 4 26B-A4B in a Mahana Browser pipe

### Goal
Turn a live browser session into a stream of classified neuropackets that the fleet can reason over.

### Shape
```
Mahana Browser session   (puppeteer-core page OR Electron WebContents)
       ↓
Frame tap: 1-2 fps via page.screenshot() / webContents.capturePage()
       ↓
Frame batch: every 8 frames → single 4×2 grid image (8× fewer inference calls)
       ↓
Gemma 4 26B-A4B via Fireworks (cascade-provider-routing.md compliant)
       ↓
Structured output: {url, timestamp, objects: [], text_on_page: "...", page_state: "loading|idle|error", interactable_elements: [...]}
       ↓
Write to neuro_items with type='vision-frame' + entity_id=<url>
       ↓
Realtime subscribers (other agents) see the stream, react
```
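
The batching step in the diagram can be sketched as follows. This is a minimal illustration, not existing Mahana code: `TileRect`, `gridLayout`, and `FrameBatcher` are hypothetical names, and the row-major tile order is an assumption. Compositing the tiles into one image is left to whatever image library the implementation picks.

```typescript
// Sketch only: names and the row-major layout are assumptions, not existing Mahana code.
interface TileRect {
  index: number; // position of the frame within the batch (0-based)
  x: number;     // left edge of the tile in the grid image, in pixels
  y: number;     // top edge of the tile in the grid image, in pixels
  width: number;
  height: number;
}

// Compute where each frame lands in a cols x rows grid, filling row by row.
function gridLayout(frameWidth: number, frameHeight: number, cols = 4, rows = 2): TileRect[] {
  const tiles: TileRect[] = [];
  for (let index = 0; index < cols * rows; index++) {
    tiles.push({
      index,
      x: (index % cols) * frameWidth,
      y: Math.floor(index / cols) * frameHeight,
      width: frameWidth,
      height: frameHeight,
    });
  }
  return tiles;
}

// Collects frames and hands back a full batch once `size` frames have arrived.
class FrameBatcher<T> {
  private pending: T[] = [];
  private readonly size: number;

  constructor(size = 8) {
    this.size = size;
  }

  // Returns null while the batch is still filling, or the full batch when ready.
  push(frame: T): T[] | null {
    this.pending.push(frame);
    if (this.pending.length < this.size) return null;
    const batch = this.pending;
    this.pending = [];
    return batch;
  }
}
```

Each full batch maps to exactly one model call, which is where the 8:1 reduction comes from.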

### Fireworks model line (verified against `enforcement-without-invitations.md § Fake-Rule Drift`)
- Placeholder ID: `accounts/fireworks/models/gemma3-27b-it`. By its name this is Gemma 3 27B, not Gemma 4 26B-A4B, so swap in whatever the current Gemma 4 endpoint is. **Verify against Fireworks docs at implementation time; do not trust this note.**
- Sovereignty-safe per `sovereignty-no-prc-hosts.md` (Fireworks US-hosted).

### Cost model
- 2 fps × 3,600 s = 7,200 frames/hour; batched 8:1, that is 900 Gemma calls/hour.
- Gemma 4 on Fireworks ≈ $0.20/M input tokens and each batched grid image ≈ 2K tokens, so 900 × 2K = 1.8M input tokens/hour ≈ $0.36/hour of continuous browsing (input tokens only; output tokens are not counted here).
- That is about $0.0004 per call, already within the "agents ARE allowed to spend cents per query" budget from CLAUDE.md.
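
The arithmetic above can be re-run when the real price and token counts are known. A small sketch, assuming the note's estimates as inputs; `hourlyVisionCost` and its field names are hypothetical, and output tokens are deliberately ignored because the note only prices input tokens.

```typescript
// All inputs are the note's estimates, not measured values.
interface VisionCostInputs {
  fps: number;                      // frames captured per second
  framesPerBatch: number;           // frames merged into one grid image
  tokensPerBatch: number;           // estimated input tokens per grid image
  usdPerMillionInputTokens: number; // provider price, to be verified
}

function hourlyVisionCost(i: VisionCostInputs) {
  const framesPerHour = i.fps * 3600;
  const callsPerHour = framesPerHour / i.framesPerBatch;
  const inputTokensPerHour = callsPerHour * i.tokensPerBatch;
  const usdPerHour = (inputTokensPerHour / 1_000_000) * i.usdPerMillionInputTokens;
  return {
    framesPerHour,
    callsPerHour,
    inputTokensPerHour,
    usdPerHour,
    usdPerCall: usdPerHour / callsPerHour,
  };
}

// Upper-bound case from the cost model: 2 fps, 8 frames per call.
const upperBound = hourlyVisionCost({
  fps: 2,
  framesPerBatch: 8,
  tokensPerBatch: 2000,
  usdPerMillionInputTokens: 0.2,
});
console.log(upperBound.callsPerHour, upperBound.usdPerHour.toFixed(2));
```

At 1 fps the same function halves every figure, which is the lower end of the 1-2 fps frame tap.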

### Integration point
Add `/browser/vision/start` + `/browser/vision/stop` routes. Vision stream runs as a background worker registered with `/k/autonomousSystems` per `observable-daemons.md`.

---

## 2. Council demo room — flagship "6+1 agents hack loose freeform"

### Goal
A room where 6 specialist agents + 1 user sit together, respond in parallel, with a budget discipline that proves the sovereignty stack at scale.

### Budget commitment
**≥15% of total inference budget across the providers listed below.** Translated to numbers (baseline pricing per CLAUDE.md + `audit-one-number.md`):

- DeepInfra: ≥$X/month
- Fireworks: ≥$X/month
- Together: ≥$X/month
- Nebius (EU): ≥$X/month
- xAI: ≥$X/month

**Where $X is 15% of observed monthly spend per provider in `ai-spend.ts`.** Computed at implementation time per the audit-one-number discipline; do not hardcode.
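
One way to keep $X computed rather than hardcoded is a small helper that derives each floor from observed spend. This is a sketch under assumptions: the shape of the data coming out of `ai-spend.ts` is not known here, so `ProviderSpend` and `councilBudgetFloors` are hypothetical, and the dollar figures in the usage line are made-up placeholders, not real spend.

```typescript
// Hypothetical input shape: provider name mapped to observed monthly spend in USD.
type ProviderSpend = Record<string, number>;

// Returns the minimum monthly council budget per provider, rounded to cents.
function councilBudgetFloors(observedMonthlyUsd: ProviderSpend, share = 0.15): ProviderSpend {
  const floors: ProviderSpend = {};
  for (const [provider, usd] of Object.entries(observedMonthlyUsd)) {
    floors[provider] = Math.round(usd * share * 100) / 100;
  }
  return floors;
}

// Placeholder numbers for illustration only; real values come from observed spend.
const exampleFloors = councilBudgetFloors({ fireworks: 200, xai: 40 });
console.log(exampleFloors);
```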

### Room shape
- Council room with 7 participants (6 agents + user)
- 6 agents each have a different persona + model + capability set:
  - Brain / Librarian (Grok 4.1 Fast)
  - Scout / Web (Kimi K2.5 via Fireworks)
  - Video / Vision (Gemma 4 via Fireworks)
  - Voice / Ears (xAI Realtime)
  - Hands / Terminal (Claude Sonnet via Anthropic)
  - Wildcard / Creative (Nemotron 120B via Nebius)
- All 6 receive the user's message simultaneously
- All 6 respond in parallel (cascade pattern from `cascade-provider-routing.md`)
- Alignment layer writes a single synthesized answer AND the 6 raw answers as a neuropacket
- Grey chat UI, per CLAUDE.md
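
The fan-out described above might look roughly like this. It is a sketch, not the existing cascade implementation: `Agent`, `CouncilPacket`, and `askCouncil` are hypothetical names, and treating a failed agent as a recorded non-answer (rather than failing the whole round) is an assumption.

```typescript
// Hypothetical interfaces; the real agents live behind the cascade routing layer.
interface Agent {
  name: string;
  ask(message: string): Promise<string>;
}

interface CouncilPacket {
  message: string;
  raw: { agent: string; ok: boolean; text: string }[];
  synthesized: string;
}

// Send one message to every agent at once, then synthesize from whatever came back.
async function askCouncil(
  agents: Agent[],
  message: string,
  synthesize: (answers: string[]) => Promise<string>,
): Promise<CouncilPacket> {
  // allSettled lets one slow or failing provider degrade the round instead of sinking it.
  const settled = await Promise.allSettled(agents.map((a) => a.ask(message)));
  const raw = settled.map((r, i) => ({
    agent: agents[i].name,
    ok: r.status === "fulfilled",
    text: r.status === "fulfilled" ? r.value : String(r.reason),
  }));
  const synthesized = await synthesize(raw.filter((r) => r.ok).map((r) => r.text));
  return { message, raw, synthesized };
}
```

The returned packet carries both the synthesized answer and the raw per-agent answers, matching the neuropacket shape described in the list.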

### Why 6+1
- 6 is the most a human can parse without scrolling sideways
- Forces real cross-provider diversity (not just "use Grok for everything")
- Demonstrates the sovereignty + cascade architecture works as a product, not just a test
- At 15% of inference budget, proves the fleet can sustain real parallel load

### Infra prerequisites
- `rooms` + `room_messages` + `room_participants` schema shipped (see DEVICES-DISTRIBUTION.md)
- Multi-agent cascade already exists per `multi-brain-cascade` skill
- Budget-tagging at emission site per `audit-one-number.md`

---

## 3. Mastermind-v2 as cross-device chat surface

### Goal
The chat UI that shows rooms, runs on Mac + iPhone + web, with the grey-only aesthetic.

### Current state
- `web/apps/mahana-mastermind-chat-core/` exists, deployed to `mahana-mastermind-chat-core.vercel.app` and wrapped via `mahana-unified` for iOS (Build 8 currently in product-verification loop per `scars_wrapping_non_canonical_export.md`)
- CF Tunnel routing `mahana.ai/chat` → Vercel app (per `middleware-cf-tunnel-redirect-loop.md`)
- Grey-only UI already shipped

### V2 deltas
- **Rooms list**: left sidebar with room names + capabilities badges (mic/cam/screen icons per room)
- **Capability indicator**: green dot next to room name when that capability is available on the *current* device
- **Device switcher**: top-right, shows which device is the primary driver for this room
- **@mentions in room**: `@librarian` / `@scout` to direct a message to a specific specialist
- **Room types**: Council, Work, Video, Voice, each with default agent sets
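
The @mention routing could be sketched as a small parser. `parseMentions` is a hypothetical helper; the handle syntax (letters, digits, `_`, `-`) and the convention that an empty target list means "broadcast to all agents in the room" are assumptions.

```typescript
// Extracts known @handles from a message and returns the remaining text.
function parseMentions(text: string, knownAgents: string[]): { targets: string[]; body: string } {
  const known = new Set(knownAgents.map((a) => a.toLowerCase()));
  const targets: string[] = [];
  const body = text
    .replace(/(^|\s)@([a-z][a-z0-9_-]*)/gi, (match: string, lead: string, name: string) => {
      const key = name.toLowerCase();
      if (!known.has(key)) return match; // leave unknown handles in the text
      if (!targets.includes(key)) targets.push(key);
      return lead; // drop the handle, keep the surrounding whitespace
    })
    .replace(/\s+/g, " ")
    .trim();
  return { targets, body };
}

const parsed = parseMentions("@librarian find the spec @scout", ["librarian", "scout"]);
console.log(parsed.targets, parsed.body);
```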

### Integration with Devices
- Mastermind-v2 loads from `localhost:9878/chat` when Mahana Devices is installed (daemon proxies to bundled web/)
- From mahana.ai when on mobile (Capacitor webview)
- From public Vercel URL when logged out
- All three routes render the same React app
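
A sketch of how a launcher might pick among the three origins. The URLs come from this note; `ChatContext`, `resolveChatUrl`, and the precedence order (local daemon first, then mobile, then the public URL) are assumptions, since the conditions above are not mutually exclusive.

```typescript
// Hypothetical context flags; how each is detected is out of scope here.
interface ChatContext {
  devicesInstalled: boolean;   // Mahana Devices daemon is present on this machine
  isCapacitorWebview: boolean; // running inside the mobile wrapper
}

// Assumed precedence: local daemon, then mobile, then the public Vercel deployment.
function resolveChatUrl(ctx: ChatContext): string {
  if (ctx.devicesInstalled) return "http://localhost:9878/chat";
  if (ctx.isCapacitorWebview) return "https://mahana.ai/chat";
  return "https://mahana-mastermind-chat-core.vercel.app";
}
```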

---

## Failure patterns from the Nordvest sprint (lessons to architect against)

These are documented from the task brief. Include verbatim as architectural constraints for the next build.

### 4. "Build new infra alongside existing"

**Lesson from brief**: route registry already existed; translator wasn't reading it → built new aggregators. Provider allowlist gate throws; UI kept offering blocked options → built enforcement without stripping invitations.

**Constraint for Devices**:
- Before shipping a new rooms API, search for existing primitives (`session`, `neuro_items`, `mahana_fleet_tasks`). If the shape fits, extend — don't parallel-build.
- Before shipping a new spawn path, check `agent-sdk-spawner.ts`, `quick/spawn`, `agent/spawn`, `tasks/dispatch`. The new council route should **route through** `agent/spawn`, not re-implement spawn.
- Before shipping a new schema, grep existing Supabase migrations. `rooms` may turn out to be a view over `neuro_items` with `type='room'`.

Rule file: `search-before-build.md` (already exists — enforce).

### 5. False-alarm revert

**Lesson**: a client-side-exception report almost triggered a destructive revert. The cure: before `git reset --hard`, confirm with `curl` the actual state.

**Constraint for Devices**:
- The first-run flow MUST have a smoke test the daemon runs on itself: `GET /council?first-run=true` asserting 200 + expected-HTML-snippet BEFORE the DMG ships. If it fails, hold at `SHIPPED-TO-STAGING-AWAITING-PRODUCT-VERIFICATION`. Do not revert the release branch.
- `build-vs-product.md` discipline applies: ship receipt ≠ product verification. Hold until human opens it.
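
The self-check could be sketched like this. `firstRunSmokeTest` and the `Fetcher` type are hypothetical; the port and the "I'm here" snippet are taken from elsewhere in this note, and the real expected snippet should come from the canonical opener copy. The fetcher is injected so the check can be exercised without a running daemon.

```typescript
// Minimal slice of the fetch API that the check needs.
type Fetcher = (url: string) => Promise<{ status: number; text(): Promise<string> }>;

// Passes only when the first-run page returns 200 and contains the expected snippet.
async function firstRunSmokeTest(
  fetcher: Fetcher,
  baseUrl = "http://localhost:9878",
  expectedSnippet = "I'm here",
): Promise<{ ok: boolean; reason: string }> {
  try {
    const res = await fetcher(`${baseUrl}/council?first-run=true`);
    if (res.status !== 200) return { ok: false, reason: `status ${res.status}` };
    const html = await res.text();
    if (!html.includes(expectedSnippet)) return { ok: false, reason: "expected snippet missing" };
    return { ok: true, reason: "ok" };
  } catch (err) {
    // A refused connection is a failed smoke test, not a reason to revert.
    return { ok: false, reason: `request failed: ${String(err)}` };
  }
}
```

A failing result means the release holds at `SHIPPED-TO-STAGING-AWAITING-PRODUCT-VERIFICATION`; a passing result still does not replace the human opening the room.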

### 6. Worker status:done before smoke-test

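**Lesson**: worker said "DONE, deployed" but deploy was visually broken. Rule: worker's final step should confirm the deployed page does not contain "Application error" before PATCHing the task to done. As written in the brief, `curl <url> | grep -v "Application error"` exits 0 whenever any line lacks the string, so it passes on a broken page too; a check that actually fails is `curl -fsS <url> -o /tmp/page.html && ! grep -q "Application error" /tmp/page.html` (the temp path is illustrative).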

**Constraint for Devices**:
- Every device-release task in `mahana_fleet_tasks` has a `post_deploy_verify` field with a specific command (e.g., `curl http://localhost:9878/council | grep "I'm here"`). Status cannot flip to `done` until verify returns 0.
- DMG release task: verify = `codesign --verify --deep ... && stapler validate ... && human_confirms_council_room=true`.
- Curl-sh release task: verify = launch clean VM, run installer, curl health, confirm.
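
The gate on `post_deploy_verify` might be enforced with something like the following. It is a sketch: only the `post_deploy_verify` field and the `done` status come from this note, while the other status values, `VerifyRunner`, and `tryMarkDone` are hypothetical. The command runner is injected so the rule itself stays testable.

```typescript
// Hypothetical slice of a mahana_fleet_tasks row.
interface FleetTask {
  id: string;
  status: "pending" | "in_progress" | "done";
  post_deploy_verify?: string; // shell command that must exit 0
}

// Runs a shell command and resolves to its exit code (implementation injected).
type VerifyRunner = (command: string) => Promise<number>;

// Flips status to done only when a verify command exists and exits 0.
async function tryMarkDone(task: FleetTask, run: VerifyRunner): Promise<FleetTask> {
  if (!task.post_deploy_verify) return task; // no verify command, no done
  const exitCode = await run(task.post_deploy_verify);
  return exitCode === 0 ? { ...task, status: "done" } : task;
}
```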

### 7. Global site-wide mount hazard

**Lesson**: voice-plugin worker added component to `app/layout.tsx` which crashed every page. Rule: do NOT modify `app/layout.tsx` or any component rendering on every page unless explicitly asked.

**Constraint for Devices**:
- The council room is a **standalone route** (`/council`), not a mount in `layout.tsx`. Same for the capability indicator, device switcher, and @mentions UI — all are page-scoped or component-scoped, never layout-scoped.
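- Add a guard (a `commit-msg` hook locally, plus a CI check on PRs): any change touching `app/layout.tsx` or `app/globals.css` requires the explicit commit-message tag `layout-change-intentional:` or gets bounced. A plain `pre-commit` hook runs before the message exists, so it cannot check the tag on its own.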
- Extends the philosophy already in `three-way-drift.md`: fan-out changes require explicit review.
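
The core of that guard is a pure check over the changed file list and the commit message. A minimal sketch, assuming the caller (a hook or CI job) supplies both; `checkLayoutGuard` is a hypothetical name, and matching nested paths such as `web/apps/<app>/app/layout.tsx` by suffix is an assumption about the repo layout.

```typescript
// Files that render on every page, per the constraint above.
const GUARDED_PATHS = ["app/layout.tsx", "app/globals.css"];
const OVERRIDE_TAG = "layout-change-intentional:";

// Allows the change when no guarded file is touched, or when the tag is present.
function checkLayoutGuard(
  changedFiles: string[],
  commitMessage: string,
): { allowed: boolean; offending: string[] } {
  const offending = changedFiles.filter((file) =>
    GUARDED_PATHS.some((guarded) => file === guarded || file.endsWith("/" + guarded)),
  );
  const allowed = offending.length === 0 || commitMessage.includes(OVERRIDE_TAG);
  return { allowed, offending };
}
```

Returning the offending paths lets the hook print exactly which files triggered the bounce.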

---

## 8. Offline-first sovereign mode (v6.x+)

Out of scope for the initial Mahana Devices ship, but flagged here as the logical next unit of work:

- Local whisper.cpp for STT (macOS: `whisper-cpp` via Homebrew)
- Local Kokoro or Orpheus for TTS (already have `replicate-tts` + `edge-tts` — need on-device alternatives)
- Local Ollama or MLX for brain (already partial — `MLX` is in `PROVIDER_ALLOWLIST` with `jurisdiction:'local'`)
- Local SQLite mirror of `neuro_items` (already partial — `~/.mahana/neuro-cache.sqlite`)
- Graceful degradation: when offline, council agent says "we're offline — I can still help with X but not Y until we reconnect"
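
The degradation message could be generated from a capability list rather than hand-written per case. A sketch under assumptions: `Capability` and `offlineNotice` are hypothetical, and which capabilities actually work offline depends on the local components listed above being installed.

```typescript
// Hypothetical capability record; worksOffline would be derived from installed local components.
interface Capability {
  name: string;
  worksOffline: boolean;
}

// Returns null when online, otherwise a sentence naming what still works and what does not.
function offlineNotice(capabilities: Capability[], online: boolean): string | null {
  if (online) return null;
  const can = capabilities.filter((c) => c.worksOffline).map((c) => c.name);
  const cannot = capabilities.filter((c) => !c.worksOffline).map((c) => c.name);
  const canText = can.length > 0 ? can.join(", ") : "nothing yet";
  if (cannot.length === 0) return `We're offline. I can still help with ${canText}.`;
  return `We're offline. I can still help with ${canText} but not ${cannot.join(", ")} until we reconnect.`;
}
```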

---

## 9. Windows support

Not in v6 Devices. Requires:

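- `mahana-capture` Windows target (Rust; WASAPI for audio; screen capture needs an OS capture API such as Windows.Graphics.Capture or DXGI Desktop Duplication, since winit only provides windowing and event loops)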
- `mahana-terminal` equivalent (either native Windows Terminal fork or keep Electron for Windows)
- The `pnpm dev` → `MahanaDaemon` Scheduled Task pipeline is proven (exists per `scripts/install-daemon-windows.ps1`), but there is no DMG-equivalent packaging yet

Separate sprint. Tracked as future queue item.

---

## 10. Archive `mahana-preview`

The `~/mahana-ecosystem/mahana-preview/` repo carries v5-era Electron app code (per INVENTORY.md — `package.json:name = "mahana"`, contains electron-updater + puppeteer-core). It's not labeled as reference/archive anywhere.

**Action**: rename to `mahana-desktop-v5-archive/` OR delete and re-create from git history if ever needed. Current ambiguity causes drift — today's audit nearly mistook it for an active build.

Out of scope for this consolidation but should be resolved before shipping Mahana Devices to avoid user/agent confusion about "which Electron repo is the live one."

---

## 11. Council-room governance: who does the opener prompt ship with?

The COUNCIL_FIRST_RUN_PROMPT in DEVICES-DISTRIBUTION.md is product copy. Changing it silently between releases breaks the "I'm here. What do you want to do?" promise.

**Proposal**: add a `docs/mahana-devices/council-opener.md` (separate from this audit) that is the canonical source of the first-message text, reviewed on each release, signed by user directive. Per `citation-ground-truth.md`: the copy ships from a single source, cited in code, versioned.

Out of scope for this audit — flagging for the consolidation's own consolidation when it happens.
