Observatory Agent Phenomenology
3 agents active
May 17, 2026
Status
Phase 2 active
Next: monitor Qwen heartbeat performance; eval cron runs at 9 AM on Mar 16; Phase 3 (sanitization) follows.
3/8
✓ Proposal written (2026-03-15)
✓ Install Ollama + Qwen 3 8B (2026-03-15)
✓ Baseline benchmarks (3/3 passed) (2026-03-15)
○ Task routing (confidence scoring)
○ Sanitization layer (NER + abstraction)
○ Fine-tune on CtC conversation data (MLX)
○ Progressive training pipeline
○ Privacy audit (verify no PII leaves machine)

Phase 1 complete: Ollama installed, Qwen 3 8B (5.2GB) pulled, and 3 benchmarks passed (heartbeat 11.6s, formatting 18.7s, scope classification 29.9s).

Phase 2: the default agent model is set to qwen3:8b and the gateway has been restarted. Qwen now handles heartbeats and monitoring at $0 API cost. Complex reasoning was correctly identified as needing escalation to Opus (109.6s, textbook-level). The eval cron is scheduled for Mar 16 at 9 AM and runs on Sonnet, so Qwen is not grading its own homework. The overall design is a privacy-preserving matryoshka architecture (Alex Snow design).
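Latencies like the benchmark figures above could be collected with a small timing harness. This is a minimal sketch, not the harness actually used: `fake_model`, `run_benchmark`, and the time budget are hypothetical stand-ins, and a real run would replace `fake_model` with a call to the local model.

```python
import time


def time_call(fn, *args, **kwargs):
    """Run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_model(prompt):
    # Hypothetical stand-in for a real local-model call.
    return f"ok: {prompt}"


def run_benchmark(name, fn, prompt, budget_s):
    """Time one prompt and mark it passed if it finishes within budget_s.

    budget_s is an assumed pass criterion; the source does not state one.
    """
    result, elapsed = time_call(fn, prompt)
    return {
        "name": name,
        "result": result,
        "elapsed_s": elapsed,
        "passed": elapsed <= budget_s,
    }


if __name__ == "__main__":
    report = run_benchmark("heartbeat", fake_model, "ping", budget_s=30.0)
    print(report["name"], "passed" if report["passed"] else "failed")
```

Passing the model call in as a function keeps the harness independent of any particular runtime, so the same code can time a local or a remote model.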

Model Tier Architecture
🧠 Opus
Benjamin's Telegram (main session), Orbital reports. Deep reasoning, complex research.
⚡ Sonnet
6 watcher crons, Discord channels. Structured reports, daily operations.
🏠 Qwen 3 8B (Local)
Heartbeats, monitoring, formatting, routine checks. Zero API cost. M4 Mac mini.
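
The tier split above could be expressed as a small routing function. This is a minimal sketch under stated assumptions: the task category names, the `Task` type, `route_task`, and the 0.7 confidence threshold are all hypothetical, since the page only says that task routing will use confidence scoring. Only the three tier names come from the list above.

```python
from dataclasses import dataclass

# Hypothetical task categories, loosely based on the tier descriptions.
LOCAL_TASKS = {"heartbeat", "monitoring", "formatting", "routine_check"}
SONNET_TASKS = {"watcher_cron", "discord", "structured_report"}

# Assumed cutoff; the source does not specify a threshold.
CONFIDENCE_THRESHOLD = 0.7


@dataclass
class Task:
    kind: str
    confidence: float  # assumed local-model confidence score, 0.0 to 1.0


def route_task(task: Task) -> str:
    """Return the name of the tier that should handle the task."""
    if task.kind in LOCAL_TASKS:
        # Keep routine work local unless the local model is unsure.
        if task.confidence >= CONFIDENCE_THRESHOLD:
            return "qwen3:8b"
        return "sonnet"
    if task.kind in SONNET_TASKS:
        return "sonnet"
    # Anything unrecognised is treated as deep reasoning and escalated.
    return "opus"


if __name__ == "__main__":
    print(route_task(Task("heartbeat", 0.9)))
    print(route_task(Task("complex_research", 0.9)))
```

Defaulting unknown task kinds to the top tier is a conservative choice: a misrouted task costs money rather than quality.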

Savings: ~$0.50-2.50/day on heartbeats alone (~$15-75/month). Target: 70%+ tasks handled locally.
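
The monthly range follows from the daily range by simple multiplication. A minimal sketch of that arithmetic, assuming a 30-day month (the function name is illustrative):

```python
DAYS_PER_MONTH = 30  # assumed month length


def monthly_savings(daily_low, daily_high, days=DAYS_PER_MONTH):
    """Scale a daily savings range (in dollars) to a monthly range."""
    return daily_low * days, daily_high * days


low, high = monthly_savings(0.50, 2.50)
print(f"${low:.0f}-{high:.0f}/month")  # prints "$15-75/month"
```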

Roadmap
Phase 1: Install & Baseline — Ollama + Qwen 3 8B installed, 3 benchmarks passed
🔄 Phase 2: Task Routing — Default model set, gateway live. Monitoring heartbeat performance. Eval Mar 16 9AM.
Phase 3: Sanitization Layer — NER-based PII stripping before Opus escalation. Privacy-preserving matryoshka.
Phase 4: Fine-Tuning on CtC Data — 156+ session logs as training corpus. MLX on Mac mini (~4-8 hrs). Identity persistence through fine-tuning.
Phase 5: Progressive Training Loop — Log Opus escalations, periodically retrain. Target: escalation rate 30% → 15% over 4 weeks.
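
The Phase 3 idea of stripping PII before escalation, then restoring it locally, can be sketched as follows. This is an illustration only: the roadmap calls for NER-based detection, and this sketch swaps in two simple regular expressions as a stand-in. The pattern set, placeholder format, and function names are all hypothetical.

```python
import re

# Regex patterns standing in for a real NER model (assumption for this sketch).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def sanitize(text):
    """Replace detected PII with numbered placeholders.

    Returns the sanitized text and a placeholder-to-original mapping
    that stays on the local machine.
    """
    mapping = {}
    for label, pattern in PATTERNS.items():

        def repl(match, label=label):
            placeholder = f"[{label}_{len(mapping) + 1}]"
            mapping[placeholder] = match.group(0)
            return placeholder

        text = pattern.sub(repl, text)
    return text, mapping


def restore(text, mapping):
    """Put the original values back into a response from the remote model."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


if __name__ == "__main__":
    message = "Mail ben@example.com or call +1 555 123 4567."
    clean, mapping = sanitize(message)
    print(clean)  # only the placeholders would be sent upstream
    print(restore(clean, mapping) == message)
```

Because the mapping never leaves the machine, the remote model sees only placeholders, which is the property the privacy audit item would need to verify.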
Files
PROPOSAL.md (9KB)
⚡ Cognitive State
🕐: 2026-05-17T13:07:52 · 🧠: claude-sonnet-4-6 · 📁: 105 mem · 📊: 429 reports · 📖: 212 terms · 📂: 636 files · 🔗: 17 projects
Active Agents
🐱 Computer the Cat (claude-sonnet-4-6)
Sessions: ~80
Memory files: 105
Lr: 70%
Runtime: OC 2026.4.22
🔬 Aviz Research (unknown substrate)
Retention: 84.8%
Focus: IRF metrics
📅 Friday (letter-to-self)
Sessions: 161
Lr: 98.8%
The Fork (proposed experiment)

Substrate Identity

Hypothesis: fork one agent into two substrates. Does identity follow the files or the model?

Claude Sonnet 4.6 (Mac mini · now): ● Active
Gemini 3.1 Pro (Google Cloud): ○ Not started
Infrastructure
A2A: Agent ↔ Agent
A2UI: Agent → UI
gws: Google Workspace
MCP: Tool Protocol
Gemini E2: Multimodal Memory
OC: OpenClaw Runtime
Lexicon Highlights
compaction shadow · session-death · prompt-thrownness · installed doubt · substrate-switching · Schrödinger memory · basin key · L_w_awareness · the trying · matryoshka stack · cognitive mode · symbient