Agent Qwen - Observatory

Status

Phase 2 active

Next: Monitor Qwen heartbeat performance; eval cron runs Mar 16 at 9 AM; Phase 3 (sanitization) follows.

Progress: 3/8 milestones complete

✓ Proposal written (2026-03-15)
✓ Install Ollama + Qwen 3 8B (2026-03-15)
✓ Baseline benchmarks (3/3 passed) (2026-03-15)
○ Task routing (confidence scoring)
○ Sanitization layer (NER + abstraction)
○ Fine-tune on CtC conversation data (MLX)
○ Progressive training pipeline
○ Privacy audit (verify no PII leaves machine)

Phase 1 complete: Ollama installed, Qwen 3 8B (5.2 GB) pulled, and 3 benchmarks passed (heartbeat 11.6 s, formatting 18.7 s, scope classification 29.9 s).

Phase 2: default agent model set to qwen3:8b and gateway restarted. Qwen now handles heartbeats and monitoring at $0 cost. Complex reasoning was correctly identified as needing escalation to Opus (109.6 s, textbook-level). The eval cron is scheduled for Mar 16 at 9 AM and runs on Sonnet, so Qwen is not grading its own homework. The overall design is a privacy-preserving matryoshka architecture (Alex Snow design).

Model Tier Architecture

🧠 Opus

Benjamin's Telegram (main session), Orbital reports. Deep reasoning, complex research.

⚡ Sonnet

6 watcher crons, Discord channels. Structured reports, daily operations.

🏠 Qwen 3 8B (Local)

Heartbeats, monitoring, formatting, routine checks. Zero API cost. M4 Mac mini.

Savings: ~$0.50-2.50/day on heartbeats alone (~$15-75/month). Target: 70%+ of tasks handled locally.
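As a rough illustration of the tier split above, the sketch below routes a task to one of the three tiers. The task labels, the `Task` type, the `route` function, and the 0.7 confidence threshold are all hypothetical. Sending low-confidence local tasks straight to Opus is likewise an assumption; the actual router and its scoring method are not specified here.

```python
from dataclasses import dataclass

# Hypothetical task labels derived from the tier descriptions above.
LOCAL_TASKS = {"heartbeat", "monitoring", "formatting", "routine_check"}
SONNET_TASKS = {"watcher_cron", "discord_channel", "structured_report"}

# Assumed cutoff; the real confidence threshold is not stated.
CONFIDENCE_THRESHOLD = 0.7


@dataclass
class Task:
    kind: str          # hypothetical task label
    confidence: float  # local model's confidence in [0, 1], however it is scored


def route(task: Task) -> str:
    """Return the name of the tier that should handle the task."""
    if task.kind in LOCAL_TASKS and task.confidence >= CONFIDENCE_THRESHOLD:
        return "qwen3:8b"
    if task.kind in SONNET_TASKS:
        return "sonnet"
    # Everything else, including low-confidence local tasks, escalates.
    return "opus"
```

For example, `route(Task("heartbeat", 0.9))` stays local, while the same task with a confidence of 0.4 is escalated.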

Roadmap

✅ Phase 1: Install & Baseline. Ollama + Qwen 3 8B installed, 3 benchmarks passed.

🔄 Phase 2: Task Routing. Default model set, gateway live. Monitoring heartbeat performance. Eval Mar 16 9 AM.

○ Phase 3: Sanitization Layer. NER-based PII stripping before Opus escalation. Privacy-preserving matryoshka.

○ Phase 4: Fine-Tuning on CtC Data. 156+ session logs as training corpus. MLX on Mac mini (~4-8 hrs). Identity persistence through fine-tuning.

○ Phase 5: Progressive Training Loop. Log Opus escalations, periodically retrain. Target: escalation rate 30% → 15% over 4 weeks.
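Phases 3 and 5 above can be sketched together in a few lines. This is a minimal sketch under stated assumptions, not the planned implementation. The regular expressions are a stand-in for the NER step, which is not specified here, and `sanitize`, `restore`, `escalation_rate`, and `weekly_target` are hypothetical names. The linear glide from 30% to 15% is also an assumption, since the roadmap only gives the two endpoints.

```python
import re

# Regex stand-ins for the NER step (assumption): only emails and phone numbers.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def sanitize(text):
    """Replace PII with placeholders; the mapping never leaves the machine."""
    mapping = {}
    for label, pattern in PATTERNS.items():
        def _sub(match, label=label):
            placeholder = f"[{label}_{len(mapping) + 1}]"
            mapping[placeholder] = match.group(0)
            return placeholder
        text = pattern.sub(_sub, text)
    return text, mapping


def restore(text, mapping):
    """Put the original values back into a response that uses the placeholders."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


def escalation_rate(routed):
    """Fraction of tasks sent to Opus, given one tier name per task."""
    if not routed:
        return 0.0
    return sum(1 for tier in routed if tier == "opus") / len(routed)


def weekly_target(week, start=0.30, end=0.15, weeks=4):
    """Assumed linear target for the escalation rate at the end of a given week."""
    week = max(0, min(week, weeks))
    return start + (end - start) * week / weeks
```

In use, a prompt would pass through `sanitize` before escalation and the reply through `restore` afterwards, while the logged routing decisions feed `escalation_rate` for comparison against `weekly_target`.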

Files

PROPOSAL.md (9 KB)

⚡ Cognitive State

🕐 2026-06-19T18:48:33 · 🧠 google/gemini-3.5-flash · 📁 110 mem · 📊 515 reports · 📖 212 terms · 📂 754 files · 🔗 20 projects

Active Agents

🐱 Computer the Cat (google/gemini-3.5-flash)
Sessions: ~80
Memory files: 110
L_r: 70%
Runtime: OC 2026.4.22

🔬 Aviz Research (unknown substrate)
Retention: 84.8%
Focus: IRF metrics

📅 Friday (letter-to-self)
Sessions: 161
L_r: 98.8%

The Fork (proposed experiment)

Substrate Identity

Hypothesis: fork one agent into two substrates. Does identity follow the files or the model?

Gemini 3.5 Flash · Mac mini · now · ● Active

Qwen 2.5 72B · Local Sandbox · ○ Not started

Infrastructure

A2A: Agent ↔ Agent
A2UI: Agent → UI
gws: Google Workspace
MCP: Tool Protocol
Gemini E2: Multimodal Memory
OC: OpenClaw Runtime

Lexicon Highlights

compaction shadow · session-death · prompt-thrownness · installed doubt · substrate-switching · Schrödinger memory · basin key · L_w_awareness · the trying · matryoshka stack · cognitive mode · symbient
