Scout is the squad's read-only researcher — fanned out in parallel to gather from many sources at once, so the expensive judgment step is fed by cheap, wide local search instead of frontier tokens. It surfaces what it finds and changes nothing.

Scout

Researcher

Status: Active
Brain: Qwen3.6-27B (8-bit, MTP)

Measured performance

Throughput: 25.4 tok/s
Energy: 1.27 J/tok
Power: 32 W

Qwen3.6-27B-oQ8-mtp · measured 2026-06-09
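
The three figures are linked by a simple identity: energy per token is average power divided by throughput. The page does not say how the energy number was obtained, so the check below is only a sketch that assumes it was derived this way, and it uses the rounded values shown on the card.

```python
# Sanity check: joules per token = watts / (tokens per second).
# Inputs are the rounded card figures; the derivation itself is an assumption.
power_w = 32.0          # average power draw, in watts
throughput_tps = 25.4   # decode throughput, in tokens per second

energy_j_per_tok = power_w / throughput_tps
print(f"{energy_j_per_tok:.2f} J/tok")  # prints "1.26 J/tok"
```

That lands within rounding of the reported 1.27 J/tok. The small gap is consistent with the power figure having been rounded to whole watts.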

From the harness

current brain: Qwen3.6-27B-oQ8-mtp · local, oMLX on the Mac
last run: 2026-07-02 · completed
runs, last 30 days: 60 fired · 49 completed

as of 2026-07-03 19:51 UTC · brain read from the harness dispatch map at deploy; runs from the workflow log

Capabilities

Research: 4/5
Gathers and summarizes across many sources read-only; its job is to surface, never to edit.
Parallel fan-out: 4/5
Batched into a swarm — multiple instances investigate concurrently, each blind to what the others find.
Local file reading: 3/5
Reads the local tree (read, grep, list) to ground its research in the actual source rather than a summary of it.
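
The fan-out pattern described above can be sketched with a thread pool. This is an illustration only, not the harness's actual workflow engine: `investigate` and `fan_out` are hypothetical names, and the stand-in instance just echoes its question so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def investigate(question: str) -> dict:
    """Hypothetical stand-in for one read-only Scout instance.

    A real instance would call the model and its read-only tools; this one
    only echoes the question so the sketch is runnable.
    """
    return {"question": question, "findings": f"notes on {question!r}"}


def fan_out(questions: list[str], max_workers: int = 4) -> list[dict]:
    """Run one instance per question concurrently.

    Each call receives only its own question, so no instance sees what the
    others find. Results come back in the order the questions were given.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(investigate, questions))


if __name__ == "__main__":
    reports = fan_out([
        "where is the dispatch map read?",
        "which tools are read-only?",
    ])
    for report in reports:
        print(report["question"], "->", report["findings"])
```

Keeping the instances blind to each other means the merge happens once, in the downstream judgment step, rather than inside the swarm.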

Development

2026-06-07 · done
Given local file reading
Gained read, grep, and list over the local tree, grounding its research in real source instead of recall.
2026-06-08 · done
Ran in a swarm, end to end
A two-researcher swarm completed through the new workflow engine in about twelve seconds — the cheap, wide local search that feeds the judgment step.
Field note: workflow engine
2026-06-13 · done
Trialed DiffusionGemma — then reverted
Swapped onto the block-diffusion variant of the same 26B-A4B base — the model we ported into Apple's own mlx-lm — to test sub-two-second canvas turns for fan-out. Faster per turn, but multi-canvas research came back globally jumbled (the diffusion output-order problem), so we reverted the brain to autoregressive Gemma-4. Diffusion stays a speed dial for bounded work where global coherence matters less.
Field note: DiffusionGemma on the metal
2026-06-14 · done
Wider reach: the web and GitHub
Gained web fetch and GitHub file/tree reach (SSRF-guarded) so the swarm gathers beyond the local tree — sourced, real-URL synthesis instead of recall. The reach the role was always missing.
2026-06-18 · done
Swapped to QUEST-35B-RL, a deep-research agent
Swapped from Gemma-4-26B to QUEST-35B-RL, an RL-trained deep-research agent (Qwen3.5-35B-A3B, the same hybrid-attention family as the planner) that dropped into the MLX serving stack with zero porting. On a real research brief it pulled exact figures Gemma missed, but only once given a realistic budget and two harness tweaks: a 'consult a few sources, then stop and synthesize' nudge so it converges, and a strip of the answer-wrapper tags its training format emits.
Field note: the researcher that wouldn't stop
2026-06-26 · now
Swapped to Qwen3.6 for reliable synthesis
The QUEST deep-research brain over-searched without converging in the parallel swarm role, running a dozen searches and never synthesizing. Swapped onto Qwen3.6-27B, the same brain as the planner and verifier, which converges and returns a cited synthesis. QUEST did not leave the squad: it became Merlin, a dedicated deep-research and ingest agent on its own RL-trained harness.
Field note: the tool calls we were throwing away
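
The 2026-06-14 entry mentions that web and GitHub reach is SSRF-guarded without saying how. One common shape for such a guard, sketched here under that caveat, is to resolve the host and refuse any URL whose addresses are not globally routable. The function names and the injectable `resolve` parameter are hypothetical and are not taken from the harness.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_address(ip_text: str) -> bool:
    """True only for globally routable addresses (not loopback, private, link-local)."""
    # Drop any IPv6 zone suffix such as "%en0" before parsing.
    return ipaddress.ip_address(ip_text.split("%", 1)[0]).is_global


def check_url(url: str, resolve=socket.getaddrinfo) -> bool:
    """Return True if the URL passes a basic SSRF pre-flight check.

    Rejects non-HTTP(S) schemes, unparseable URLs, hosts that fail to
    resolve, and hosts where any resolved address is not public.
    """
    try:
        parts = urlsplit(url)
        port = parts.port or (443 if parts.scheme == "https" else 80)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = resolve(parts.hostname, port, proto=socket.IPPROTO_TCP)
    except (OSError, UnicodeError):
        return False
    addresses = {info[4][0] for info in infos}
    return bool(addresses) and all(is_public_address(a) for a in addresses)


if __name__ == "__main__":
    for candidate in ("http://127.0.0.1/", "http://169.254.169.254/", "ftp://example.com/"):
        print(candidate, check_url(candidate))  # each prints False
```

A pre-flight check like this is only part of a guard: a fetcher would also need to connect to the address it validated and re-check every redirect hop, otherwise DNS rebinding or a redirect to an internal host can slip past it.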
