Week of 18 May 2026 · Hypothesis of the week

If the value of Bet A is real, the launch cohort comes back.

A Thu 14 May repo investigation surfaced that the "Model Router" — Bet A's most public value claim — doesn't behave as marketed. Users can't pick a model and don't see one being picked. So this week's test of Bet A lives downstream of the invisible feature: do they come back?

Bet under test
Bet A — Agents hub
Layer
1a — Hub-value (retention proxy)
Decision by
Fri 22 May, 17:00 PDT
Cohort
Launch users (Thu 14 May+)
The claim

Retention as a substitute for invisible value.

Bet A's value is mediated by the router, which the user doesn't see. The only way users can vote is with their feet: if the router's invisible choices serve them, they come back.

Hypothesis
At least 60% of launch-cohort users return for a second session within 7 days of first signup.
Why retention, not direct interaction: the user can't invoke the router, only experience its output. Retention is the cleanest user-side proxy when the feature is invisible.
Decision rule

What each outcome means, defined Monday before the result is known.

Three branches. The rule applies on Friday regardless of which we hope for.

≥ 60%
Router earns return visits. Sustain UX. Deepen Layer 1a interviews with returners on what brought them back.
30 – 59%
Mixed. Investigate the first session: where non-returners drop off, what returners used, and whether a router output visibly failed. Triage before treating as a Bet A signal.
< 30%
Router isn't earning return visits, or onboarding fails. Either way Bet A's value claim is undermined. Escalate to the triangulation gate.
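
The three branches above can be written down as a small lookup, which makes it harder to reinterpret the thresholds on Friday. This is a minimal sketch, not existing tooling: the function name and the branch labels ("sustain", "investigate", "escalate") are hypothetical, and it assumes any rate below 60% but at least 30% (including 59.x%) falls in the mixed band.

```python
def decision_branch(return_rate: float) -> str:
    """Map a 7-day return rate (0.0 to 1.0) to one of the three Monday branches.

    Branch labels are illustrative names, not terms from the plan.
    """
    if not 0.0 <= return_rate <= 1.0:
        raise ValueError("return_rate must be between 0 and 1")
    if return_rate >= 0.60:
        return "sustain"      # >= 60%: sustain UX, deepen Layer 1a interviews
    if return_rate >= 0.30:
        return "investigate"  # 30% up to (not including) 60%: triage first session
    return "escalate"         # < 30%: escalate to the triangulation gate


if __name__ == "__main__":
    for rate in (0.72, 0.45, 0.12):
        print(f"{rate:.0%} -> {decision_branch(rate)}")
```

Fixing the boundaries in code before the read is one way to hold to the "defined Monday" commitment; the branch actions themselves stay as written above.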
Wired deliverables

The work that has to land for the test to run.

Adjacent feature work excluded. If any of these slip, Friday's verdict is "Didn't test". Status hand-typed; refreshed daily Mon–Fri.

ENG-73 / Crawl 37 · A1 measurement scaffolding
Status: In Progress · Owner: David
Wired because: Defines the agent_session_started event, the launch-cohort filter, and the 7-day return-rate Insight. Without it, the hypothesis is unresolvable. Slipped Fri 15 May; PostHog instrumentation ready, dashboard creation in progress (Tue 19 May standup).

ENG-50 / Crawl 17 · PostHog identity binding
Status: Ready for QA · Owner: David
Wired because: posthog.identify(keycloak_user_id) is the precondition for the launch-cohort filter: no identity, no cohort. Frontend done: mdb-ai#1275 merged and QA passed 15 May. OpenClaw identity bridge: mdb-ai#1284 merged Tue 19 May (bootstrap URL identity bridge). WS frames follow-up: anton_services#61 open.

ENG-67 / Crawl 22 · MindsHub launch + auth ramp
Status: In Progress · Owner: Lucas + Hamish
Wired because: Launched Thu 14 May; cohort precondition. Production code freeze Mon–Wed noon; staging env (ENG-91) not yet set up: anton_services#62 open (Hamish).

Last refreshed: Tue 19 May 2026, EOD PDT · hand-typed from Linear + Engineering Daily standup.

Expected signal

What data resolves the hypothesis.

Measurement (per Crawl 37 / ENG-73)
PostHog Insight "Launch-cohort 7-day return rate" — within the cohort, distinct users with ≥2 agent_session_started events (≥4h gap, within 7 days) divided by distinct users with ≥1. Read Fri 22 May 17:00 PDT.
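
As a cross-check on the Insight, the same ratio can be recomputed offline from exported events. The sketch below is an assumption-laden illustration, not the PostHog query itself: it assumes events arrive as (user_id, timestamp) pairs already restricted to the launch cohort and to agent_session_started, and it reads "≥4h gap, within 7 days" as measured from each user's first event. The function name is hypothetical.

```python
from datetime import datetime, timedelta

MIN_GAP = timedelta(hours=4)   # a return must be at least 4h after the first session
WINDOW = timedelta(days=7)     # and no later than 7 days after it


def seven_day_return_rate(events):
    """Return (users with a qualifying second session) / (users with >= 1 session).

    events: iterable of (user_id, datetime) for agent_session_started,
    already filtered to the launch cohort. Returns None for an empty cohort.
    """
    by_user = {}
    for user_id, ts in events:
        by_user.setdefault(user_id, []).append(ts)
    if not by_user:
        return None

    returned = 0
    for stamps in by_user.values():
        first = min(stamps)
        # A user counts as returned if any later event falls in [first+4h, first+7d].
        if any(MIN_GAP <= ts - first <= WINDOW for ts in stamps):
            returned += 1
    return returned / len(by_user)


if __name__ == "__main__":
    t0 = datetime(2026, 5, 14, 9, 0)
    sample = [
        ("u1", t0), ("u1", t0 + timedelta(hours=5)),   # returns after 5h: counts
        ("u2", t0), ("u2", t0 + timedelta(hours=1)),   # gap under 4h: does not count
        ("u3", t0),                                    # single session
        ("u4", t0), ("u4", t0 + timedelta(days=8)),    # outside the 7-day window
    ]
    print(seven_day_return_rate(sample))  # 0.25
```

If the locked definition measures the gap between consecutive events rather than from the first one, the comparison inside the loop would need to change accordingly.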

Definitions locked Monday

Sample-size note

Cohort size is the limit (~80–100 expected). The 60% threshold is directional, not statistical. Friday's verdict reads "does the cohort behave broadly as Bet A predicts?" — not a 95% CI claim.
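
To illustrate why the threshold is only directional at this cohort size, here is a sketch of a 95% Wilson score interval for an observed proportion. The choice of the Wilson interval and the example figures (54 returners out of 90) are illustrative assumptions, not part of the plan or a forecast.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= successes <= n:
        raise ValueError("successes must be between 0 and n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Hypothetical read: 54 of 90 cohort users returned (60%).
    low, high = wilson_interval(54, 90)
    print(f"{low:.3f} to {high:.3f}")
```

With those hypothetical numbers the interval spans roughly 50% to 70%, so a reading near 60% could not be statistically separated from the upper part of the mixed band. That is consistent with treating Friday's number as a directional read.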

What gets a "Didn't test" verdict

Acknowledgement

Founder commitment to the decision rule.

Captured in the Monday call. Acknowledging = committing to apply Friday's verdict per the rule above, not renegotiating the threshold after the result.

Friday verdict

Verdict + decision applied (or blocker named).

Populated end-of-day Fri 22 May. Three possible verdicts.

To be filled in: Fri 22 May 2026, EOD PDT.
Learned · Inconclusive · Didn't test

If Learned — ledger entry added, decision rule applied (or queued for next planning).
If Inconclusive — name what would make it conclusive in week 2.
If Didn't test — owner of the slipped deliverable writes the entry (why, what changes).