Retention as a substitute for invisible value.
Bet A's value is mediated by the router, which the user never sees. What users can do is vote with their feet: if the router's invisible choices serve them, they come back.
What each outcome means, defined Monday before the result is known.
Three branches. The rule applies on Friday regardless of which outcome we hope for.
The work that has to land for the test to run.
Adjacent feature work excluded. If any of these slips, Friday's verdict is "Didn't test". Status hand-typed; refreshed daily Mon–Fri.
| Ticket | Status | Owner | Wired because |
|---|---|---|---|
| ENG-73 Crawl 37 A1 measurement scaffolding | In Progress | David | Defines agent_session_started event, launch-cohort filter, 7-day return-rate Insight. Without it, hypothesis unresolvable. Slipped Fri 16 May; PostHog instrumentation ready, dashboard creation in progress (Tue 19 May standup). |
| ENG-50 Crawl 17 PostHog identity binding | Ready for QA | David | posthog.identify(keycloak_user_id) is the precondition for the launch-cohort filter: no identity, no cohort. Frontend done: mdb-ai#1275 merged + QA passed 15 May. OpenClaw identity bridge: mdb-ai#1284 merged Tue 19 May (bootstrap URL identity bridge). WS frames follow-up: anton_services#61 open. |
| ENG-67 Crawl 22 MindsHub launch + auth ramp | In Progress | Lucas + Hamish | Launched Thu 14 May; cohort precondition. Production code freeze Mon–Wed noon; staging env (ENG-91) not yet set up; anton_services#62 open (Hamish). |
Last refreshed: Tue 19 May 2026, EOD PDT · hand-typed from Linear + Engineering Daily standup.
What data resolves the hypothesis.
Definitions locked Monday
- Session: first user message in a previously-unloaded tab. Event: agent_session_started.
- Return: same identified user firing agent_session_started again, ≥4h after the first.
- Launch cohort: users posthog.identify'd between Thu 14 May 00:00 PDT and Fri 22 May 17:00 PDT, excluding @mindsdb.com emails + headless/CI user-agents.
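The three definitions above can be sketched as a small offline computation. This is a minimal illustration, not the PostHog Insight itself: the User record, the in-memory sessions mapping, and the user-agent markers are hypothetical, and PDT is modelled as a fixed UTC-7 offset. It applies only the locked "≥4h after the first" rule and does not add the Insight's 7-day window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

PDT = timezone(timedelta(hours=-7))  # assumption: fixed UTC-7 offset
COHORT_START = datetime(2026, 5, 14, 0, 0, tzinfo=PDT)
COHORT_END = datetime(2026, 5, 22, 17, 0, tzinfo=PDT)
MIN_RETURN_GAP = timedelta(hours=4)
# Hypothetical markers for headless/CI traffic; the real filter is not given.
BOT_UA_MARKERS = ("headless", "ci-runner")


@dataclass(frozen=True)
class User:
    user_id: str
    email: str
    identified_at: datetime  # time of the first posthog.identify call
    user_agent: str


def in_launch_cohort(user: User) -> bool:
    """Identified inside the launch window, not internal, not headless/CI."""
    if not (COHORT_START <= user.identified_at <= COHORT_END):
        return False
    if user.email.lower().endswith("@mindsdb.com"):
        return False
    ua = user.user_agent.lower()
    return not any(marker in ua for marker in BOT_UA_MARKERS)


def has_returned(session_starts: list[datetime]) -> bool:
    """True if an agent_session_started fires >= 4h after the user's first."""
    if not session_starts:
        return False
    first = min(session_starts)
    return any(t - first >= MIN_RETURN_GAP for t in session_starts)


def return_rate(
    users: list[User], sessions: dict[str, list[datetime]]
) -> tuple[int, int, float]:
    """Return (cohort size, returned users, return rate) for the launch cohort."""
    cohort = [u for u in users if in_launch_cohort(u)]
    returned = sum(has_returned(sessions.get(u.user_id, [])) for u in cohort)
    rate = returned / len(cohort) if cohort else 0.0
    return len(cohort), returned, rate
```

The denominator here is every cohort member, including identified users who never started a session. Whether the real Insight counts them the same way is an open assumption.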
Sample-size note
Cohort size is the limiting factor (~80–100 users expected). The 60% threshold is directional, not statistical. Friday's verdict answers "does the cohort behave broadly as Bet A predicts?"; it is not a 95% CI claim.
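To see why a cohort of this size supports only a directional read, one can compute a confidence interval around an observed 60% return rate. The sketch below uses the Wilson score interval, which is our choice for illustration and not something the test plan specifies. At 80–100 users the interval spans roughly ten percentage points on either side of the observed rate.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Illustrative only: an observed 60% at the two ends of the expected cohort size.
for n in (80, 100):
    low, high = wilson_interval(round(0.6 * n), n)
    print(f"n={n}: observed 60% -> 95% interval {low:.0%} to {high:.0%}")
```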
What gets a "Didn't test" verdict
- Crawl 37 (ENG-73) slips past EOD Fri 16 May — measurement infra not ready.
- posthog.identify() unreliable across the cohort — return-rate uncomputable.
- Instance auth still in observe-mode by Fri 22 May — cohort can't filter to authenticated users.
- Cohort < 20 — too small to read. Inconclusive; re-run week 2.
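The gating conditions above can be written down as a single check run before the result is read. This is a hedged sketch: the Preconditions fields and the gate function are hypothetical names, the small-cohort case is mapped to "Inconclusive" as its bullet states, and treating a readable cohort as "Learned" is an assumption about how the three verdicts relate.

```python
from dataclasses import dataclass
from enum import Enum

MIN_READABLE_COHORT = 20  # below this the cohort is too small to read


class Verdict(Enum):
    LEARNED = "Learned"
    INCONCLUSIVE = "Inconclusive"
    DIDNT_TEST = "Didn't test"


@dataclass(frozen=True)
class Preconditions:
    measurement_ready: bool  # ENG-73 measurement infra landed
    identify_reliable: bool  # posthog.identify() reliable across the cohort
    auth_enforced: bool      # instance auth out of observe-mode
    cohort_size: int


def gate(pre: Preconditions) -> tuple[Verdict, list[str]]:
    """Return the verdict the preconditions force, plus the named blockers."""
    blockers = []
    if not pre.measurement_ready:
        blockers.append("measurement infra not ready (ENG-73)")
    if not pre.identify_reliable:
        blockers.append("posthog.identify() unreliable; return rate uncomputable")
    if not pre.auth_enforced:
        blockers.append("instance auth still in observe-mode")
    if blockers:
        return Verdict.DIDNT_TEST, blockers
    if pre.cohort_size < MIN_READABLE_COHORT:
        return Verdict.INCONCLUSIVE, ["cohort < 20; re-run week 2"]
    # Assumption: with all preconditions met, the result can be read as Learned.
    return Verdict.LEARNED, []
```

For example, `gate(Preconditions(True, True, False, 90))` yields the "Didn't test" verdict with the auth blocker named, which is the entry the owner of the slipped deliverable would then write up.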
Founder commitment to the decision rule.
Captured in the Monday call. Acknowledging = committing to apply Friday's verdict per the rule above, not renegotiating the threshold after the result.
- Adam Carrigan · Founder · Bet A axis
Verdict + decision applied (or blocker named).
Populated end-of-day Fri 22 May. Three possible verdicts.
If Learned — ledger entry added, decision rule applied (or queued for next planning).
If Inconclusive — name what would make it conclusive in week 2.
If Didn't test — owner of the slipped deliverable writes the entry (why, what changes).