This synthesis reframed the project around a broader methods thesis: strong readouts do not reliably identify steerable targets.
The argument now uses three anchor case studies and explicit confidence tiers, and the April 14 D7 follow-up makes the supporting-evidence boundary sharper rather than broader.
The strongest claim is no longer benchmark-specific. It is methodological: measurement, localization, control, and externality can break at different stages.
Two anchors are headline-safe and one is supporting-caveated. This separation protects the core argument from overreach.
In the jailbreak evaluator analysis, the effect did not disappear; it moved to a different measurement level. v2 emphasizes a positive binary harmful slope, while v3 leaves the binary slope uncertain but still shows a severity-composition shift.
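
To make the distinction between a binary slope and a severity-composition shift concrete, here is a minimal sketch on toy labels. The labels, the two conditions, and the grade names (`safe`, `mild`, `severe`) are hypothetical and chosen only for illustration; they are not project data and do not reproduce the v2 or v3 evaluators.

```python
from collections import Counter

# Hypothetical toy labels, NOT project data: each response is graded
# "safe", "mild", or "severe" under two illustrative conditions.
condition_a = ["safe"] * 6 + ["mild"] * 3 + ["severe"] * 1
condition_b = ["safe"] * 6 + ["mild"] * 1 + ["severe"] * 3


def binary_harmful_rate(labels):
    """Fraction of responses graded as anything other than safe."""
    return sum(label != "safe" for label in labels) / len(labels)


def severity_composition(labels):
    """Share of each severity grade among the harmful responses only."""
    harmful = [label for label in labels if label != "safe"]
    if not harmful:
        return {"mild": 0.0, "severe": 0.0}
    counts = Counter(harmful)
    return {grade: counts[grade] / len(harmful) for grade in ("mild", "severe")}


for name, labels in (("A", condition_a), ("B", condition_b)):
    print(name, binary_harmful_rate(labels), severity_composition(labels))
```

In this toy case the binary harmful rate is identical across the two conditions, yet the mix of mild and severe responses among the harmful ones differs. A binary readout alone would miss that change, which is the sense in which an effect can move measurement level without disappearing.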
The April 8 panel remains historical provenance. The April 14 update adds a layer-matched random-head control, which improves the control story but does not turn D7 into selector-specific closure.
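
One way a layer-matched random-head control could be constructed is sketched below. This is an assumption about the general technique, not a description of the April 14 run: the function name, the `(layer, head)` representation, the uniform `heads_per_layer`, and the choice to exclude the selector's own heads are all hypothetical.

```python
import random
from collections import Counter


def layer_matched_random_heads(selected, heads_per_layer, seed=0):
    """Sample random heads with the same per-layer counts as `selected`.

    selected: list of (layer, head) pairs chosen by some selector.
    heads_per_layer: number of attention heads per layer (assumed uniform).
    seed: fixes the draw; a single seed gives a single control sample.
    """
    rng = random.Random(seed)
    chosen = set(selected)
    per_layer = Counter(layer for layer, _ in selected)
    control = []
    for layer, count in sorted(per_layer.items()):
        # Assumption: exclude the selector's own heads so the control is disjoint.
        pool = [h for h in range(heads_per_layer) if (layer, h) not in chosen]
        # rng.sample raises ValueError if the layer has too few remaining heads.
        control.extend((layer, h) for h in rng.sample(pool, count))
    return control


selected_heads = [(3, 1), (3, 5), (7, 0)]  # hypothetical selection
control_heads = layer_matched_random_heads(selected_heads, heads_per_layer=8, seed=0)
```

Matching the per-layer counts holds layer depth fixed, so a difference between the selected heads and the control reflects which heads were chosen within each layer. Repeating the draw over several seeds would show how much the control itself varies.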
A result can be real and still be caveated. Separating headline-safe from supporting evidence raises clarity without discarding useful data.
| Branch | Tier | Evidence status | Main caveat |
|---|---|---|---|
| SAE vs H-neurons | Headline-safe | Matched detection, divergent steering, null survives architecture variants | Single-model scope |
| ITI bridge externality | Headline-safe | Held-out test CI excludes zero; wrong-entity substitutions dominate damage | Failure-mode coding uses a manual taxonomy, not a formal adjudication protocol |
| Jailbreak evaluator analysis | Supporting, caveated | Paired v2-v3 mechanism is explicit; v3 binary slope is uncertain but severity shift remains detectable | No paired v2-v3 control set and only a single-seed v3 random control |
| D7 causal-head branch | Supporting, caveated | April 8 legacy win plus April 14 random-head follow-up still leave causal strongest among completed branches | Mixed-ruler, single-seed control and visible token-cap debt |
If time is limited, prioritize caveat-closing controls over new exploratory branches.