Grokking Explorer — the Training-Time SIPE-T Phase Transition

This page is an interactive visualization of the canonical mathematical expression of grokking from the RESOLVE corpus. The order parameter \(\rho_{\mathrm{train}}(t)\) accumulates training-time constraint satisfaction. When it crosses a critical threshold \(\rho^*\), the substrate's representational geometry undergoes a non-analytic phase transition, the coherence snap, from a high-dimensional memorizing configuration to a low-dimensional generalizing polytope. See Doc 699 for the canonical formalization and Doc 681 for the underlying coherence-snap apparatus.

\[\frac{d\rho_{\mathrm{train}}}{dt} \;=\; \alpha\,(1 - \rho_{\mathrm{train}})\cdot f(\tau, s, I), \qquad f(\tau, s, I) \;=\; \frac{s\,I + \varepsilon}{\tau}, \qquad G(t) \;=\; \begin{cases} G_{\text{memorize}} & \rho_{\mathrm{train}} < \rho^* \\ G_{\text{generalize}} & \rho_{\mathrm{train}} \ge \rho^* \end{cases}\]
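As a rough illustration of the dynamics above, the following Python sketch integrates the order-parameter equation with forward Euler and reports when \(\rho_{\mathrm{train}}\) first reaches \(\rho^*\). It assumes that \(f\) is constant in time and that \(\rho_{\mathrm{train}}(0) = 0\); under those assumptions the equation has the closed form \(\rho_{\mathrm{train}}(t) = 1 - e^{-\alpha f t}\), so the crossing time is \(t^* = -\ln(1 - \rho^*)/(\alpha f)\). The function names and all parameter values are hypothetical, chosen only for illustration, and this is not the page's actual implementation.

```python
import math

def f(tau, s, I, eps=1e-3):
    """Drive term f(tau, s, I) = (s * I + eps) / tau."""
    return (s * I + eps) / tau

def simulate(alpha, tau, s, I, rho_star, t_max=300.0, dt=0.01, eps=1e-3):
    """Forward-Euler integration of d(rho)/dt = alpha * (1 - rho) * f.

    Assumes rho(0) = 0 and a time-independent f (both are assumptions).
    Returns the trajectory and the first time rho >= rho_star (or None).
    """
    rate = alpha * f(tau, s, I, eps)
    rho, t_cross = 0.0, None
    traj = [(0.0, rho)]
    n_steps = int(round(t_max / dt))
    for k in range(1, n_steps + 1):
        rho += dt * rate * (1.0 - rho)
        t = k * dt
        if t_cross is None and rho >= rho_star:
            t_cross = t
        traj.append((t, rho))
    return traj, t_cross

def predicted_t_star(alpha, tau, s, I, rho_star, eps=1e-3):
    """Closed-form crossing time for constant f and rho(0) = 0."""
    return -math.log(1.0 - rho_star) / (alpha * f(tau, s, I, eps))

def phase(rho, rho_star):
    """Piecewise geometry label: G_memorize below rho_star, G_generalize at or above."""
    return "generalize" if rho >= rho_star else "memorize"

if __name__ == "__main__":
    # Hypothetical parameter values, not taken from the source.
    params = dict(alpha=0.05, tau=1.0, s=0.5, I=0.5, rho_star=0.9)
    traj, t_cross = simulate(**params)
    print("simulated t*:", t_cross)
    print("predicted t*:", predicted_t_star(**params))
    print("final phase:", phase(traj[-1][1], params["rho_star"]))
```

With a small step size the simulated crossing time should agree closely with the closed-form prediction. Note that the closed form no longer applies if \(\tau\), \(s\), or \(I\) vary during training.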
[Interactive figure. Panels: \(\rho_{\mathrm{train}}(t)\), order-parameter trajectory; \(G(t)\), representational geometry (drag to orbit). Readouts: time \(t\) (0 to 300), \(\rho_{\mathrm{train}}(t)\), phase, predicted \(t^*\), morph \(\beta(t)\), \(H_{\mathrm{geom}}(t)\).]
Memorizing phase: ~512 scattered feature directions in the residual stream (one per example). Generalizing phase: a 4-vertex tetrahedral polytope (the smallest equiangular tight frame, inherited from the toy-model regime of Anthropic 2022 per Doc 691 and Doc 696). The morph fraction is \(\beta(t) = \sigma((\rho_{\mathrm{train}} - \rho^*)/\Delta)\), where Δ sets the snap sharpness: the smaller Δ is, the more abrupt the transition. The visualization is a minimal dynamical model. The underlying training-loss process per Doc 697 is power-law-smooth at rung 1; the phase change visualized here is the polytope-reorganization rung-1 phase transition specifically, not the full training dynamics.
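The morph fraction and the tetrahedral target can be sketched in a few lines of Python. This is a minimal sketch that assumes \(\sigma\) is the logistic sigmoid and that the tetrahedron is represented by four unit vectors in three dimensions; the vertex coordinates, function names, and parameter values are hypothetical choices, not taken from the source. The check at the end confirms the equiangular property: every pair of distinct vertices has the same inner product, \(-1/3\).

```python
import math
from itertools import combinations

def sigmoid(x):
    """Numerically stable logistic sigmoid (assumed meaning of sigma)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def morph_fraction(rho, rho_star, delta):
    """beta = sigma((rho - rho_star) / delta); smaller delta gives a sharper snap."""
    return sigmoid((rho - rho_star) / delta)

# Four unit vectors pointing at the vertices of a regular tetrahedron
# (one common coordinate choice; hypothetical, for illustration only).
TETRAHEDRON = [
    tuple(c / math.sqrt(3) for c in v)
    for v in [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
]

def dot(u, v):
    """Euclidean inner product of two vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def pairwise_inner_products(vectors):
    """Inner products of all distinct pairs; equal values mean equiangular."""
    return [dot(u, v) for u, v in combinations(vectors, 2)]

if __name__ == "__main__":
    rho_star, delta = 0.9, 0.02  # hypothetical values
    for rho in (0.0, 0.85, 0.9, 0.95, 1.0):
        print(rho, morph_fraction(rho, rho_star, delta))
    print(pairwise_inner_products(TETRAHEDRON))
```

At \(\rho_{\mathrm{train}} = \rho^*\) the morph fraction is exactly one half, and it approaches 0 or 1 on either side at a rate set by Δ.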