Alpha
super_chat_20260225143238_sa7h · active · chat · 17.44M params · 25m 39s elapsed · ~215h 41m remaining
8L / 384D / 8H · helios · bpe-4k · adamw · Created Feb 25, 2026 2:33 PM
Step 208 / 50,000 · 0.4%
Loss: 7.9278
Best Loss: 7.9004 (-5.3% from start)
Val Loss: -
Learning Rate: 1.44e-5
Throughput: 394 tok/s (avg)
Speed: 15,595 ms/iter (avg)
Grad Norm: 1.280 (avg: 1.958)
Tokens processed: 1.14M
Forward: 7289 ms (47% of step)
Backward: 8241 ms (53% of step)
GPU Sync: 9 ms (0% of step)
GPU Ops per step: 887
MFU (model FLOPS util): 0.1%
Bwd/Fwd ratio: 1.1x
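The tiles above can be cross-checked with a little arithmetic. The sketch below assumes that throughput is batch size × context length divided by the average step time, and that the percentage tiles are shares of forward + backward + sync time. Neither definition is stated by the dashboard, and the function names are illustrative.

```python
# Values copied from the metric tiles and the Training Config panel.
BATCH_SIZE = 12
CONTEXT = 512
MS_PER_ITER = 15_595
FORWARD_MS, BACKWARD_MS, SYNC_MS = 7289, 8241, 9


def tokens_per_second(batch_size, context, ms_per_iter):
    """Assumed definition: tokens in one batch / wall-clock seconds per step."""
    return batch_size * context / (ms_per_iter / 1000.0)


def step_shares(forward_ms, backward_ms, sync_ms):
    """Each phase as a rounded percentage of the three measured phases."""
    total = forward_ms + backward_ms + sync_ms
    return {
        "forward": round(100 * forward_ms / total),
        "backward": round(100 * backward_ms / total),
        "sync": round(100 * sync_ms / total),
    }


if __name__ == "__main__":
    print(round(tokens_per_second(BATCH_SIZE, CONTEXT, MS_PER_ITER)))
    print(step_shares(FORWARD_MS, BACKWARD_MS, SYNC_MS))
    print(round(BACKWARD_MS / FORWARD_MS, 1))
```

Under these assumptions the numbers reproduce the tiles: about 394 tok/s, a 47% / 53% / 0% split, and a backward/forward ratio of about 1.1. The three phases sum to 15,539 ms, slightly under the 15,595 ms average step time.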
Loss Curve
Symbio semantics: this chart stitches many fresh candidate evaluations onto one global step axis. Loss resets near switches are expected because candidates are re-initialized. Compare local candidate shapes and the global frontier, not a single continuous model trajectory.
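One way to read the "global frontier" mentioned above is as a running minimum over the stitched candidate segments. This is an assumption about the overlay, not something the dashboard defines. The sketch uses each candidate's best loss in evaluation order, taken from the Activation Switch Log plus the candidate currently running.

```python
from itertools import accumulate

# (candidate, best loss) in the order the candidates were evaluated.
segments = [
    ("gen0_gelu_1", 8.1062),
    ("gen0_silu_2", 8.1295),
    ("gen0_relu_3", 7.9298),
    ("gen0_swiglu_4", 8.0675),
    ("gen0_universal_5", 7.9004),
    ("gen0_kan_spline_6", 7.9111),
    ("gen1_universal_7", 7.9278),
]


def global_frontier(losses):
    """Running minimum: the best loss seen so far across all candidates."""
    return list(accumulate(losses, min))


if __name__ == "__main__":
    frontier = global_frontier([loss for _, loss in segments])
    for (name, loss), best in zip(segments, frontier):
        print(f"{name:18s} local best {loss:.4f}  frontier {best:.4f}")
```

The local values jump up and down at each switch because every candidate starts from a fresh initialization, while the frontier only ever moves down.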
Search semantics
Validation / selection
Run diagnostics
Search Trajectory + Frontier
Candidate-local train/val loss segments on a shared step axis, with switch events and global frontier overlays.
Search-aware view
Architecture
Layers: 8
Embedding: 384
Heads: 8
Vocab: 4,000
Context: 512
Dropout: 0.1
Parameters: 17.44M
Training Config
Total iters: 50,000
Batch size: 12
Max LR: 0.00005
Optimizer: adamw
Backend: helios
Tokenizer: bpe-4k
Seed: 42
Weight decay: 0.1
Grad clip: 1
Eval interval: 1000
GPU & VRAM
Learning Rate
Grad Norm
Step Time Breakdown
Clip Telemetry
Symbiogenesis · SWIGLU
Wt Entropy: 1.78 bits
Eff. Rank: 20.0
Free Energy: 7.9952
Pop Entropy: 3.903 nats
Complexity: 0.0747
Fitness: 0.0367
CUSUM Alerts: 155 of 186 steps
Batch Size: 8 (adaptive)
CUSUM Change-Point Monitor
Weight Entropy
Effective Rank
Free Energy
Fitness Score
Population Entropy
Adaptive Batch Size
Phase Change / Gelation
Current: Transitioning
Stability: 0%
Phase Changes: 4
Regime Shifts: 0
Training dynamics are shifting. The model may be entering a new loss basin or the learning rate is hitting a critical threshold. This often happens before a breakthrough or a plateau.
Phase Timeline
Step 1 to Step 166
Loss Oscillation (Harmonic Analysis)
Evolutionary Search
Generations: 2
Candidates: 7
Activations: 6
Best Loss: 7.9004
Total Steps: 186
#  Candidate          Activation  Gen  Loss    Fitness  Steps  Mutation
1  gen0_universal_5   universal   0    7.9004  0.0340   24     origin
2  gen0_kan_spline_6  kan_spline  0    7.9111  0.0369   16     origin
3  gen1_universal_7   universal   1    7.9278  0.0367   26     origin
4  gen0_relu_3        relu        0    7.9298  0.0369   30     origin
5  gen0_swiglu_4      swiglu      0    8.0675  0.0343   30     origin
6  gen0_gelu_1        gelu        0    8.1062  0.0341   30     origin
7  gen0_silu_2        silu        0    8.1295  0.0349   30     origin
Generation Summary
G0: 6 candidates, best loss 7.9004
G1: 1 candidate, best loss 7.9278
Fitness Progression
Architecture Diversity
Convergence vs Diversity (Tug-of-War)
Current Mode: Exploration Dominant
Diversity Pressure: 59%
Convergence Momentum: 0%
Convergence Progress: 100%
Phase Portrait: Diversity Pressure vs Convergence Momentum
Low diversity / high momentum = lock-in convergence
High diversity / high momentum = productive exploration
Low diversity / low momentum = stalled collapse
High diversity / low momentum = diversity stalling convergence
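The four quadrants above amount to a two-threshold classification. A minimal sketch, assuming both quantities are normalized to [0, 1] and using a hypothetical cut-off of 0.5 (the dashboard does not state where it draws the line between "low" and "high"):

```python
def classify_search_state(diversity, momentum, threshold=0.5):
    """Map (diversity pressure, convergence momentum) to a quadrant label.

    `threshold` is a hypothetical cut-off chosen for illustration only.
    """
    high_diversity = diversity >= threshold
    high_momentum = momentum >= threshold
    if high_diversity and high_momentum:
        return "productive exploration"
    if high_diversity:
        return "diversity stalling convergence"
    if high_momentum:
        return "lock-in convergence"
    return "stalled collapse"


if __name__ == "__main__":
    # Values currently shown: diversity pressure 59%, convergence momentum 0%.
    print(classify_search_state(0.59, 0.0))
```

With the values currently shown (diversity pressure 59%, convergence momentum 0%), this sketch lands in the high diversity / low momentum quadrant.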
Tug-of-War Trace (Time Domain)
Positive tension means recent frontier improvement is outpacing diversity pressure (search is converging). Negative tension means exploration pressure is dominating recent convergence momentum (search is broadening or getting “stumped”).
Strongest Convergence: step 9, tension 0.200
Strongest Diversity Push: step 69, tension -0.920
Best Frontier: 7.9004 (progress 100%)
Evolutionary Lineage Tree
Activation Flow (Sankey)
Activation Switch Log
Step  From        To          Gen  Prev Steps  Best Loss  Final Loss  Fitness
1     -           gelu        0    -           -          -           -
31    gelu        silu        0    30          8.1062     8.1110      0.0341
61    silu        relu        0    30          8.1295     8.1295      0.0349
91    relu        swiglu      0    30          7.9298     7.9318      0.0369
121   swiglu      universal   0    30          8.0675     8.0675      0.0343
152   universal   kan_spline  0    24          7.9004     7.9149      0.0340
181   kan_spline  universal   1    16          7.9111     7.9111      0.0369
Search Candidates
#  Name               Activation  Gen  Parent  Steps  Best Loss  Best Val  Avg Loss  Fitness  Avg tok/s  Alerts
1  gen0_universal_5   universal   0    -       24     7.9004     -         8.1068    0.0340   394        24
2  gen0_kan_spline_6  kan_spline  0    -       16     7.9111     -         8.1146    0.0369   180        16
3  gen1_universal_7   universal   1    -       26     7.9278     -         8.1159    0.0367   394        26
4  gen0_relu_3        relu        0    -       30     7.9298     -         8.0959    0.0369   7,230      30
5  gen0_swiglu_4      swiglu      0    -       30     8.0675     -         8.2667    0.0343   2,234      19
6  gen0_gelu_1        gelu        0    -       30     8.1062     -         8.2366    0.0341   7,313      22
7  gen0_silu_2        silu        0    -       30     8.1295     -         8.2695    0.0349   2,329      18
Activation Distribution
universal: 50 (27%)
gelu: 30 (16%)
silu: 30 (16%)
relu: 30 (16%)
swiglu: 30 (16%)
kan_spline: 16 (9%)
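The distribution above appears to be the per-activation sum of candidate steps over the 186 total steps. The sketch below checks that reading against the Steps column of the Search Candidates table; the function name is illustrative, and whole-percent rounding is an assumption.

```python
from collections import Counter

# (activation, steps) per candidate, from the Search Candidates table.
candidate_steps = [
    ("universal", 24), ("kan_spline", 16), ("universal", 26),
    ("relu", 30), ("swiglu", 30), ("gelu", 30), ("silu", 30),
]


def activation_shares(rows):
    """Return {activation: (total steps, rounded percent of all steps)}."""
    totals = Counter()
    for activation, steps in rows:
        totals[activation] += steps
    grand_total = sum(totals.values())
    return {
        name: (steps, round(100 * steps / grand_total))
        for name, steps in totals.items()
    }


if __name__ == "__main__":
    for name, (steps, pct) in activation_shares(candidate_steps).items():
        print(f"{name}: {steps} ({pct}%)")
```

The two universal candidates (24 and 26 steps) account for the 50-step entry, which is why universal leads the distribution even though no single universal candidate ran the full 30 steps.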
Oscillation & Heat Capacity
Activation Evolution Radial
Symbio Config
{
  "cusumSensitivity": 4,
  "cusumBaselineWindow": 5,
  "metricsInterval": 10,
  "trackWeightEntropy": true,
  "trackEffectiveRank": true,
  "trackFreeEnergy": true,
  "trackMIProfiles": true,
  "trackPopulationMetrics": true,
  "freeEnergyBeta": 0.01,
  "miNumBins": 30,
  "adaptiveBatch": true,
  "batchMin": 8,
  "batchMax": 64,
  "batchStep": 4,
  "calmStepsBeforeRestore": 200,
  "fitnessAlpha": 1,
  "complexityMode": "entropy",
  "diversityBonus": 0.05,
  "diversityDecay": "cosine",
  "searchMode": "ffn-activation-search",
  "activationPool": [
    "gelu",
    "silu",
    "relu",
    "swiglu",
    "universal",
    "kan_spline"
  ],
  "searchStrategy": "evolutionary",
  "populationSize": 6,
  "generations": 12,
  "selectionStrategy": "topk",
  "tournamentK": 3,
  "mutationRate": 0.5,
  "stepsPerCandidate": 30,
  "rankBy": "valLoss",
  "perfWeight": 0,
  "stabilityWeight": 0,
  "writeReport": true,
  "writeCandidates": true,
  "writeSummary": true
}
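To make `cusumSensitivity` and `cusumBaselineWindow` concrete, here is a minimal two-sided CUSUM sketch. It is a textbook-style detector written under stated assumptions, not the monitor this dashboard runs: the baseline mean and standard deviation come from the first `baseline_window` samples, the sensitivity is treated as a threshold in baseline standard deviations, no slack term is used, and the sums reset after each alert. The function name and sample data are hypothetical.

```python
from statistics import mean, pstdev


def cusum_alerts(values, baseline_window=5, sensitivity=4.0):
    """Return indices where a two-sided CUSUM exceeds the threshold."""
    baseline = values[:baseline_window]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1e-12  # guard against a flat baseline
    pos = neg = 0.0
    alerts = []
    for i in range(baseline_window, len(values)):
        z = (values[i] - mu) / sigma
        pos = max(0.0, pos + z)  # accumulates upward drift
        neg = max(0.0, neg - z)  # accumulates downward drift
        if pos > sensitivity or neg > sensitivity:
            alerts.append(i)
            pos = neg = 0.0  # restart accumulation after an alert
    return alerts


if __name__ == "__main__":
    # Hypothetical metric trace with a level shift at index 7.
    trace = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.05, 3.0, 3.0]
    print(cusum_alerts(trace))
```

In this sketch a short baseline window gives a small variance estimate, so even modest later drift crosses the threshold quickly.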
Checkpoints (0)
No checkpoints saved
{
  "vocabSize": 4000,
  "blockSize": 512,
  "nLayer": 8,
  "nEmbd": 384,
  "nHead": 8,
  "dropout": 0.1,
  "ffnActivation": "swiglu",
  "ffnDim": 1024
}
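The 17.44M parameter figure can be approximated from the model config above. The layout in this sketch is an assumption, not something the dashboard states: learned positional embeddings, bias-free attention and FFN projections, a three-matrix SwiGLU FFN, two LayerNorms per block with scale and shift, a final LayerNorm, and an untied output head. It was chosen because it reproduces the reported total; other layouts could round to the same number.

```python
def estimate_params(vocab_size, block_size, n_layer, n_embd, ffn_dim):
    """Rough parameter count for an assumed GPT-style decoder layout."""
    tok_emb = vocab_size * n_embd        # token embedding table
    pos_emb = block_size * n_embd        # learned positions (assumed)
    attn = 4 * n_embd * n_embd           # Q, K, V, output projections, no biases
    ffn = 3 * n_embd * ffn_dim           # SwiGLU: gate, up, down projections
    norms = 2 * (2 * n_embd)             # two LayerNorms per block (scale + shift)
    final_norm = 2 * n_embd
    lm_head = vocab_size * n_embd        # untied output projection (assumed)
    per_layer = attn + ffn + norms
    return tok_emb + pos_emb + n_layer * per_layer + final_norm + lm_head


if __name__ == "__main__":
    total = estimate_params(vocab_size=4000, block_size=512,
                            n_layer=8, n_embd=384, ffn_dim=1024)
    print(total, f"{total / 1e6:.2f}M")
```

Under these assumptions the count is 17,437,440, which rounds to the 17.44M shown. Candidates that swap in a different FFN activation may have a different FFN shape, so the total is tied to the `swiglu` setting in this config.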
{
  "iters": 50000,
  "batchSize": 12,
  "lr": 0.00005,
  "lrMin": 0.000005,
  "warmupIters": 1000,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 0.000001,
  "weightDecay": 0.1,
  "gradClip": 1,
  "evalInterval": 1000,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe-4k",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 500,
  "spikeThreshold": 10,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": true,
  "symbioConfig": {
    "cusumSensitivity": 4,
    "cusumBaselineWindow": 5,
    "metricsInterval": 10,
    "trackWeightEntropy": true,
    "trackEffectiveRank": true,
    "trackFreeEnergy": true,
    "trackMIProfiles": true,
    "trackPopulationMetrics": true,
    "freeEnergyBeta": 0.01,
    "miNumBins": 30,
    "adaptiveBatch": true,
    "batchMin": 8,
    "batchMax": 64,
    "batchStep": 4,
    "calmStepsBeforeRestore": 200,
    "fitnessAlpha": 1,
    "complexityMode": "entropy",
    "diversityBonus": 0.05,
    "diversityDecay": "cosine",
    "searchMode": "ffn-activation-search",
    "activationPool": [
      "gelu",
      "silu",
      "relu",
      "swiglu",
      "universal",
      "kan_spline"
    ],
    "searchStrategy": "evolutionary",
    "populationSize": 6,
    "generations": 12,
    "selectionStrategy": "topk",
    "tournamentK": 3,
    "mutationRate": 0.5,
    "stepsPerCandidate": 30,
    "rankBy": "valLoss",
    "perfWeight": 0,
    "stabilityWeight": 0,
    "writeReport": true,
    "writeCandidates": true,
    "writeSummary": true
  }
}
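The learning rate tile (1.44e-5 at step 208) is consistent with a linear warmup from `lrMin` to `lr` over `warmupIters`. That schedule is an inference from the numbers, not something the config states, and the sketch does not model whatever decay follows warmup.

```python
def warmup_lr(step, lr_max=0.00005, lr_min=0.000005, warmup_iters=1000):
    """Assumed schedule: linear ramp from lr_min to lr_max during warmup."""
    if step >= warmup_iters:
        return lr_max  # post-warmup decay is not modeled in this sketch
    return lr_min + (lr_max - lr_min) * step / warmup_iters


if __name__ == "__main__":
    print(f"{warmup_lr(208):.2e}")
```

A ramp that started from zero instead would give 1.04e-5 at step 208, which does not match the tile.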