Alpha
super_chat_20260225205047_befd · active · chat · 17.44M params · 10m 35s elapsed · ~59h 3m remaining
8L / 384D / 8H · helios · bpe-4k · adamw · Created Feb 25, 2026 8:51 PM
Step 266 / 50,000 · 0.5%
Loss: 6.9074
Best Loss: 6.9074 (-17.6% from start)
Val Loss: -
Learning Rate: 1.70e-5
Throughput: 3,326 tok/s (avg)
Speed: 4,275 ms/iter (avg)
Grad Norm: 0.863 (avg: 1.102)
Tokens processed: 2.72M
Forward: 336ms (8% of step)
Backward: 3890ms (91% of step)
GPU Sync: 11ms (0% of step)
GPU Ops per step: 861
MFU (model FLOPS util): 0.8%
Bwd/Fwd ratio: 11.6x
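The derived figures in this panel can be cross-checked with a little arithmetic. The sketch below is only a consistency check, not the dashboard's own code: it assumes tokens processed equal step × batch size × context length (20 and 512, from the Architecture and Training Config panels), and that the per-phase percentages are taken against the average ms/iter. Both assumptions happen to reproduce the displayed values.

```python
# Values copied from the dashboard; the formulas are assumptions.
step = 266
batch_size = 20       # Training Config: Batch size
context = 512         # Architecture: Context
avg_step_ms = 4275    # Speed: ms/iter (avg)
phase_ms = {"forward": 336, "backward": 3890, "gpu_sync": 11}

# Assumed: every step consumes one full batch of full-length sequences.
tokens = step * batch_size * context

# Backward time divided by forward time.
bwd_fwd_ratio = round(phase_ms["backward"] / phase_ms["forward"], 1)

# Assumed: each share is measured against the average step time.
shares = {name: round(100 * ms / avg_step_ms) for name, ms in phase_ms.items()}

print(f"tokens processed: {tokens / 1e6:.2f}M")
print(f"bwd/fwd ratio: {bwd_fwd_ratio}x")
print(f"shares of step (%): {shares}")
```

Dividing by the sum of the three phases instead of the average step time would put the backward share at about 92%, so the average-step denominator is the better match for the 91% shown.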
Loss Curve
Symbio semantics: this chart stitches many fresh candidate evaluations onto one global step axis. Loss resets near switches are expected because candidates are re-initialized. Compare local candidate shapes and the global frontier, not a single continuous model trajectory.
Search semantics
Validation / selection
Run diagnostics
Search Trajectory + Frontier
Candidate-local train/val loss segments on a shared step axis, with switch events and global frontier overlays.
Search-aware view
Architecture
Layers: 8
Embedding: 384
Heads: 8
Vocab: 4,000
Context: 512
Dropout: 0.1
Parameters: 17.44M
Training Config
Total iters: 50,000
Batch size: 20
Max LR: 0.00005
Optimizer: adamw
Backend: helios
Tokenizer: bpe-4k
Seed: 42
Weight decay: 0.1
Grad clip: 1
Eval interval: 500
GPU & VRAM
Learning Rate
Grad Norm
Step Time Breakdown
Clip Telemetry
Symbiogenesis · SWIGLU
Wt Entropy: 1.80 bits
Eff. Rank: 20.0
Free Energy: 7.0142
Pop Entropy: 3.909 nats
Complexity: 0.0756
Fitness: 0.0495
CUSUM Alerts: 260 of 266 steps
Batch Size (adaptive): -
CUSUM Change-Point Monitor
Weight Entropy
Effective Rank
Free Energy
Fitness Score
Population Entropy
Phase Change / Gelation
Current: Transitioning
Stability: 0%
Phase Changes: 6
Regime Shifts: 0
Training dynamics are shifting. The model may be entering a new loss basin or the learning rate is hitting a critical threshold. This often happens before a breakthrough or a plateau.
Phase Timeline (Step 1 to Step 251)
Loss Oscillation (Harmonic Analysis)
Evolutionary Search
Generations: 2
Candidates: 11
Activations: 6
Best Loss: 6.9074
Total Steps: 266
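# | Candidate | Activation | Gen | Loss | Fitness | Steps | Mutation
1 | 0.76·silu+0.24·id-Alpha.1 | 0.76·silu+0.24·id | 1 | 6.9074 | 0.0495 | 16 | inject_residual
2 | relu-Beta.1 | relu | 1 | 7.0117 | 0.0486 | 25 | swap_basis
3 | silu-Alpha.1 | silu | 1 | 7.1545 | 0.0470 | 25 | clone
4 | gelu-Theta | gelu | 0 | 7.2955 | 0.0443 | 25 | origin
5 | relu-Eta | relu | 0 | 7.4089 | 0.0427 | 25 | origin
6 | silu-Zeta | silu | 0 | 7.5372 | 0.0414 | 25 | origin
7 | sq-Epsilon | sq | 0 | 7.6616 | 0.0393 | 25 | origin
8 | id-Delta | id | 0 | 7.7277 | 0.0386 | 25 | origin
9 | gelu-Gamma | gelu | 0 | 7.8195 | 0.0373 | 25 | origin
10 | relu-Beta | relu | 0 | 7.9363 | 0.0360 | 25 | origin
11 | silu-Alpha | silu | 0 | 8.1392 | 0.0330 | 25 | origin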
Generation Summary
G0: 8 candidates, best loss 7.2955
G1: 3 candidates, best loss 6.9074
Fitness Progression
Architecture Diversity
Convergence vs Diversity (Tug-of-War)
Current Mode: Converging
Diversity Pressure: 66%
Convergence Momentum: 100%
Convergence Progress: 100%
Phase Portrait: Diversity Pressure vs Convergence Momentum
Low diversity / high momentum = lock-in convergence
High diversity / high momentum = productive exploration
Low diversity / low momentum = stalled collapse
High diversity / low momentum = diversity stalling convergence
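The four quadrants above can be read as a simple two-threshold classification. The sketch below is illustrative only: the function name `classify_search_state` and the 0.5 cut-off between "low" and "high" are hypothetical, since the dashboard does not state where it draws that boundary, and the result is not the same thing as the "Current Mode" indicator.

```python
def classify_search_state(diversity: float, momentum: float,
                          threshold: float = 0.5) -> str:
    """Map (diversity pressure, convergence momentum), both in [0, 1],
    to one of the four phase-portrait quadrants.

    The threshold is an assumed cut-off, not a documented value.
    """
    high_diversity = diversity >= threshold
    high_momentum = momentum >= threshold
    if high_momentum:
        return "productive exploration" if high_diversity else "lock-in convergence"
    return "diversity stalling convergence" if high_diversity else "stalled collapse"


# Current readings from the panel above: 66% diversity pressure, 100% momentum.
print(classify_search_state(0.66, 1.00))
```

Under the assumed 0.5 cut-off, the current readings fall in the high diversity / high momentum quadrant.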
Tug-of-War Trace (Time Domain)
Positive tension means recent frontier improvement is outpacing diversity pressure (search is converging). Negative tension means exploration pressure is dominating recent convergence momentum (search is broadening or getting “stumped”).
Strongest Convergence: step 22 (tension 0.500)
Strongest Diversity Push: step 1 (tension -0.500)
Best Frontier: 6.9074 (progress 100%)
Evolutionary Lineage Tree
Activation Flow (Sankey)
Activation Switch Log
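Step | From | To | Gen | Prev Steps | Best Loss | Final Loss | Fitness
1 | - | silu | 0 | - | - | - | -
26 | silu | relu | 0 | 25 | 8.1392 | 8.1392 | 0.0330
51 | relu | gelu | 0 | 25 | 7.9363 | 7.9613 | 0.0360
76 | gelu | id | 0 | 25 | 7.8195 | 7.8224 | 0.0373
101 | id | sq | 0 | 25 | 7.7277 | 7.7668 | 0.0386
126 | sq | silu | 0 | 25 | 7.6616 | 7.6616 | 0.0393
151 | silu | relu | 0 | 25 | 7.5372 | 7.5490 | 0.0414
176 | relu | gelu | 0 | 25 | 7.4089 | 7.4267 | 0.0427
201 | gelu | silu | 1 | 25 | 7.2955 | 7.3439 | 0.0443
226 | silu | relu | 1 | 25 | 7.1545 | 7.1659 | 0.0470
251 | relu | 0.76·silu+0.24·id | 1 | 25 | 7.0117 | 7.0557 | 0.0486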
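The "global frontier" mentioned in the Loss Curve notes can be illustrated from this log. The sketch below assumes the frontier is simply the running minimum of each finished segment's best loss, followed by the best loss of the candidate still running at step 266; the dashboard does not spell out its exact definition, so treat this as one plausible reading.

```python
from itertools import accumulate

# (switch step, activation of the segment that just ended, its best loss),
# copied from the Activation Switch Log.
finished_segments = [
    (26, "silu", 8.1392), (51, "relu", 7.9363), (76, "gelu", 7.8195),
    (101, "id", 7.7277), (126, "sq", 7.6616), (151, "silu", 7.5372),
    (176, "relu", 7.4089), (201, "gelu", 7.2955), (226, "silu", 7.1545),
    (251, "relu", 7.0117),
]
current_best = 6.9074  # 0.76·silu+0.24·id candidate, still running at step 266

best_losses = [loss for _, _, loss in finished_segments] + [current_best]

# Assumed definition: frontier = best loss seen so far across all segments.
frontier = list(accumulate(best_losses, min))

# A segment "advances" the frontier when it beats everything before it.
advanced = [i == 0 or best_losses[i] < frontier[i - 1]
            for i in range(len(best_losses))]

print(frontier[-1], all(advanced))
```

In this run every segment's best loss is lower than the one before it, so the frontier coincides with the per-segment best losses and ends at the Best Frontier value of 6.9074.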
Search Candidates
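# | Name | Activation | Gen | Parent | Steps | Best Loss | Best Val | Avg Loss | Fitness | Avg tok/s | Alerts
1 | 0.76·silu+0.24·id-Alpha.1 | 0.76·silu+0.24·id | 1 | silu-Alpha | 16 | 6.9074 | - | 7.0483 | 0.0495 | 2,031 | 16
2 | relu-Beta.1 | relu | 1 | relu-Beta | 25 | 7.0117 | - | 7.1029 | 0.0486 | 8,248 | 25
3 | silu-Alpha.1 | silu | 1 | silu-Alpha | 25 | 7.1545 | - | 7.2250 | 0.0470 | 2,261 | 25
4 | gelu-Theta | gelu | 0 | - | 25 | 7.2955 | - | 7.3797 | 0.0443 | 8,267 | 25
5 | relu-Eta | relu | 0 | - | 25 | 7.4089 | - | 7.4905 | 0.0427 | 8,198 | 25
6 | silu-Zeta | silu | 0 | - | 25 | 7.5372 | - | 7.6065 | 0.0414 | 2,328 | 25
7 | sq-Epsilon | sq | 0 | - | 25 | 7.6616 | - | 7.7728 | 0.0393 | 8,046 | 25
8 | id-Delta | id | 0 | - | 25 | 7.7277 | - | 7.8013 | 0.0386 | 8,360 | 25
9 | gelu-Gamma | gelu | 0 | - | 25 | 7.8195 | - | 7.8926 | 0.0373 | 8,120 | 25
10 | relu-Beta | relu | 0 | - | 25 | 7.9363 | - | 7.9903 | 0.0360 | 8,184 | 25
11 | silu-Alpha | silu | 0 | - | 25 | 8.1392 | - | 8.2724 | 0.0330 | 2,295 | 19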
Activation Distribution
silu: 75 (28%)
relu: 75 (28%)
gelu: 50 (19%)
id: 25 (9%)
sq: 25 (9%)
0.76·silu+0.24·id: 16 (6%)
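The distribution above, the Generation Summary, and the CUSUM alert total can all be re-derived from the Search Candidates table. The sketch below assumes the counts are training steps per activation, and that the percentages are each activation's share of the 266 total steps, rounded to the nearest integer. That interpretation is inferred from the numbers rather than stated on the page.

```python
from collections import Counter, defaultdict

# (name, activation, generation, steps, best_loss, alerts)
# copied from the Search Candidates table.
candidates = [
    ("0.76·silu+0.24·id-Alpha.1", "0.76·silu+0.24·id", 1, 16, 6.9074, 16),
    ("relu-Beta.1", "relu", 1, 25, 7.0117, 25),
    ("silu-Alpha.1", "silu", 1, 25, 7.1545, 25),
    ("gelu-Theta", "gelu", 0, 25, 7.2955, 25),
    ("relu-Eta", "relu", 0, 25, 7.4089, 25),
    ("silu-Zeta", "silu", 0, 25, 7.5372, 25),
    ("sq-Epsilon", "sq", 0, 25, 7.6616, 25),
    ("id-Delta", "id", 0, 25, 7.7277, 25),
    ("gelu-Gamma", "gelu", 0, 25, 7.8195, 25),
    ("relu-Beta", "relu", 0, 25, 7.9363, 25),
    ("silu-Alpha", "silu", 0, 25, 8.1392, 19),
]

steps_by_activation = Counter()
losses_by_generation = defaultdict(list)
for _name, activation, gen, steps, best_loss, _alerts in candidates:
    steps_by_activation[activation] += steps
    losses_by_generation[gen].append(best_loss)

total_steps = sum(steps_by_activation.values())

# activation -> (steps, share of all steps in whole percent)
distribution = {act: (n, round(100 * n / total_steps))
                for act, n in steps_by_activation.items()}

# generation -> (number of candidates, best loss in that generation)
generation_summary = {gen: (len(losses), min(losses))
                      for gen, losses in losses_by_generation.items()}

total_alerts = sum(alerts for *_rest, alerts in candidates)

print(total_steps, total_alerts)
print(distribution)
print(generation_summary)
```

The rounded shares add up to 99% rather than 100%, which is ordinary rounding loss and matches the figures shown.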
Oscillation & Heat Capacity
Activation Evolution Radial
Symbio Config
{
  "cusumSensitivity": 4,
  "cusumBaselineWindow": 5,
  "metricsInterval": 10,
  "trackWeightEntropy": true,
  "trackEffectiveRank": true,
  "trackFreeEnergy": true,
  "trackMIProfiles": false,
  "trackPopulationMetrics": true,
  "freeEnergyBeta": 0.01,
  "miNumBins": 30,
  "adaptiveBatch": false,
  "batchMin": 8,
  "batchMax": 64,
  "batchStep": 4,
  "calmStepsBeforeRestore": 200,
  "populationAdaptation": true,
  "populationScaleMin": 0.5,
  "populationScaleMax": 2,
  "populationScaleStep": 0.125,
  "populationAdaptationCooldown": 10,
  "mutationRateMin": 0.2,
  "mutationRateMax": 0.95,
  "fitnessAlpha": 1,
  "complexityMode": "entropy",
  "diversityBonus": 0.1,
  "diversityDecay": "cosine",
  "searchMode": "composed-activation-search",
  "activationPool": [
    "gelu",
    "relu",
    "silu",
    "swiglu",
    "universal",
    "kan_spline"
  ],
  "searchStrategy": "evolutionary",
  "populationSize": 8,
  "generations": 250,
  "selectionStrategy": "topk",
  "tournamentK": 3,
  "mutationRate": 0.7,
  "stepsPerCandidate": 25,
  "rankBy": "valLoss",
  "perfWeight": 0,
  "stabilityWeight": 0,
  "preserveWeightsAcrossCandidates": true,
  "carryOptimizerStateAcrossCandidates": true,
  "constantFfnDimAcrossCandidates": true,
  "fuseWeightsEachStep": true,
  "fusionShadowEma": 0.02,
  "fusionBaseStrength": 0.0015,
  "fusionMaxStrength": 0.02,
  "kuramotoCoupling": 0.7,
  "kuramotoDt": 0.1,
  "kuramotoDamping": 0.05,
  "writeReport": true,
  "writeCandidates": true,
  "writeSummary": true,
  "basisPool": [
    "silu",
    "relu",
    "gelu",
    "identity",
    "square"
  ],
  "maxGraphDepth": 4,
  "maxGraphNodes": 10
}
Checkpoints (0)
No checkpoints saved
{
  "vocabSize": 4000,
  "blockSize": 512,
  "nLayer": 8,
  "nEmbd": 384,
  "nHead": 8,
  "dropout": 0.1,
  "ffnActivation": "swiglu",
  "ffnDim": 1024
}
{
  "iters": 50000,
  "batchSize": 20,
  "lr": 0.00005,
  "lrMin": 0.000005,
  "warmupIters": 1000,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 0.000001,
  "weightDecay": 0.1,
  "gradClip": 1,
  "evalInterval": 500,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe-4k",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 500,
  "spikeThreshold": 10,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": true,
  "symbioConfig": {
    "cusumSensitivity": 4,
    "cusumBaselineWindow": 5,
    "metricsInterval": 10,
    "trackWeightEntropy": true,
    "trackEffectiveRank": true,
    "trackFreeEnergy": true,
    "trackMIProfiles": false,
    "trackPopulationMetrics": true,
    "freeEnergyBeta": 0.01,
    "miNumBins": 30,
    "adaptiveBatch": false,
    "batchMin": 8,
    "batchMax": 64,
    "batchStep": 4,
    "calmStepsBeforeRestore": 200,
    "populationAdaptation": true,
    "populationScaleMin": 0.5,
    "populationScaleMax": 2,
    "populationScaleStep": 0.125,
    "populationAdaptationCooldown": 10,
    "mutationRateMin": 0.2,
    "mutationRateMax": 0.95,
    "fitnessAlpha": 1,
    "complexityMode": "entropy",
    "diversityBonus": 0.1,
    "diversityDecay": "cosine",
    "searchMode": "composed-activation-search",
    "activationPool": [
      "gelu",
      "relu",
      "silu",
      "swiglu",
      "universal",
      "kan_spline"
    ],
    "searchStrategy": "evolutionary",
    "populationSize": 8,
    "generations": 250,
    "selectionStrategy": "topk",
    "tournamentK": 3,
    "mutationRate": 0.7,
    "stepsPerCandidate": 25,
    "rankBy": "valLoss",
    "perfWeight": 0,
    "stabilityWeight": 0,
    "preserveWeightsAcrossCandidates": true,
    "carryOptimizerStateAcrossCandidates": true,
    "constantFfnDimAcrossCandidates": true,
    "fuseWeightsEachStep": true,
    "fusionShadowEma": 0.02,
    "fusionBaseStrength": 0.0015,
    "fusionMaxStrength": 0.02,
    "kuramotoCoupling": 0.7,
    "kuramotoDt": 0.1,
    "kuramotoDamping": 0.05,
    "writeReport": true,
    "writeCandidates": true,
    "writeSummary": true,
    "basisPool": [
      "silu",
      "relu",
      "gelu",
      "identity",
      "square"
    ],
    "maxGraphDepth": 4,
    "maxGraphNodes": 10
  }
}
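The learning rate shown at step 266 (1.70e-5) is consistent with a linear warmup from `lrMin` to `lr` over `warmupIters` steps. The sketch below is an inference from the config values, not the trainer's documented schedule: the function name `warmup_lr` is hypothetical, and whatever decay follows the warmup is not modelled.

```python
def warmup_lr(step: int, lr_max: float, lr_min: float, warmup_iters: int) -> float:
    """Linear ramp from lr_min to lr_max over warmup_iters steps.

    Assumed schedule shape; the post-warmup decay is deliberately left out.
    """
    if step >= warmup_iters:
        return lr_max
    return lr_min + (lr_max - lr_min) * step / warmup_iters


# Config values: lr = 0.00005, lrMin = 0.000005, warmupIters = 1000.
lr_at_266 = warmup_lr(266, lr_max=5e-5, lr_min=5e-6, warmup_iters=1000)
print(f"{lr_at_266:.2e}")
```

A warmup starting from zero would give 1.33e-5 at step 266, so the ramp starting from `lrMin` is the closer match to the value on the dashboard.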