Alpha
novels_all_20260225162239_kx12 · active · novels · 7.21M params · 15m 34s elapsed · ~14h 37m remaining
6L / 288D / 6H · helios · bpe · adamw · Created Feb 25, 2026 4:22 PM
Step 654 / 50,000 (1.3%)
Loss: 7.5762
Best Loss: 6.9151 (-1.0% from start)
Val Loss: 7.6607 (best: 7.6550)
Learning Rate: 3.00e-4
Throughput: 4,920 tok/s (avg)
Speed: 1,067 ms/iter (avg)
Grad Norm: 0.916 (avg: 0.582)
Tokens processed: 3.33M
Forward: 125ms (12% of step)
Backward: 866ms (81% of step)
GPU Sync: 19ms (2% of step)
GPU Ops: 584 per step
MFU (model FLOPS util): 0.7%
Bwd/Fwd ratio: 6.9x
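The headline tiles can be cross-checked with a few lines of arithmetic. The sketch below assumes the remaining-time estimate is simply the remaining steps multiplied by the average ms/iter, and that each phase percentage is taken against that same average step time; the dashboard does not state either formula, and the function names are illustrative.

```python
def eta_seconds(step: int, total_steps: int, ms_per_iter: float) -> float:
    """Remaining wall-clock time, assuming the average step time holds."""
    return (total_steps - step) * ms_per_iter / 1000.0


def format_hm(seconds: float) -> str:
    """Format seconds as 'Hh MMm', truncating to whole minutes."""
    minutes = int(seconds // 60)
    return f"{minutes // 60}h {minutes % 60:02d}m"


def share_of_step(phase_ms: float, step_ms: float) -> int:
    """Phase time as a rounded percentage of the full step time."""
    return round(100.0 * phase_ms / step_ms)


# Values copied from the tiles above.
remaining = eta_seconds(step=654, total_steps=50_000, ms_per_iter=1067)
print(format_hm(remaining))            # remaining-time estimate
print(share_of_step(125, 1067))        # forward share of the step
print(share_of_step(866, 1067))        # backward share of the step
print(share_of_step(19, 1067))         # GPU sync share of the step
print(round(866 / 125, 1))             # backward/forward ratio
```

Under these assumptions the numbers agree with the tiles (about 14h 37m remaining, 12% / 81% / 2%, ratio 6.9), which suggests the tiles are mutually consistent rather than confirming how the dashboard computes them.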
Loss Curve
Symbio semantics: this chart stitches many fresh candidate evaluations onto one global step axis. Loss resets near switches are expected because candidates are re-initialized. Compare local candidate shapes and the global frontier, not a single continuous model trajectory.
Search semantics
Validation / selection
Run diagnostics
Search Trajectory + Frontier
Candidate-local train/val loss segments on a shared step axis, with switch events and global frontier overlays.
Search-aware view
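To make the stitched view concrete, here is a minimal sketch of how candidate-local loss segments could be placed on one global step axis and how a global frontier (running best loss) could be overlaid. The segment layout, the function name, and the toy loss values are assumptions for illustration; they are not taken from this run or from the dashboard's implementation.

```python
from typing import List, Tuple

Segment = Tuple[int, List[float]]  # (global start step, per-step losses)


def stitch_and_frontier(segments: List[Segment]):
    """Place each candidate segment on the global axis and track the best loss so far."""
    stitched = []
    for start, losses in segments:
        for offset, loss in enumerate(losses):
            stitched.append((start + offset, loss))
    stitched.sort(key=lambda point: point[0])

    frontier = []
    best = float("inf")
    for step, loss in stitched:
        best = min(best, loss)          # the frontier never goes back up
        frontier.append((step, best))
    return stitched, frontier


# Toy example: the second candidate restarts at a higher loss (a "reset"),
# then ends below the first candidate's best.
toy_segments = [(1, [7.5, 7.4]), (3, [7.6, 7.3])]
stitched, frontier = stitch_and_frontier(toy_segments)
print(stitched)
print(frontier)
```

The stitched series jumps upward at step 3 while the frontier stays flat, which is the pattern the chart note describes as an expected reset near a switch.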
Architecture
Layers: 6
Embedding: 288
Heads: 6
Vocab: 2,000
Context: 256
Dropout: 0
Parameters: 7.21M
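The parameter count can be roughly reproduced from these dimensions. The sketch below is an estimate under several assumptions that the page does not confirm: learned positional embeddings, bias-free linear layers, two LayerNorms per block with weight and bias, an untied output head, and a SwiGLU feed-forward with three projection matrices. The feed-forward width of 768 and the swiglu activation come from the raw model config further down the page.

```python
def estimate_params(vocab: int, context: int, n_layer: int,
                    n_embd: int, ffn_dim: int) -> int:
    """Rough parameter count for a decoder-only transformer (assumed layout)."""
    tok_emb = vocab * n_embd
    pos_emb = context * n_embd              # assumption: learned positions
    attn = 4 * n_embd * n_embd              # Q, K, V and output projections, no biases
    ffn = 3 * n_embd * ffn_dim              # SwiGLU: gate, up and down projections
    norms = 2 * (2 * n_embd)                # two LayerNorms, weight + bias each
    per_layer = attn + ffn + norms
    final_norm = 2 * n_embd
    lm_head = vocab * n_embd                # assumption: not tied to tok_emb
    return tok_emb + pos_emb + n_layer * per_layer + final_norm + lm_head


total = estimate_params(vocab=2000, context=256, n_layer=6, n_embd=288, ffn_dim=768)
print(total, f"{total / 1e6:.2f}M")
```

This lands close to the reported 7.21M, but a matching total does not prove the layout; different combinations of biases and weight tying can give similar counts.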
Training Config
Total iters: 50,000
Batch size: 20
Max LR: 0.0003
Optimizer: adamw
Backend: helios
Tokenizer: bpe
Seed: 42
Weight decay: 0.1
Grad clip: 5
Eval interval: 100
GPU & VRAM
Learning Rate
Grad Norm
Step Time Breakdown
Clip Telemetry
Symbiogenesis · SWIGLU
Wt Entropy: 1.78 bits
Eff. Rank: 20.0
Free Energy: 7.0170
Pop Entropy: 3.894 nats
Complexity: 0.0791
Fitness: 0.0459
CUSUM Alerts: 638 of 650 steps
Batch Size: 12 (adaptive)
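For readers unfamiliar with CUSUM alerts, the following is a minimal sketch of a textbook two-sided CUSUM detector. It is not the dashboard's implementation: mapping `cusumSensitivity` to the alert threshold and `cusumBaselineWindow` to the number of samples used for the baseline is an assumption, and the `slack` parameter and the reset-after-alert behaviour are illustrative choices.

```python
from statistics import mean, pstdev
from typing import List


def cusum_alerts(values: List[float], baseline_window: int = 5,
                 threshold: float = 4.0, slack: float = 0.5) -> List[int]:
    """Return indices where the standardized cumulative drift exceeds the threshold."""
    baseline = values[:baseline_window]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1e-8     # guard against a perfectly flat baseline
    s_hi = s_lo = 0.0
    alerts = []
    for i in range(baseline_window, len(values)):
        z = (values[i] - mu) / sigma
        s_hi = max(0.0, s_hi + z - slack)    # accumulates upward drift
        s_lo = max(0.0, s_lo - z - slack)    # accumulates downward drift
        if s_hi > threshold or s_lo > threshold:
            alerts.append(i)
            s_hi = s_lo = 0.0                # restart accumulation after an alert
    return alerts


# Toy series (not from this run): stable around 1.0, then a jump to 2.0.
series = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0]
print(cusum_alerts(series))
```

A short baseline window gives a small standard deviation, so even modest shifts produce large standardized deviations; that is a property of this sketch and may or may not apply to the monitor shown here.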
CUSUM Change-Point Monitor
Weight Entropy
Effective Rank
Free Energy
Fitness Score
Population Entropy
Adaptive Batch Size
Phase Change / Gelation
Current: Transitioning
Stability: 0%
Phase Changes: 13
Regime Shifts: 7
Training dynamics are shifting. The model may be entering a new loss basin or the learning rate is hitting a critical threshold. This often happens before a breakthrough or a plateau.
Phase Timeline
Step 1 to Step 603
Loss Oscillation (Harmonic Analysis)
Evolutionary Search
Generations: 4
Candidates: 27
Activations: 12
Best Loss: 6.9151
Total Steps: 650
# | Candidate | Activation | Gen | Loss | Fitness | Steps | Mutation
1 | id-Delta.1.2 | id | 2 | 6.9151 | 0.0449 | 25 | perturb_scale
2 | gelu×gelu-Gamma.1.2 | gelu×gelu | 2 | 6.9476 | 0.0449 | 25 | prune
3 | relu-Beta.1.2 | relu | 2 | 6.9479 | 0.0454 | 25 | prune
4 | 0.68·relu+0.32·id-Beta.1.2 | 0.68·relu+0.32·id | 2 | 6.9490 | 0.0443 | 25 | inject_residual
5 | gelu-Delta.1 | gelu | 1 | 6.9561 | 0.0454 | 25 | inject_gate
6 | 0.85·silu+0.15·id-Alpha.1.2.3 | 0.85·silu+0.15·id | 3 | 6.9577 | 0.0456 | 25 | clone
7 | id-Delta.1.2 | id | 2 | 6.9702 | 0.0453 | 25 | perturb_scale
8 | silu-Alpha.1.2 | silu | 2 | 6.9729 | 0.0453 | 25 | clone
9 | 0.85·silu+0.15·id-Alpha.1.2 | 0.85·silu+0.15·id | 2 | 6.9859 | 0.0441 | 23 | inject_residual
10 | relu-Beta.1.2.3 | relu | 3 | 6.9878 | 0.0459 | 23 | prune
11 | gelu-Gamma.1 | gelu | 1 | 7.0034 | 0.0435 | 25 | perturb_scale
12 | gelu×gelu-Gamma.1 | gelu×gelu | 1 | 7.0275 | 0.0437 | 25 | inject_gate
13 | id-Delta.1 | id | 1 | 7.0682 | 0.0459 | 25 | clone
14 | 0.74·silu+0.26·id-Alpha.1 | 0.74·silu+0.26·id | 1 | 7.0714 | 0.0443 | 25 | inject_residual
15 | relu-Beta.1 | relu | 1 | 7.0752 | 0.0445 | 25 | swap_basis
16 | relu×gelu-Beta.1 | relu×gelu | 1 | 7.0821 | 0.0447 | 25 | inject_gate
17 | gelu-Theta | gelu | 0 | 7.1201 | 0.0441 | 25 | origin
18 | relu-Eta | relu | 0 | 7.1458 | 0.0421 | 25 | origin
19 | silu-Alpha.1 | silu | 1 | 7.1523 | 0.0436 | 25 | clone
20 | gelu×gelu×silu-Gamma.1.2 | gelu×gelu×silu | 2 | 7.1545 | 0.0405 | 25 | inject_gate
Showing top 20 of 27 candidates
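The activation names in the table read like small expressions over a basis pool, for example `0.68·relu+0.32·id` after `inject_residual` and `gelu×gelu` after `inject_gate`. The sketch below shows one way such compositions could be built. The semantics of the two mutations are inferred from the names in the table and are an assumption, as are the helper names; the actual search code is not shown on this page.

```python
import math
from typing import Callable

Activation = Callable[[float], float]


def silu(x: float) -> float:
    """Numerically stable x * sigmoid(x)."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def gelu(x: float) -> float:
    """Exact GELU using the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Basis functions named as in the table (id = identity, sq = square).
BASIS = {
    "silu": silu,
    "relu": lambda x: max(0.0, x),
    "gelu": gelu,
    "id": lambda x: x,
    "sq": lambda x: x * x,
}


def inject_residual(f: Activation, alpha: float) -> Activation:
    """Assumed meaning: blend f with the identity, alpha*f(x) + (1 - alpha)*x."""
    return lambda x: alpha * f(x) + (1.0 - alpha) * x


def inject_gate(f: Activation, g: Activation) -> Activation:
    """Assumed meaning: multiply two activations pointwise, f(x) * g(x)."""
    return lambda x: f(x) * g(x)


mixed = inject_residual(BASIS["relu"], 0.68)     # reads as 0.68·relu+0.32·id
gated = inject_gate(BASIS["gelu"], BASIS["gelu"])  # reads as gelu×gelu
print(mixed(-1.0), mixed(2.0), gated(1.0))
```

Note that the residual blend keeps a nonzero slope for negative inputs, which plain relu does not; whether that explains any of the loss differences above cannot be concluded from 25-step evaluations.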
Generation Summary
G0: 8 candidates, best loss 7.1201
G1: 8 candidates, best loss 6.9561
G2: 8 candidates, best loss 6.9151
G3: 3 candidates, best loss 6.9577
Fitness Progression
Architecture Diversity
Convergence vs Diversity (Tug-of-War)
Current Mode: Exploration Dominant
Diversity Pressure: 82%
Convergence Momentum: 0%
Convergence Progress: 100%
Phase Portrait: Diversity Pressure vs Convergence Momentum
Low diversity / high momentum = lock-in convergence
High diversity / high momentum = productive exploration
Low diversity / low momentum = stalled collapse
High diversity / low momentum = diversity stalling convergence
Tug-of-War Trace (Time Domain)
Positive tension means recent frontier improvement is outpacing diversity pressure (search is converging). Negative tension means exploration pressure is dominating recent convergence momentum (search is broadening or getting “stumped”).
Strongest Convergence: step 10, tension 0.500
Strongest Diversity Push: step 240, tension -0.820
Best Frontier: 6.9151, progress 100%
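The phase-portrait quadrants and the sign convention for tension can be sketched as two small functions. Both are assumptions: the page does not give the tension formula, so the sketch takes it to be convergence momentum minus diversity pressure, and the 0.5 cutoff separating "low" from "high" is an arbitrary illustrative choice.

```python
def tension(convergence_momentum: float, diversity_pressure: float) -> float:
    """Assumed definition: positive when convergence outpaces diversity pressure."""
    return convergence_momentum - diversity_pressure


def quadrant(diversity_pressure: float, convergence_momentum: float,
             cutoff: float = 0.5) -> str:
    """Map a (diversity, momentum) point to the quadrant labels listed above."""
    high_diversity = diversity_pressure >= cutoff
    high_momentum = convergence_momentum >= cutoff
    if high_diversity and high_momentum:
        return "productive exploration"
    if high_diversity:
        return "diversity stalling convergence"
    if high_momentum:
        return "lock-in convergence"
    return "stalled collapse"


# Current tiles: diversity pressure 82%, convergence momentum 0%.
print(quadrant(0.82, 0.0))
print(round(tension(0.0, 0.82), 3))
```

With the current tile values this difference comes out at -0.82, the same magnitude as the strongest diversity push reported above, which is compatible with the assumed formula but does not confirm it.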
Evolutionary Lineage Tree
Activation Flow (Sankey)
Activation Switch Log
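Step | From | To | Gen | Prev Steps | Best Loss | Final Loss | Fitness
1 | - | silu | 0 | - | - | - | -
26 | silu | relu | 0 | 25 | 7.4119 | 7.4119 | 0.0383
51 | relu | gelu | 0 | 25 | 7.3206 | 7.3206 | 0.0401
76 | gelu | id | 0 | 25 | 7.2793 | 7.2849 | 0.0404
101 | id | sq | 0 | 25 | 7.2933 | 7.2952 | 0.0414
126 | sq | silu | 0 | 25 | 7.2035 | 7.2035 | 0.0419
151 | silu | relu | 0 | 25 | 7.2644 | 7.2791 | 0.0416
176 | relu | gelu | 0 | 25 | 7.1458 | 7.1557 | 0.0421
201 | gelu | silu | 1 | 25 | 7.1201 | 7.1201 | 0.0441
226 | silu | relu | 1 | 25 | 7.1523 | 7.1625 | 0.0436
251 | relu | gelu×gelu | 1 | 25 | 7.0752 | 7.0752 | 0.0445
276 | gelu×gelu | id | 1 | 25 | 7.0275 | 7.0275 | 0.0437
301 | id | 0.74·silu+0.26·id | 1 | 25 | 7.0682 | 7.0682 | 0.0459
326 | 0.74·silu+0.26·id | relu×gelu | 1 | 25 | 7.0714 | 7.0714 | 0.0443
351 | relu×gelu | gelu | 1 | 25 | 7.0821 | 7.0821 | 0.0447
376 | gelu | gelu | 1 | 25 | 7.0034 | 7.0034 | 0.0435
401 | gelu | 0.85·silu+0.15·id | 2 | 25 | 6.9561 | 6.9706 | 0.0454
426 | 0.85·silu+0.15·id | relu | 2 | 23 | 6.9859 | 6.9859 | 0.0441
451 | relu | gelu×gelu | 2 | 25 | 6.9479 | 6.9479 | 0.0454
476 | gelu×gelu | id | 2 | 25 | 6.9476 | 6.9476 | 0.0449
501 | id | silu | 2 | 25 | 6.9151 | 6.9888 | 0.0449
526 | silu | 0.68·relu+0.32·id | 2 | 25 | 6.9729 | 6.9990 | 0.0453
551 | 0.68·relu+0.32·id | gelu×gelu×silu | 2 | 25 | 6.9490 | 6.9792 | 0.0443
576 | gelu×gelu×silu | id | 2 | 25 | 7.1545 | 7.1545 | 0.0405
601 | id | 0.85·silu+0.15·id | 3 | 25 | 6.9702 | 6.9702 | 0.0453
626 | 0.85·silu+0.15·id | relu | 3 | 25 | 6.9577 | 6.9577 | 0.0456
651 | relu | gelu×gelu+0.08·sq | 3 | 23 | 6.9878 | 6.9992 | 0.0459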
Search Candidates
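# | Name | Activation | Gen | Parent | Steps | Best Loss | Best Val | Avg Loss | Fitness | Avg tok/s | Alerts
1 | id-Delta.1.2 | id | 2 | id-Delta.1 | 25 | 6.9151 | 7.6550 | 7.2388 | 0.0449 | 5,307 | 25
2 | id-Delta.1 | id | 1 | id-Delta | 25 | 7.0682 | 7.6586 | 7.3066 | 0.0459 | 4,969 | 25
3 | id-Delta.1.2 | id | 2 | id-Delta.1 | 25 | 6.9702 | 7.6607 | 7.2313 | 0.0453 | 4,999 | 25
4 | gelu-Theta | gelu | 0 | - | 25 | 7.1201 | 7.6614 | 7.3412 | 0.0441 | 5,450 | 25
5 | gelu-Delta.1 | gelu | 1 | id-Delta | 25 | 6.9561 | 7.6661 | 7.2180 | 0.0454 | 5,494 | 25
6 | id-Delta | id | 0 | - | 25 | 7.2933 | 7.6668 | 7.4461 | 0.0414 | 5,704 | 25
7 | gelu×gelu-Gamma.1.2 | gelu×gelu | 2 | gelu×gelu-Gamma.1 | 25 | 6.9476 | - | 7.1776 | 0.0449 | 4,891 | 25
8 | relu-Beta.1.2 | relu | 2 | relu-Beta.1 | 25 | 6.9479 | - | 7.1875 | 0.0454 | 5,156 | 25
9 | 0.68·relu+0.32·id-Beta.1.2 | 0.68·relu+0.32·id | 2 | relu-Beta.1 | 25 | 6.9490 | - | 7.1985 | 0.0443 | 3,788 | 25
10 | 0.85·silu+0.15·id-Alpha.1.2.3 | 0.85·silu+0.15·id | 3 | 0.85·silu+0.15·id-Alpha.1.2 | 25 | 6.9577 | - | 7.2369 | 0.0456 | 2,033 | 25
11 | silu-Alpha.1.2 | silu | 2 | silu-Alpha.1 | 25 | 6.9729 | - | 7.1752 | 0.0453 | 2,273 | 25
12 | 0.85·silu+0.15·id-Alpha.1.2 | 0.85·silu+0.15·id | 2 | silu-Alpha.1 | 23 | 6.9859 | - | 7.2633 | 0.0441 | 2,052 | 23
13 | relu-Beta.1.2.3 | relu | 3 | relu-Beta.1.2 | 23 | 6.9878 | - | 7.2304 | 0.0459 | 5,289 | 23
14 | gelu-Gamma.1 | gelu | 1 | gelu-Gamma | 25 | 7.0034 | - | 7.2311 | 0.0435 | 5,157 | 25
15 | gelu×gelu-Gamma.1 | gelu×gelu | 1 | gelu-Gamma | 25 | 7.0275 | - | 7.2887 | 0.0437 | 4,582 | 25
16 | 0.74·silu+0.26·id-Alpha.1 | 0.74·silu+0.26·id | 1 | silu-Alpha | 25 | 7.0714 | - | 7.2838 | 0.0443 | 2,120 | 25
17 | relu-Beta.1 | relu | 1 | relu-Beta | 25 | 7.0752 | - | 7.2994 | 0.0445 | 5,445 | 25
18 | relu×gelu-Beta.1 | relu×gelu | 1 | relu-Beta | 25 | 7.0821 | - | 7.3464 | 0.0447 | 4,926 | 25
19 | relu-Eta | relu | 0 | - | 25 | 7.1458 | - | 7.3464 | 0.0421 | 5,449 | 25
20 | silu-Alpha.1 | silu | 1 | silu-Alpha | 25 | 7.1523 | - | 7.3387 | 0.0436 | 2,235 | 25
21 | gelu×gelu×silu-Gamma.1.2 | gelu×gelu×silu | 2 | gelu×gelu-Gamma.1 | 25 | 7.1545 | - | 7.4248 | 0.0405 | 2,152 | 25
22 | sq-Epsilon | sq | 0 | - | 25 | 7.2035 | - | 7.3750 | 0.0419 | 5,328 | 25
23 | silu-Zeta | silu | 0 | - | 25 | 7.2644 | - | 7.4293 | 0.0416 | 2,257 | 25
24 | gelu-Gamma | gelu | 0 | - | 25 | 7.2793 | - | 7.4259 | 0.0404 | 5,608 | 25
25 | relu-Beta | relu | 0 | - | 25 | 7.3206 | - | 7.4268 | 0.0401 | 5,949 | 25
26 | silu-Alpha | silu | 0 | - | 25 | 7.4119 | - | 7.5217 | 0.0383 | 2,295 | 13
27 | gelu×gelu+0.08·sq-Gamma.1.2.3 | gelu×gelu+0.08·sq | 3 | gelu×gelu-Gamma.1.2 | 4 | 7.5762 | - | 7.6128 | - | 3,970 | 4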
Activation Distribution
relu: 123 (19%)
silu: 100 (15%)
gelu: 100 (15%)
id: 100 (15%)
gelu×gelu: 50 (8%)
0.85·silu+0.15·id: 48 (7%)
sq: 25 (4%)
0.74·silu+0.26·id: 25 (4%)
relu×gelu: 25 (4%)
0.68·relu+0.32·id: 25 (4%)
gelu×gelu×silu: 25 (4%)
gelu×gelu+0.08·sq: 4 (1%)
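The distribution above appears to be the number of training steps spent under each activation. A minimal sketch that re-derives it from the (activation, steps) pairs in the Search Candidates table follows; the pairs are transcribed from that table, and the interpretation of the counts as summed steps is an assumption that the totals happen to support.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (activation, steps) for the 27 candidates, in Search Candidates table order.
CANDIDATE_STEPS: List[Tuple[str, int]] = [
    ("id", 25), ("id", 25), ("id", 25), ("gelu", 25), ("gelu", 25), ("id", 25),
    ("gelu×gelu", 25), ("relu", 25), ("0.68·relu+0.32·id", 25),
    ("0.85·silu+0.15·id", 25), ("silu", 25), ("0.85·silu+0.15·id", 23),
    ("relu", 23), ("gelu", 25), ("gelu×gelu", 25), ("0.74·silu+0.26·id", 25),
    ("relu", 25), ("relu×gelu", 25), ("relu", 25), ("silu", 25),
    ("gelu×gelu×silu", 25), ("sq", 25), ("silu", 25), ("gelu", 25),
    ("relu", 25), ("silu", 25), ("gelu×gelu+0.08·sq", 4),
]


def activation_distribution(rows: List[Tuple[str, int]]) -> Dict[str, Tuple[int, int]]:
    """Sum steps per activation and attach a rounded percentage of all steps."""
    totals: Dict[str, int] = defaultdict(int)
    for activation, steps in rows:
        totals[activation] += steps
    grand_total = sum(totals.values())
    ordered = sorted(totals.items(), key=lambda item: -item[1])
    return {name: (count, round(100 * count / grand_total)) for name, count in ordered}


distribution = activation_distribution(CANDIDATE_STEPS)
for name, (count, percent) in distribution.items():
    print(f"{name}: {count} ({percent}%)")
```

The summed steps come to 650, matching the Total Steps tile, and the per-activation counts match the list above.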
Oscillation & Heat Capacity
Activation Evolution Radial
Symbio Config
{
  "cusumSensitivity": 4,
  "cusumBaselineWindow": 5,
  "metricsInterval": 10,
  "trackWeightEntropy": true,
  "trackEffectiveRank": true,
  "trackFreeEnergy": true,
  "trackMIProfiles": false,
  "trackPopulationMetrics": true,
  "freeEnergyBeta": 0.01,
  "miNumBins": 30,
  "adaptiveBatch": true,
  "batchMin": 8,
  "batchMax": 64,
  "batchStep": 4,
  "calmStepsBeforeRestore": 200,
  "fitnessAlpha": 1,
  "complexityMode": "entropy",
  "diversityBonus": 0.1,
  "diversityDecay": "cosine",
  "searchMode": "composed-activation-search",
  "activationPool": [
    "gelu",
    "relu",
    "silu",
    "swiglu",
    "universal",
    "kan_spline"
  ],
  "searchStrategy": "evolutionary",
  "populationSize": 8,
  "generations": 250,
  "selectionStrategy": "topk",
  "tournamentK": 3,
  "mutationRate": 0.7,
  "stepsPerCandidate": 25,
  "rankBy": "valLoss",
  "perfWeight": 0,
  "stabilityWeight": 0,
  "writeReport": true,
  "writeCandidates": true,
  "writeSummary": true,
  "basisPool": [
    "silu",
    "relu",
    "gelu",
    "identity",
    "square"
  ],
  "maxGraphDepth": 4,
  "maxGraphNodes": 10
}
Checkpoints (0)
No checkpoints saved
Sample Generations (3)
Sample 1 · checkpoint: - · generated 4h ago
Prompt
The
Output
The something thought contwasforERations intelligcould decreferimple people en the s to le n'trequbeforGPheiter couldyouof sembctustill ely sput . They 't with phwas alsoaxbookpresentchestembtheme t as a crebackgerlimwesu
Sample 2 · checkpoint: - · generated 4h ago
Prompt
Once upon a time
Output
Once upon a timebre there promptdescriTraves ativfirste. The that was ese they were pter eathot on ction- storFetchastwas slolininstESode ================================otword whiTHE , andINsomeone is s." afterorgphrolternpiies Febru andt ite datoll
Sample 3 · checkpoint: - · generated 4h ago
Prompt
He walked into
Output
He walked into callpurOptionrun turnhaddro. It anc," ing that lessum thanust ed. anddata place ponentexact. Nboth . It ind Cdatect for a atmod. Bator if was the him - illits? artplac whatations ownwrite ed on The ridually
{
  "vocabSize": 2000,
  "blockSize": 256,
  "nLayer": 6,
  "nEmbd": 288,
  "nHead": 6,
  "dropout": 0,
  "ffnActivation": "swiglu",
  "ffnDim": 768
}
{
  "iters": 50000,
  "batchSize": 20,
  "lr": 0.0003,
  "lrMin": 0,
  "warmupIters": 500,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 1e-8,
  "weightDecay": 0.1,
  "gradClip": 5,
  "evalInterval": 100,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 100,
  "spikeThreshold": 10,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": true,
  "symbioConfig": {
    "cusumSensitivity": 4,
    "cusumBaselineWindow": 5,
    "metricsInterval": 10,
    "trackWeightEntropy": true,
    "trackEffectiveRank": true,
    "trackFreeEnergy": true,
    "trackMIProfiles": false,
    "trackPopulationMetrics": true,
    "freeEnergyBeta": 0.01,
    "miNumBins": 30,
    "adaptiveBatch": true,
    "batchMin": 8,
    "batchMax": 64,
    "batchStep": 4,
    "calmStepsBeforeRestore": 200,
    "fitnessAlpha": 1,
    "complexityMode": "entropy",
    "diversityBonus": 0.1,
    "diversityDecay": "cosine",
    "searchMode": "composed-activation-search",
    "activationPool": [
      "gelu",
      "relu",
      "silu",
      "swiglu",
      "universal",
      "kan_spline"
    ],
    "searchStrategy": "evolutionary",
    "populationSize": 8,
    "generations": 250,
    "selectionStrategy": "topk",
    "tournamentK": 3,
    "mutationRate": 0.7,
    "stepsPerCandidate": 25,
    "rankBy": "valLoss",
    "perfWeight": 0,
    "stabilityWeight": 0,
    "writeReport": true,
    "writeCandidates": true,
    "writeSummary": true,
    "basisPool": [
      "silu",
      "relu",
      "gelu",
      "identity",
      "square"
    ],
    "maxGraphDepth": 4,
    "maxGraphNodes": 10
  }
}
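The config gives `lr`, `lrMin`, `warmupIters`, and `iters` but not the shape of the schedule. The sketch below assumes a common choice, linear warmup followed by cosine decay to `lrMin`; that choice and the function name are assumptions, not something the page confirms.

```python
import math


def lr_at(step: int, lr: float = 3e-4, lr_min: float = 0.0,
          warmup_iters: int = 500, iters: int = 50_000) -> float:
    """Learning rate at a given step under an assumed warmup + cosine schedule."""
    if step < warmup_iters:
        return lr * step / warmup_iters              # linear warmup from 0
    progress = (step - warmup_iters) / (iters - warmup_iters)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return lr_min + (lr - lr_min) * cosine           # cosine decay to lr_min


for step in (250, 654, 25_000, 50_000):
    print(step, f"{lr_at(step):.2e}")
```

At step 654 this sketch gives a value that rounds to 3.00e-4, matching the Learning Rate tile; that is expected for any schedule that has finished a 500-step warmup and has barely begun to decay, so it does not single out cosine decay.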