Alpha
novels_all_20260225184007_tt6w · active · novels · 7.21M params · 27m 46s elapsed · ~32h 14m remaining
6L / 288D / 6H · helios · bpe · adamw · Created Feb 25, 2026 6:40 PM
Step 1,080 / 50,000 (2.2%)
Loss: 4.6011
Best Loss: 4.3488 (-40.0% from start)
Val Loss: 5.1103 (best: 4.9668)
Learning Rate: 3.00e-4
Throughput: 2,555 tok/s (avg)
Speed: 2,373 ms/iter (avg)
Grad Norm: 0.873 (avg: 0.824)
Tokens processed: 5.53M
Forward: 185 ms (8% of step)
Backward: 2158 ms (91% of step)
GPU Sync: 10 ms (0% of step)
GPU Ops: 629 per step
MFU (model FLOPS util): 0.3%
Bwd/Fwd ratio: 11.7x
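The derived figures above can be cross-checked from the raw numbers on this page. The sketch below is only an arithmetic check, assuming that tokens processed equals steps × batch size × context length and that the remaining-time estimate is remaining steps × average ms/iter; the function names are illustrative and not part of the dashboard.

```python
# Arithmetic cross-check of the dashboard's derived metrics.
# All inputs are copied from this page; the formulas are assumptions.

def tokens_processed(steps: int, batch_size: int, context: int) -> int:
    """Tokens seen so far, assuming every step consumes one full batch."""
    return steps * batch_size * context

def step_shares(forward_ms: float, backward_ms: float, sync_ms: float,
                step_ms: float) -> dict:
    """Each phase as a rounded percentage of the average step time."""
    return {
        "forward": round(100 * forward_ms / step_ms),
        "backward": round(100 * backward_ms / step_ms),
        "sync": round(100 * sync_ms / step_ms),
    }

def remaining_hm(total_steps: int, steps_done: int, ms_per_iter: float) -> tuple:
    """Remaining time as (hours, minutes), assuming a constant step time."""
    remaining_s = (total_steps - steps_done) * ms_per_iter / 1000
    return divmod(int(remaining_s // 60), 60)

tokens = tokens_processed(1080, 20, 256)       # 5,529,600, shown as 5.53M
shares = step_shares(185, 2158, 10, 2373)      # 8 / 91 / 0 percent
bwd_fwd = round(2158 / 185, 1)                 # 11.7
eta = remaining_hm(50_000, 1080, 2373)         # (32, 14)
print(tokens, shares, bwd_fwd, eta)
```

A backward pass that costs more than eleven times the forward pass is far above the usual 2x to 3x, which is one reason to look at the backward path on this backend before reading much into the 0.3% MFU.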
Loss Curve
Symbio semantics: this chart stitches many fresh candidate evaluations onto one global step axis. Loss resets near switches are expected because candidates are re-initialized. Compare local candidate shapes and the global frontier, not a single continuous model trajectory.
Search semantics
Validation / selection
Run diagnostics
Search Trajectory + Frontier
Candidate-local train/val loss segments on a shared step axis, with switch events and global frontier overlays.
Search-aware view
Architecture
Layers: 6
Embedding: 288
Heads: 6
Vocab: 2,000
Context: 256
Dropout: 0
Parameters: 7.21M
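As a sanity check on the 7.21M figure, here is a rough parameter count. It is a sketch under several assumptions that the page does not confirm: learned positional embeddings, bias-free attention and FFN projections, a three-matrix SwiGLU feed-forward block with the `ffnDim` of 768 from the model config further down, two LayerNorms per layer plus a final one, and an untied output head. The function name is hypothetical.

```python
# Rough parameter estimate for a GPT-style decoder.
# The layer layout here is an assumption, not taken from the helios source.

def estimate_params(vocab: int, ctx: int, n_layer: int,
                    d_model: int, ffn_dim: int) -> int:
    tok_emb = vocab * d_model            # token embedding table
    pos_emb = ctx * d_model              # learned positions (assumed)
    attn = 4 * d_model * d_model         # Q, K, V, and output projections
    ffn = 3 * d_model * ffn_dim          # SwiGLU: gate, up, down (assumed)
    norms = 2 * (2 * d_model)            # two LayerNorms, weight + bias each
    final_norm = 2 * d_model
    lm_head = vocab * d_model            # untied output head (assumed)
    return tok_emb + pos_emb + n_layer * (attn + ffn + norms) + final_norm + lm_head

total = estimate_params(vocab=2000, ctx=256, n_layer=6, d_model=288, ffn_dim=768)
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")
```

Under these assumptions the count comes to 7,205,184, which rounds to the displayed 7.21M. Other layouts, such as attention biases with a slightly different norm arrangement, land within a few thousand parameters of the same value, so the match does not pin down the exact architecture.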
Training Config
Total iters: 50,000
Batch size: 20
Max LR: 0.0003
Optimizer: adamw
Backend: helios
Tokenizer: bpe
Seed: 42
Weight decay: 0.1
Grad clip: 5
Eval interval: 100
GPU & VRAM
Learning Rate
Grad Norm
Step Time Breakdown
Clip Telemetry
Symbiogenesis · SWIGLU
Wt Entropy: 1.99 bits
Eff. Rank: 20.0
Free Energy: 4.6210
Pop Entropy: 3.909 nats
Complexity: 0.0886
Fitness: 0.0900
CUSUM Alerts: 1074 of 1080 steps
Batch Size: - (adaptive)
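The Free Energy value shown here is numerically consistent with the current loss plus `freeEnergyBeta` times the weight entropy: 4.6011 + 0.01 × 1.99 = 4.6210. That relationship is inferred from the numbers on this page, not from any documented definition, so treat the sketch below as a hypothesis about the metric rather than its implementation.

```python
# Hypothesized free-energy metric: loss plus a small entropy penalty.
# The formula is an inference from the displayed values, not a confirmed definition.

def free_energy(loss: float, weight_entropy_bits: float, beta: float = 0.01) -> float:
    """Return loss + beta * weight entropy (beta mirrors freeEnergyBeta)."""
    return loss + beta * weight_entropy_bits

fe = free_energy(loss=4.6011, weight_entropy_bits=1.99)
print(f"{fe:.4f}")
```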
CUSUM Change-Point Monitor
Weight Entropy
Effective Rank
Free Energy
Fitness Score
Population Entropy
Phase Change / Gelation
Current: Regime Shift
Stability: 0%
Phase Changes: 22
Regime Shifts: 13
Major training disruption. Multiple monitors triggered simultaneously: some combination of gradient behavior, clipping, and throughput changed. This can mean the model has moved into a new region of the loss landscape, or that training is becoming unstable. Check learning rate and gradient norms.
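For readers unfamiliar with CUSUM monitors, the following is a minimal two-sided CUSUM detector on a single metric series. It is a textbook-style sketch, not the dashboard's detector: the parameter names echo `cusumSensitivity` and `cusumBaselineWindow` from the Symbio config, but how those settings are actually used is an assumption, as is the `drift` allowance.

```python
# Minimal two-sided CUSUM change-point detector (illustrative only).
from statistics import mean, pstdev

def cusum_alerts(values, baseline_window=5, sensitivity=4.0, drift=0.5):
    """Return indices where the cumulative standardized deviation from the
    baseline exceeds `sensitivity` in either direction."""
    baseline = values[:baseline_window]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1e-8      # guard against a flat baseline
    pos = neg = 0.0
    alerts = []
    for i in range(baseline_window, len(values)):
        z = (values[i] - mu) / sigma
        pos = max(0.0, pos + z - drift)   # accumulates upward shifts
        neg = max(0.0, neg - z - drift)   # accumulates downward shifts
        if pos > sensitivity or neg > sensitivity:
            alerts.append(i)
            pos = neg = 0.0               # restart accumulation after an alert
    return alerts

series = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.05, 3.0, 3.0]
alerts = cusum_alerts(series)
print(alerts)
```

In this toy series the jump to 3.0 is flagged at both indices where it appears, while the small wobble before it is absorbed by the drift allowance.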
Phase Timeline
Step 1 to Step 1051
Loss Oscillation (Harmonic Analysis)
Evolutionary Search
Generations: 8
Candidates: 44
Activations: 19
Best Loss: 4.3488
Total Steps: 1,080
# | Candidate | Activation | Gen | Loss | Fitness | Steps | Mutation
1 | sq+0.33·silu+0.06·silu-Alpha.1.2.3.4.5.6.7 | sq+0.33·silu+0.06·silu | 7 | 4.3488 | 0.0960 | 25 | add_term
2 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6.7 | (silu+0.15·id)×silu | 7 | 4.4204 | 0.0961 | 25 | clone
3 | sq+0.33·silu-Alpha.1.2.3.4.5.6.7 | sq+0.33·silu | 7 | 4.4519 | 0.0951 | 25 | clone
4 | sq+0.31·silu-Alpha.1.2.3.4.5.6 | sq+0.31·silu | 6 | 4.4653 | 0.0912 | 25 | perturb_scale
5 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 0.78·relu+0.37·id | 6 | 4.4662 | 0.0953 | 25 | prune
6 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | (silu+0.15·id)×silu | 6 | 4.4736 | 0.0941 | 25 | prune
7 | 0.78·relu+0.37·id+0.1·gelu-Beta.1.2.3.4.5.6.7 | 0.78·relu+0.37·id+0.1·gelu | 7 | 4.5281 | 0.0900 | 5 | add_term
8 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | (silu+0.15·id)×silu | 6 | 4.5351 | 0.0896 | 25 | clone
9 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | sq+0.33·silu | 6 | 4.5436 | 0.0898 | 25 | clone
10 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6.7 | 0.78·relu+0.37·id | 7 | 4.5455 | 0.0922 | 25 | clone
11 | sq+0.33·silu-Alpha.1.2.3.4.5 | sq+0.33·silu | 5 | 4.6168 | 0.0901 | 25 | clone
12 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 0.78·relu+0.37·id | 6 | 4.6174 | 0.0887 | 25 | clone
13 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | sq+0.33·silu | 6 | 4.6359 | 0.0897 | 25 | clone
14 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 0.78·relu+0.37·id | 5 | 4.7129 | 0.0866 | 25 | prune
15 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5 | (silu+0.15·id)×silu | 5 | 4.7263 | 0.0880 | 25 | inject_gate
16 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 0.78·relu+0.37·id | 5 | 4.7783 | 0.0854 | 25 | clone
17 | sq+0.33·silu-Alpha.1.2.3.4.5 | sq+0.33·silu | 5 | 4.8448 | 0.0855 | 25 | clone
18 | 0.78·relu+0.22·id+0.2·silu-Beta.1.2.3.4 | 0.78·relu+0.22·id+0.2·silu | 4 | 4.8632 | 0.0852 | 25 | add_term
19 | (silu+0.15·silu)×relu-Alpha.1.2.3.4 | (silu+0.15·silu)×relu | 4 | 4.8660 | 0.0834 | 25 | inject_gate
20 | silu+0.33·silu-Alpha.1.2.3.4 | silu+0.33·silu | 4 | 4.9071 | 0.0838 | 25 | prune
Showing top 20 of 44 candidates
Generation Summary
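G0: 8 candidates, best loss 6.2919
G1: 4 candidates, best loss 5.6304
G2: 4 candidates, best loss 5.2347
G3: 4 candidates, best loss 4.9947
G4: 7 candidates, best loss 4.8632
G5: 5 candidates, best loss 4.6168
G6: 7 candidates, best loss 4.4653
G7: 5 candidates, best loss 4.3488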
Fitness Progression
Architecture Diversity
Convergence vs Diversity (Tug-of-War)
Current Mode: Exploration Dominant
Diversity Pressure: 92%
Convergence Momentum: 22%
Convergence Progress: 100%
Phase Portrait: Diversity Pressure vs Convergence Momentum
Low diversity / high momentum = lock-in convergence
High diversity / high momentum = productive exploration
Low diversity / low momentum = stalled collapse
High diversity / low momentum = diversity stalling convergence
Tug-of-War Trace (Time Domain)
Positive tension means recent frontier improvement is outpacing diversity pressure (search is converging). Negative tension means exploration pressure is dominating recent convergence momentum (search is broadening or getting “stumped”).
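The quadrant labels and the sign convention for tension can be made concrete with a small sketch. Two things here are assumptions rather than dashboard behavior: the 0.5 cutoff between "low" and "high", and tension computed as convergence momentum minus diversity pressure. Only the four labels and the meaning of the sign come from the text above.

```python
# Illustrative mapping from (diversity pressure, convergence momentum) to the
# four phase-portrait labels. Threshold and tension formula are assumptions.

def classify(diversity: float, momentum: float, threshold: float = 0.5) -> str:
    high_div = diversity >= threshold
    high_mom = momentum >= threshold
    if high_div and high_mom:
        return "productive exploration"
    if high_div:
        return "diversity stalling convergence"
    if high_mom:
        return "lock-in convergence"
    return "stalled collapse"

def tension(diversity: float, momentum: float) -> float:
    """Positive when convergence momentum outpaces diversity pressure."""
    return momentum - diversity

# Current readings from this page: 92% diversity pressure, 22% momentum.
label = classify(0.92, 0.22)
t = tension(0.92, 0.22)
print(label, round(t, 2))
```

With the current readings this gives the high-diversity, low-momentum quadrant and a negative tension, which agrees with the "Exploration Dominant" mode shown on the page.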
Strongest Convergence: step 41, tension 0.403
Strongest Diversity Push: step 506, tension -0.920
Best Frontier: 4.3488 (progress 100%)
Evolutionary Lineage Tree
Activation Flow (Sankey)
Activation Switch Log
Step | From | To | Gen | Prev Steps | Best Loss | Final Loss | Fitness
1 | - | silu | 0 | - | - | - | -
26 | silu | relu | 0 | 25 | 7.3787 | 7.3787 | 0.0384
51 | relu | gelu | 0 | 25 | 7.2245 | 7.2245 | 0.0414
76 | gelu | id | 0 | 25 | 7.0201 | 7.0284 | 0.0440
101 | id | sq | 0 | 25 | 6.8981 | 6.9248 | 0.0461
126 | sq | silu | 0 | 25 | 6.8279 | 6.8306 | 0.0467
151 | silu | relu | 0 | 25 | 6.6293 | 6.7193 | 0.0494
176 | relu | gelu | 0 | 25 | 6.4735 | 6.5512 | 0.0514
201 | gelu | silu+0.15·sq | 1 | 25 | 6.2919 | 6.2938 | 0.0568
226 | silu+0.15·sq | relu | 1 | 25 | 6.0725 | 6.1432 | 0.0601
251 | relu | silu | 1 | 25 | 5.9360 | 5.9360 | 0.0636
276 | silu | relu | 1 | 25 | 5.8074 | 5.8074 | 0.0646
301 | relu | silu+0.15·silu | 2 | 25 | 5.6304 | 5.6759 | 0.0686
326 | silu+0.15·silu | relu | 2 | 25 | 5.5322 | 5.5609 | 0.0696
351 | relu | silu+0.15·sq | 2 | 25 | 5.3751 | 5.4431 | 0.0734
376 | silu+0.15·sq | 0.9·relu+0.1·id | 2 | 25 | 5.2700 | 5.3156 | 0.0728
401 | 0.9·relu+0.1·id | silu+0.33·silu | 3 | 25 | 5.2347 | 5.2630 | 0.0771
426 | silu+0.33·silu | 0.78·relu+0.22·id | 3 | 25 | 5.1780 | 5.1780 | 0.0786
451 | 0.78·relu+0.22·id | silu+0.15·silu | 3 | 25 | 5.0842 | 5.0842 | 0.0810
476 | silu+0.15·silu | relu | 3 | 25 | 4.9947 | 5.0719 | 0.0790
501 | relu | sq+0.33·silu | 4 | 25 | 5.6642 | 9.9960 | 0.0080
526 | sq+0.33·silu | 0.78·relu+0.37·id | 4 | 25 | 5.6001 | 6.0304 | 0.0591
551 | 0.78·relu+0.37·id | silu+0.15·id | 4 | 25 | 5.1379 | 5.9827 | 0.0618
576 | silu+0.15·id | 0.9·relu+0.1·id | 4 | 25 | 4.9926 | 5.1049 | 0.0811
601 | 0.9·relu+0.1·id | silu+0.33·silu | 4 | 25 | 4.9079 | 4.9079 | 0.0853
626 | silu+0.33·silu | 0.78·relu+0.22·id+0.2·silu | 4 | 25 | 4.9071 | 4.9326 | 0.0838
651 | 0.78·relu+0.22·id+0.2·silu | (silu+0.15·silu)×relu | 4 | 25 | 4.8632 | 5.0372 | 0.0852
676 | (silu+0.15·silu)×relu | sq+0.33·silu | 5 | 25 | 4.8660 | 4.9326 | 0.0834
701 | sq+0.33·silu | 0.78·relu+0.37·id | 5 | 25 | 4.8448 | 4.8448 | 0.0855
726 | 0.78·relu+0.37·id | (silu+0.15·id)×silu | 5 | 25 | 4.7783 | 4.9337 | 0.0854
751 | (silu+0.15·id)×silu | sq+0.33·silu | 5 | 25 | 4.7263 | 4.7711 | 0.0880
776 | sq+0.33·silu | 0.78·relu+0.37·id | 5 | 25 | 4.6168 | 4.7503 | 0.0901
801 | 0.78·relu+0.37·id | sq+0.33·silu | 6 | 25 | 4.7129 | 4.8043 | 0.0866
826 | sq+0.33·silu | 0.78·relu+0.37·id | 6 | 25 | 4.6359 | 4.7508 | 0.0897
851 | 0.78·relu+0.37·id | (silu+0.15·id)×silu | 6 | 25 | 4.6174 | 4.6899 | 0.0887
876 | (silu+0.15·id)×silu | sq+0.33·silu | 6 | 25 | 4.5351 | 4.7097 | 0.0896
901 | sq+0.33·silu | sq+0.31·silu | 6 | 25 | 4.5436 | 4.6415 | 0.0898
926 | sq+0.31·silu | 0.78·relu+0.37·id | 6 | 25 | 4.4653 | 4.5667 | 0.0912
951 | 0.78·relu+0.37·id | (silu+0.15·id)×silu | 6 | 25 | 4.4662 | 4.4662 | 0.0953
976 | (silu+0.15·id)×silu | sq+0.33·silu | 7 | 25 | 4.4736 | 4.5439 | 0.0941
1001 | sq+0.33·silu | 0.78·relu+0.37·id | 7 | 25 | 4.4519 | 4.4589 | 0.0951
1026 | 0.78·relu+0.37·id | (silu+0.15·id)×silu | 7 | 25 | 4.5455 | 4.6475 | 0.0922
1051 | (silu+0.15·id)×silu | sq+0.33·silu+0.06·silu | 7 | 25 | 4.4204 | 4.4204 | 0.0961
1076 | sq+0.33·silu+0.06·silu | 0.78·relu+0.37·id+0.1·gelu | 7 | 25 | 4.3488 | 4.3488 | 0.0960
Search Candidates
# | Name | Activation | Gen | Parent | Steps | Best Loss | Best Val | Avg Loss | Fitness | Avg tok/s | Alerts
1 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | sq+0.33·silu | 6 | sq+0.33·silu-Alpha.1.2.3.4.5 | 25 | 4.5436 | 4.9668 | 4.6381 | 0.0898 | 2,795 | 25
2 | sq+0.33·silu-Alpha.1.2.3.4.5.6.7 | sq+0.33·silu | 7 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | 25 | 4.4519 | 5.1103 | 4.5428 | 0.0951 | 2,761 | 25
3 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 0.78·relu+0.37·id | 5 | 0.78·relu+0.37·id-Beta.1.2.3.4 | 25 | 4.7129 | 5.3410 | 4.8252 | 0.0866 | 5,903 | 25
4 | 0.9·relu+0.1·id-Beta.1.2.3.4 | 0.9·relu+0.1·id | 4 | relu-Beta.1.2.3 | 25 | 4.9079 | 5.3437 | 5.0626 | 0.0853 | 5,690 | 25
5 | sq+0.33·silu-Alpha.1.2.3.4.5 | sq+0.33·silu | 5 | sq+0.33·silu-Alpha.1.2.3.4 | 25 | 4.8448 | 5.3744 | 5.0300 | 0.0855 | 2,811 | 25
6 | 0.9·relu+0.1·id-Beta.1.2 | 0.9·relu+0.1·id | 2 | relu-Beta.1 | 25 | 5.2347 | 5.5969 | 5.3328 | 0.0771 | 5,906 | 25
7 | relu-Beta.1 | relu | 1 | relu-Beta | 25 | 5.6304 | 5.9544 | 5.8324 | 0.0686 | 7,756 | 25
8 | relu-Beta.1.2.3 | relu | 3 | relu-Beta.1.2 | 25 | 5.6642 | 6.0176 | 9.7860 | 0.0080 | 7,600 | 25
9 | gelu-Theta | gelu | 0 | - | 25 | 6.2919 | 6.3652 | 6.4358 | 0.0568 | 7,852 | 25
10 | id-Delta | id | 0 | - | 25 | 6.8981 | 7.0060 | 6.9832 | 0.0461 | 8,304 | 25
11 | sq+0.33·silu+0.06·silu-Alpha.1.2.3.4.5.6.7 | sq+0.33·silu+0.06·silu | 7 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | 25 | 4.3488 | - | 4.4929 | 0.0960 | 1,816 | 25
12 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6.7 | (silu+0.15·id)×silu | 7 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | 25 | 4.4204 | - | 4.5086 | 0.0961 | 1,881 | 25
13 | sq+0.31·silu-Alpha.1.2.3.4.5.6 | sq+0.31·silu | 6 | sq+0.33·silu-Alpha.1.2.3.4.5 | 25 | 4.4653 | - | 4.5950 | 0.0912 | 2,865 | 25
14 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 0.78·relu+0.37·id | 6 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 25 | 4.4662 | - | 4.6432 | 0.0953 | 5,777 | 25
15 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | (silu+0.15·id)×silu | 6 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5 | 25 | 4.4736 | - | 4.5575 | 0.0941 | 1,838 | 25
16 | 0.78·relu+0.37·id+0.1·gelu-Beta.1.2.3.4.5.6.7 | 0.78·relu+0.37·id+0.1·gelu | 7 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 5 | 4.5281 | - | 4.6215 | 0.0900 | 4,731 | 5
17 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | (silu+0.15·id)×silu | 6 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5 | 25 | 4.5351 | - | 4.6728 | 0.0896 | 1,908 | 25
18 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6.7 | 0.78·relu+0.37·id | 7 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 25 | 4.5455 | - | 4.6298 | 0.0922 | 5,709 | 25
19 | sq+0.33·silu-Alpha.1.2.3.4.5 | sq+0.33·silu | 5 | sq+0.33·silu-Alpha.1.2.3.4 | 25 | 4.6168 | - | 4.7601 | 0.0901 | 2,797 | 25
20 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 0.78·relu+0.37·id | 6 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 25 | 4.6174 | - | 4.7594 | 0.0887 | 5,922 | 25
21 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | sq+0.33·silu | 6 | sq+0.33·silu-Alpha.1.2.3.4.5 | 25 | 4.6359 | - | 4.7460 | 0.0897 | 2,782 | 25
22 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5 | (silu+0.15·id)×silu | 5 | silu+0.15·id-Alpha.1.2.3.4 | 25 | 4.7263 | - | 4.8635 | 0.0880 | 1,863 | 25
23 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 0.78·relu+0.37·id | 5 | 0.78·relu+0.37·id-Beta.1.2.3.4 | 25 | 4.7783 | - | 4.9267 | 0.0854 | 5,885 | 25
24 | 0.78·relu+0.22·id+0.2·silu-Beta.1.2.3.4 | 0.78·relu+0.22·id+0.2·silu | 4 | 0.78·relu+0.22·id-Beta.1.2.3 | 25 | 4.8632 | - | 4.9442 | 0.0852 | 2,517 | 25
25 | (silu+0.15·silu)×relu-Alpha.1.2.3.4 | (silu+0.15·silu)×relu | 4 | silu+0.15·silu-Alpha.1.2.3 | 25 | 4.8660 | - | 5.0394 | 0.0834 | 1,949 | 25
26 | silu+0.33·silu-Alpha.1.2.3.4 | silu+0.33·silu | 4 | silu+0.33·silu-Alpha.1.2.3 | 25 | 4.9071 | - | 5.0086 | 0.0838 | 1,977 | 25
27 | silu+0.15·id-Alpha.1.2.3.4 | silu+0.15·id | 4 | silu+0.15·silu-Alpha.1.2.3 | 25 | 4.9926 | - | 5.1233 | 0.0811 | 2,949 | 25
28 | silu+0.15·silu-Alpha.1.2.3 | silu+0.15·silu | 3 | silu+0.15·silu-Alpha.1.2 | 25 | 4.9947 | - | 5.1223 | 0.0790 | 1,972 | 25
29 | 0.78·relu+0.22·id-Beta.1.2.3 | 0.78·relu+0.22·id | 3 | relu-Beta.1.2 | 25 | 5.0842 | - | 5.1904 | 0.0810 | 5,903 | 25
30 | 0.78·relu+0.37·id-Beta.1.2.3.4 | 0.78·relu+0.37·id | 4 | 0.78·relu+0.22·id-Beta.1.2.3 | 25 | 5.1379 | - | 5.9251 | 0.0618 | 5,630 | 25
31 | silu+0.33·silu-Alpha.1.2.3 | silu+0.33·silu | 3 | silu+0.15·silu-Alpha.1.2 | 25 | 5.1780 | - | 5.2633 | 0.0786 | 1,950 | 25
32 | silu+0.15·sq-Alpha.1.2 | silu+0.15·sq | 2 | silu+0.15·sq-Alpha.1 | 25 | 5.2700 | - | 5.4168 | 0.0728 | 2,893 | 25
33 | relu-Beta.1.2 | relu | 2 | relu-Beta.1 | 25 | 5.3751 | - | 5.5895 | 0.0734 | 7,787 | 25
34 | silu+0.15·silu-Alpha.1.2 | silu+0.15·silu | 2 | silu+0.15·sq-Alpha.1 | 25 | 5.5322 | - | 5.6638 | 0.0696 | 1,969 | 25
35 | sq+0.33·silu-Alpha.1.2.3.4 | sq+0.33·silu | 4 | silu+0.33·silu-Alpha.1.2.3 | 25 | 5.6001 | - | 5.9845 | 0.0591 | 2,811 | 25
36 | silu-Alpha.1 | silu | 1 | silu-Alpha | 25 | 5.8074 | - | 5.9222 | 0.0646 | 3,069 | 25
37 | relu-Beta.1 | relu | 1 | relu-Beta | 25 | 5.9360 | - | 6.0817 | 0.0636 | 7,655 | 25
38 | silu+0.15·sq-Alpha.1 | silu+0.15·sq | 1 | silu-Alpha | 25 | 6.0725 | - | 6.2070 | 0.0601 | 2,899 | 25
39 | relu-Eta | relu | 0 | - | 25 | 6.4735 | - | 6.6209 | 0.0514 | 7,899 | 25
40 | silu-Zeta | silu | 0 | - | 25 | 6.6293 | - | 6.7508 | 0.0494 | 3,172 | 25
41 | sq-Epsilon | sq | 0 | - | 25 | 6.8279 | - | 6.9077 | 0.0467 | 7,372 | 25
42 | gelu-Gamma | gelu | 0 | - | 25 | 7.0201 | - | 7.1170 | 0.0440 | 8,093 | 25
43 | relu-Beta | relu | 0 | - | 25 | 7.2245 | - | 7.3043 | 0.0414 | 8,124 | 25
44 | silu-Alpha | silu | 0 | - | 25 | 7.3787 | - | 7.5042 | 0.0384 | 3,233 | 19
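The row order in this table looks like a two-tier sort: candidates that have a validation loss come first, ordered by Best Val, and candidates without one follow, ordered by Best Loss. That reading matches `"rankBy": "valLoss"` in the config, but the fallback rule is inferred from the rows, so the sketch below is an assumption about the ordering rather than the dashboard's code. The `Candidate` type is hypothetical.

```python
# Sketch of the apparent ordering: validated candidates first (by val loss),
# then unvalidated ones (by best train loss). Inferred from the table rows.
from typing import NamedTuple, Optional

class Candidate(NamedTuple):
    name: str
    best_loss: float
    best_val: Optional[float]   # None when no eval ran for this candidate

def rank(candidates):
    def key(c):
        has_no_val = c.best_val is None
        score = c.best_loss if has_no_val else c.best_val
        return (has_no_val, score)   # False sorts before True
    return sorted(candidates, key=key)

rows = [
    Candidate("sq+0.33·silu+0.06·silu-Alpha.1.2.3.4.5.6.7", 4.3488, None),
    Candidate("sq+0.33·silu-Alpha.1.2.3.4.5.6.7", 4.4519, 5.1103),
    Candidate("sq+0.33·silu-Alpha.1.2.3.4.5.6", 4.5436, 4.9668),
    Candidate("(silu+0.15·id)×silu-Alpha.1.2.3.4.5.6.7", 4.4204, None),
]
ranked = [c.name for c in rank(rows)]
print(ranked)
```

One consequence of this ordering is that the candidate with the best training loss (4.3488) sits at row 11, because it never received a validation pass within its 25 steps at an eval interval of 100.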
Activation Distribution
relu: 150 (14%)
sq+0.33·silu: 150 (14%)
0.78·relu+0.37·id: 150 (14%)
(silu+0.15·id)×silu: 100 (9%)
silu: 75 (7%)
gelu: 50 (5%)
silu+0.15·sq: 50 (5%)
silu+0.15·silu: 50 (5%)
0.9·relu+0.1·id: 50 (5%)
silu+0.33·silu: 50 (5%)
id: 25 (2%)
sq: 25 (2%)
0.78·relu+0.22·id: 25 (2%)
silu+0.15·id: 25 (2%)
0.78·relu+0.22·id+0.2·silu: 25 (2%)
(silu+0.15·silu)×relu: 25 (2%)
sq+0.31·silu: 25 (2%)
sq+0.33·silu+0.06·silu: 25 (2%)
0.78·relu+0.37·id+0.1·gelu: 5 (0%)
Oscillation & Heat Capacity
Activation Evolution Radial
Symbio Config
{
  "cusumSensitivity": 4,
  "cusumBaselineWindow": 5,
  "metricsInterval": 10,
  "trackWeightEntropy": true,
  "trackEffectiveRank": true,
  "trackFreeEnergy": true,
  "trackMIProfiles": false,
  "trackPopulationMetrics": true,
  "freeEnergyBeta": 0.01,
  "miNumBins": 30,
  "adaptiveBatch": false,
  "batchMin": 8,
  "batchMax": 64,
  "batchStep": 4,
  "calmStepsBeforeRestore": 200,
  "populationAdaptation": true,
  "populationScaleMin": 0.5,
  "populationScaleMax": 2,
  "populationScaleStep": 0.125,
  "populationAdaptationCooldown": 10,
  "mutationRateMin": 0.2,
  "mutationRateMax": 0.95,
  "fitnessAlpha": 1,
  "complexityMode": "entropy",
  "diversityBonus": 0.1,
  "diversityDecay": "cosine",
  "searchMode": "composed-activation-search",
  "activationPool": [
    "gelu",
    "relu",
    "silu",
    "swiglu",
    "universal",
    "kan_spline"
  ],
  "searchStrategy": "evolutionary",
  "populationSize": 8,
  "generations": 250,
  "selectionStrategy": "topk",
  "tournamentK": 3,
  "mutationRate": 0.7,
  "stepsPerCandidate": 25,
  "rankBy": "valLoss",
  "perfWeight": 0,
  "stabilityWeight": 0,
  "preserveWeightsAcrossCandidates": true,
  "carryOptimizerStateAcrossCandidates": true,
  "constantFfnDimAcrossCandidates": true,
  "fuseWeightsEachStep": true,
  "fusionShadowEma": 0.02,
  "fusionBaseStrength": 0.0015,
  "fusionMaxStrength": 0.02,
  "kuramotoCoupling": 0.7,
  "kuramotoDt": 0.1,
  "kuramotoDamping": 0.05,
  "writeReport": true,
  "writeCandidates": true,
  "writeSummary": true,
  "basisPool": [
    "silu",
    "relu",
    "gelu",
    "identity",
    "square"
  ],
  "maxGraphDepth": 4,
  "maxGraphNodes": 10
}
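The candidate names used throughout this page, such as `sq+0.33·silu` and `(silu+0.15·id)×silu`, read as small expression graphs over the `basisPool` entries. The sketch below shows one way such expressions could be evaluated on a scalar. It assumes that `id` and `sq` abbreviate `identity` and `square`, that `+` is a weighted sum, and that `×` is an elementwise product acting as a gate; the helper names are hypothetical and none of this is taken from the search implementation.

```python
# Illustrative evaluation of composed activations built from the basis pool.
# Operator semantics are assumptions based on the candidate names.
import math

BASIS = {
    "silu": lambda x: x / (1.0 + math.exp(-x)),
    "relu": lambda x: max(0.0, x),
    "gelu": lambda x: 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0))),
    "id": lambda x: x,
    "sq": lambda x: x * x,
}

def weighted_sum(terms):
    """terms: list of (weight, basis_name); returns f(x) = sum(w * basis(x))."""
    def f(x):
        return sum(w * BASIS[name](x) for w, name in terms)
    return f

def gated(inner, gate_name):
    """Multiply an inner expression by a basis function (the '×' in names)."""
    def f(x):
        return inner(x) * BASIS[gate_name](x)
    return f

# sq+0.33·silu
sq_silu = weighted_sum([(1.0, "sq"), (0.33, "silu")])
# (silu+0.15·id)×silu
gated_silu = gated(weighted_sum([(1.0, "silu"), (0.15, "id")]), "silu")

for x in (-1.0, 0.0, 2.0):
    print(x, round(sq_silu(x), 4), round(gated_silu(x), 4))
```

Under this reading, `sq`-based candidates grow quadratically for large inputs. That may be related to the loss spike logged for the first `sq+0.33·silu` candidate (final loss 9.9960), although the page itself does not state a cause.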
Checkpoints (0)
No checkpoints saved
Sample Generations (3)
# | Checkpoint | Prompt (preview) | Generated
1 | - | The | 5h ago
Prompt: The
Output: The of itsi eed ationss ite. iner ondit ers it a the it ptmin the the monfsein the ded of ere e Fs ein, rekm
2 | - | Once upon a time | 5h ago
Prompt: Once upon a time
Output: Once upon a timee, ing the inm and ed seting the wed a eming of is oning a s d dmron e linited M ses is ding of res it CEsrl
3 | - | He walked into | 5h ago
Prompt: He walked into
Output: He walked into the ed , teseited onfof --Cinierrof es , ed ving , s rsy bal aal aMinof eanws sit and oisic edone
{
  "vocabSize": 2000,
  "blockSize": 256,
  "nLayer": 6,
  "nEmbd": 288,
  "nHead": 6,
  "dropout": 0,
  "ffnActivation": "swiglu",
  "ffnDim": 768
}
{
  "iters": 50000,
  "batchSize": 20,
  "lr": 0.0003,
  "lrMin": 0,
  "warmupIters": 500,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 1e-8,
  "weightDecay": 0.1,
  "gradClip": 5,
  "evalInterval": 100,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 100,
  "spikeThreshold": 10,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": true,
  "symbioConfig": {
    "cusumSensitivity": 4,
    "cusumBaselineWindow": 5,
    "metricsInterval": 10,
    "trackWeightEntropy": true,
    "trackEffectiveRank": true,
    "trackFreeEnergy": true,
    "trackMIProfiles": false,
    "trackPopulationMetrics": true,
    "freeEnergyBeta": 0.01,
    "miNumBins": 30,
    "adaptiveBatch": false,
    "batchMin": 8,
    "batchMax": 64,
    "batchStep": 4,
    "calmStepsBeforeRestore": 200,
    "populationAdaptation": true,
    "populationScaleMin": 0.5,
    "populationScaleMax": 2,
    "populationScaleStep": 0.125,
    "populationAdaptationCooldown": 10,
    "mutationRateMin": 0.2,
    "mutationRateMax": 0.95,
    "fitnessAlpha": 1,
    "complexityMode": "entropy",
    "diversityBonus": 0.1,
    "diversityDecay": "cosine",
    "searchMode": "composed-activation-search",
    "activationPool": [
      "gelu",
      "relu",
      "silu",
      "swiglu",
      "universal",
      "kan_spline"
    ],
    "searchStrategy": "evolutionary",
    "populationSize": 8,
    "generations": 250,
    "selectionStrategy": "topk",
    "tournamentK": 3,
    "mutationRate": 0.7,
    "stepsPerCandidate": 25,
    "rankBy": "valLoss",
    "perfWeight": 0,
    "stabilityWeight": 0,
    "preserveWeightsAcrossCandidates": true,
    "carryOptimizerStateAcrossCandidates": true,
    "constantFfnDimAcrossCandidates": true,
    "fuseWeightsEachStep": true,
    "fusionShadowEma": 0.02,
    "fusionBaseStrength": 0.0015,
    "fusionMaxStrength": 0.02,
    "kuramotoCoupling": 0.7,
    "kuramotoDt": 0.1,
    "kuramotoDamping": 0.05,
    "writeReport": true,
    "writeCandidates": true,
    "writeSummary": true,
    "basisPool": [
      "silu",
      "relu",
      "gelu",
      "identity",
      "square"
    ],
    "maxGraphDepth": 4,
    "maxGraphNodes": 10
  }
}