novels_all_20260225153615_hxga · active · novels · 15.90M params · 18m 36s elapsed · ~116h 57m remaining
8L / 384D / 8H · helios · bpe · adamw · Created Feb 25, 2026 3:36 PM
Step 230 / 50,000 (0.5%)
Loss: 7.2422
Best Loss: 7.0777 (-5.7% from start)
Val Loss: -
Learning Rate: 1.54e-4
Throughput: 4,942 tok/s (avg)
Speed: 8,460 ms/iter (avg)
Grad Norm: 0.625 (avg: 0.737)
Tokens processed: 1.49M
Forward: 3657ms (43% of step)
Backward: 4705ms (56% of step)
GPU Sync: 24ms (0% of step)
GPU Ops per step: 747
MFU (model FLOPS util): 0.3%
Bwd/Fwd ratio: 1.3x
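The derived figures in this panel can be cross-checked from the raw timings. The sketch below is a hypothetical helper, not the dashboard's own code. It assumes the ETA is the remaining step count times the average ms/iter, and that the percentages are rounded shares of the average step time; the function names are illustrative.

```python
# Hypothetical cross-check of the derived dashboard figures.
# Assumption: ETA = remaining steps * average ms/iter.

def format_eta(ms_per_iter: float, step: int, total_steps: int) -> str:
    """Remaining wall time if every remaining step costs the average ms/iter."""
    remaining_s = int((total_steps - step) * ms_per_iter / 1000)
    hours, rem = divmod(remaining_s, 3600)
    return f"{hours}h {rem // 60}m"

def share_of_step(part_ms: float, step_ms: float) -> int:
    """Rounded percentage of the average step time spent in one phase."""
    return round(100 * part_ms / step_ms)

STEP_MS = 8460  # "Speed" card, ms/iter (avg)

eta = format_eta(STEP_MS, step=230, total_steps=50_000)
progress_pct = round(100 * 230 / 50_000, 1)
shares = {
    "forward": share_of_step(3657, STEP_MS),
    "backward": share_of_step(4705, STEP_MS),
    "gpu_sync": share_of_step(24, STEP_MS),
}
bwd_fwd_ratio = round(4705 / 3657, 1)

print(eta, progress_pct, shares, bwd_fwd_ratio)
```

Under these assumptions the results agree with the displayed ETA, progress, step shares, and Bwd/Fwd ratio.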
Loss Curve
Symbio semantics: this chart stitches many fresh candidate evaluations onto one global step axis. Loss resets near switches are expected because candidates are re-initialized. Compare local candidate shapes and the global frontier, not a single continuous model trajectory.
Search Trajectory + Frontier
Candidate-local train/val loss segments on a shared step axis, with switch events and global frontier overlays.
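As a rough illustration of the global frontier, the sketch below takes the per-segment best losses listed in the Activation Switch Log, in switch order, and computes their running minimum. This is an assumed reading of "frontier" (best loss seen so far across all candidates), not the dashboard's confirmed definition.

```python
from itertools import accumulate

# (activation, best loss) for each finished candidate segment,
# in the order the segments appear in the Activation Switch Log.
segments = [
    ("gelu", 7.3362), ("silu", 7.3230), ("relu", 7.3102), ("swiglu", 7.2425),
    ("universal", 7.1915), ("kan_spline", 7.1686), ("swiglu", 7.1204),
    ("silu", 7.0872), ("silu", 7.1056), ("gelu", 7.0920), ("kan_spline", 7.0777),
]

# Assumed frontier definition: running minimum over segment-best losses.
frontier = list(accumulate((loss for _, loss in segments), min))

for (activation, loss), best in zip(segments, frontier):
    marker = "new best" if loss == best else ""
    print(f"{activation:<11} segment best {loss:.4f}  frontier {best:.4f}  {marker}")
```

Segments whose best loss sits above the frontier (here the second silu run and the second gelu run) leave the frontier flat, which is why individual candidate curves should be read separately from the frontier overlay.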
Architecture
Layers: 8
Embedding: 384
Heads: 8
Vocab: 2,000
Context: 512
Dropout: 0
Parameters: 15.90M
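The parameter count can be approximated from these dimensions together with `ffnDim: 1024` from the model config. The sketch below rests on several assumptions that the dashboard does not confirm: learned positional embeddings, an untied output head, bias-free linear layers, a three-matrix SwiGLU FFN, and two LayerNorms per block plus a final one, each with weight and bias.

```python
# Rough parameter estimate; every layout choice here is an assumption.

def estimate_params(vocab: int, context: int, n_layer: int, n_embd: int, ffn_dim: int) -> int:
    tok_emb = vocab * n_embd            # token embedding table
    pos_emb = context * n_embd          # assumed learned positional embeddings
    attn = 4 * n_embd * n_embd          # Q, K, V and output projections, no biases
    ffn = 3 * n_embd * ffn_dim          # SwiGLU: gate, up and down projections
    norms = 2 * (2 * n_embd)            # two LayerNorms per block (weight + bias)
    per_layer = attn + ffn + norms
    final_norm = 2 * n_embd
    lm_head = vocab * n_embd            # assumed untied output projection
    return tok_emb + pos_emb + n_layer * per_layer + final_norm + lm_head

total = estimate_params(vocab=2000, context=512, n_layer=8, n_embd=384, ffn_dim=1024)
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")
```

The estimate lands close to the reported 15.90M, but agreement in the total does not prove that each assumed component matches the real model.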
Training Config
Total iters: 50,000
Batch size: 16
Max LR: 0.0003
Optimizer: adamw
Backend: helios
Tokenizer: bpe
Seed: 42
Weight decay: 0.1
Grad clip: 5
Eval interval: 500
GPU & VRAM
Learning Rate
Grad Norm
Step Time Breakdown
Clip Telemetry
Symbiogenesis · SWIGLU
Wt Entropy: 1.73 bits
Eff. Rank: 20.0
Free Energy: 7.2596
Pop Entropy: 3.900 nats
Complexity: 0.0734
Fitness: 0.0479
CUSUM Alerts: 18 of 182 steps
Batch Size: 8 (adaptive)
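The displayed free energy is numerically consistent with the current loss plus `freeEnergyBeta` times the weight entropy. This relation is inferred from the numbers on this page only; it is not a documented formula, and the function below is a hypothetical sketch.

```python
# Assumed relation (inferred, not documented): F = loss + beta * weight_entropy

def free_energy(loss: float, weight_entropy_bits: float, beta: float) -> float:
    return loss + beta * weight_entropy_bits

# Loss 7.2422, Wt Entropy 1.73 bits, freeEnergyBeta 0.01 from the Symbio config.
estimate = free_energy(loss=7.2422, weight_entropy_bits=1.73, beta=0.01)
displayed = 7.2596

print(f"estimate {estimate:.4f} vs displayed {displayed:.4f}")
```

The small remaining gap is within what rounding of the displayed entropy to two decimals would explain.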
CUSUM Change-Point Monitor
Weight Entropy
Effective Rank
Free Energy
Fitness Score
Population Entropy
Adaptive Batch Size
Phase Change / Gelation
Current: Minor Fluctuation
Stability: 0%
Phase Changes: 0
Regime Shifts: 0
Small disturbances detected. These are usually harmless and may reflect a batch of unusual data or a learning rate schedule change. Watch for escalation.
Phase Timeline: Step 1 to Step 171
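For readers unfamiliar with CUSUM monitoring, the following is a textbook two-sided CUSUM detector, not the monitor used by this dashboard. It assumes that `cusumBaselineWindow` sets how many initial samples define the baseline mean and spread, and that `cusumSensitivity` is the alarm threshold in baseline standard deviations; the `slack` parameter is an extra illustrative knob.

```python
from statistics import mean, pstdev

def cusum_alerts(values, baseline_window=5, sensitivity=4.0, slack=0.5):
    """Return indices where a standard two-sided CUSUM statistic crosses the threshold."""
    baseline = values[:baseline_window]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1e-8   # guard against a perfectly flat baseline
    pos = neg = 0.0
    alerts = []
    for i in range(baseline_window, len(values)):
        z = (values[i] - mu) / sigma
        pos = max(0.0, pos + z - slack)   # accumulates upward drift
        neg = max(0.0, neg - z - slack)   # accumulates downward drift
        if pos > sensitivity or neg > sensitivity:
            alerts.append(i)
            pos = neg = 0.0               # restart accumulation after an alarm
    return alerts

calm = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.05, 0.95]
shifted = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.05, 3.0, 3.0]
print(cusum_alerts(calm), cusum_alerts(shifted))
```

With a baseline of only five samples the spread estimate is noisy, so a detector of this shape may fire often early in a run.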
Loss Oscillation (Harmonic Analysis)
Evolutionary Search
Generations: 2
Candidates: 12
Activations: 6
Best Loss: 7.0777
Total Steps: 182
| # | Candidate | Activation | Gen | Loss | Fitness | Steps | Mutation |
|---|---|---|---|---|---|---|---|
| 1 | K-Beta.1 | kan_spline | 1 | 7.0777 | 0.0511 | 2 | origin |
| 2 | S-Beta.1 | silu | 1 | 7.0872 | 0.0506 | 20 | origin |
| 3 | G-Alpha.1 | gelu | 1 | 7.0920 | 0.0506 | 10 | origin |
| 4 | S-Gamma.1 | silu | 1 | 7.1056 | 0.0509 | 20 | origin |
| 5 | Sw-Alpha.1 | swiglu | 1 | 7.1204 | 0.0470 | 20 | origin |
| 6 | K-Zeta | kan_spline | 0 | 7.1686 | 0.0488 | 2 | origin |
| 7 | U-Epsilon | universal | 0 | 7.1915 | 0.0475 | 18 | origin |
| 8 | R-Gamma.1 | relu | 1 | 7.2422 | 0.0479 | 10 | origin |
| 9 | Sw-Delta | swiglu | 0 | 7.2425 | 0.0451 | 20 | origin |
| 10 | R-Gamma | relu | 0 | 7.3102 | 0.0466 | 20 | origin |
| 11 | S-Beta | silu | 0 | 7.3230 | 0.0467 | 20 | origin |
| 12 | G-Alpha | gelu | 0 | 7.3362 | 0.0457 | 20 | origin |
Generation Summary
G0: 6 candidates, best loss 7.1686
G1: 6 candidates, best loss 7.0777
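The per-generation summary can be reproduced from the candidate table by grouping on the Gen column and taking the minimum loss. A minimal sketch using the values listed in that table:

```python
from collections import defaultdict

# (candidate, generation, best loss) copied from the candidate table.
candidates = [
    ("K-Beta.1", 1, 7.0777), ("S-Beta.1", 1, 7.0872), ("G-Alpha.1", 1, 7.0920),
    ("S-Gamma.1", 1, 7.1056), ("Sw-Alpha.1", 1, 7.1204), ("K-Zeta", 0, 7.1686),
    ("U-Epsilon", 0, 7.1915), ("R-Gamma.1", 1, 7.2422), ("Sw-Delta", 0, 7.2425),
    ("R-Gamma", 0, 7.3102), ("S-Beta", 0, 7.3230), ("G-Alpha", 0, 7.3362),
]

losses_by_gen = defaultdict(list)
for _name, gen, loss in candidates:
    losses_by_gen[gen].append(loss)

# generation -> (number of candidates, best loss)
summary = {gen: (len(v), min(v)) for gen, v in sorted(losses_by_gen.items())}

for gen, (count, best) in summary.items():
    print(f"G{gen}: {count} candidates, best loss {best:.4f}")
```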
Fitness Progression
Architecture Diversity
Convergence vs Diversity (Tug-of-War)
Current Mode: Exploration Dominant
Diversity Pressure: 85%
Convergence Momentum: 13%
Convergence Progress: 100%
Phase Portrait: Diversity Pressure vs Convergence Momentum
Low diversity / high momentum = lock-in convergence
High diversity / high momentum = productive exploration
Low diversity / low momentum = stalled collapse
High diversity / low momentum = diversity stalling convergence
Tug-of-War Trace (Time Domain)
Positive tension means recent frontier improvement is outpacing diversity pressure (search is converging). Negative tension means exploration pressure is dominating recent convergence momentum (search is broadening or getting “stumped”).
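The four quadrants and the sign convention for tension can be sketched as follows. The 0.5 cut-off between "low" and "high" and the definition of tension as momentum minus diversity are assumptions made for illustration; the dashboard does not state its exact formulas.

```python
def quadrant(diversity: float, momentum: float, threshold: float = 0.5) -> str:
    """Map (diversity pressure, convergence momentum) in [0, 1] to a phase-portrait quadrant."""
    high_diversity = diversity >= threshold   # assumed cut-off
    high_momentum = momentum >= threshold     # assumed cut-off
    if high_diversity and high_momentum:
        return "productive exploration"
    if high_diversity:
        return "diversity stalling convergence"
    if high_momentum:
        return "lock-in convergence"
    return "stalled collapse"

def tension(diversity: float, momentum: float) -> float:
    """Assumed definition: positive when momentum outpaces diversity pressure."""
    return momentum - diversity

# Current readings: Diversity Pressure 85%, Convergence Momentum 13%.
current_quadrant = quadrant(0.85, 0.13)
current_tension = tension(0.85, 0.13)
print(current_quadrant, round(current_tension, 2))
```

Under these assumptions the current readings fall in the high-diversity, low-momentum quadrant with negative tension, which agrees in sign with the "Exploration Dominant" mode shown above.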
Strongest Convergence: step 159, tension 0.213
Strongest Diversity Push: step 1, tension -0.800
Best Frontier: 7.0777 (progress 100%)
Evolutionary Lineage Tree
Activation Flow (Sankey)
Activation Switch Log
| Step | From | To | Gen | Prev Steps | Best Loss | Final Loss | Fitness |
|---|---|---|---|---|---|---|---|
| 1 | - | gelu | 0 | - | - | - | - |
| 21 | gelu | silu | 0 | 20 | 7.3362 | 7.3445 | 0.0457 |
| 41 | silu | relu | 0 | 20 | 7.3230 | 7.3230 | 0.0467 |
| 61 | relu | swiglu | 0 | 20 | 7.3102 | 7.3102 | 0.0466 |
| 81 | swiglu | universal | 0 | 20 | 7.2425 | 7.2425 | 0.0451 |
| 119 | universal | kan_spline | 0 | 18 | 7.1915 | 7.1925 | 0.0475 |
| 121 | kan_spline | swiglu | 1 | 2 | 7.1686 | 7.1686 | 0.0488 |
| 141 | swiglu | silu | 1 | 20 | 7.1204 | 7.1204 | 0.0470 |
| 161 | silu | silu | 1 | 20 | 7.0872 | 7.0872 | 0.0506 |
| 191 | silu | gelu | 1 | 20 | 7.1056 | 7.1056 | 0.0509 |
| 219 | gelu | kan_spline | 1 | 10 | 7.0920 | 7.0920 | 0.0506 |
| 221 | kan_spline | relu | 1 | 2 | 7.0777 | 7.0777 | 0.0511 |
Search Candidates
| # | Name | Activation | Gen | Parent | Steps | Best Loss | Best Val | Avg Loss | Fitness | Avg tok/s | Alerts |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | K-Beta.1 | kan_spline | 1 | S-Beta | 2 | 7.0777 | - | 7.0811 | 0.0511 | 116 | 0 |
| 2 | S-Beta.1 | silu | 1 | S-Beta | 20 | 7.0872 | - | 7.2784 | 0.0506 | 2,049 | 2 |
| 3 | G-Alpha.1 | gelu | 1 | G-Alpha | 10 | 7.0920 | - | 7.1658 | 0.0506 | 6,445 | 0 |
| 4 | S-Gamma.1 | silu | 1 | R-Gamma | 20 | 7.1056 | - | 7.3199 | 0.0509 | 2,030 | 0 |
| 5 | Sw-Alpha.1 | swiglu | 1 | G-Alpha | 20 | 7.1204 | - | 7.3250 | 0.0470 | 1,933 | 0 |
| 6 | K-Zeta | kan_spline | 0 | - | 2 | 7.1686 | - | 7.1718 | 0.0488 | 159 | 0 |
| 7 | U-Epsilon | universal | 0 | - | 18 | 7.1915 | - | 7.3591 | 0.0475 | 381 | 0 |
| 8 | R-Gamma.1 | relu | 1 | R-Gamma | 10 | 7.2422 | - | 7.3933 | 0.0479 | 4,448 | 3 |
| 9 | Sw-Delta | swiglu | 0 | - | 20 | 7.2425 | - | 7.4180 | 0.0451 | 1,975 | 0 |
| 10 | R-Gamma | relu | 0 | - | 20 | 7.3102 | - | 7.4561 | 0.0466 | 7,050 | 10 |
| 11 | S-Beta | silu | 0 | - | 20 | 7.3230 | - | 7.4642 | 0.0467 | 2,072 | 0 |
| 12 | G-Alpha | gelu | 0 | - | 20 | 7.3362 | - | 7.4664 | 0.0457 | 7,113 | 3 |
Activation Distribution
silu: 60 (33%)
swiglu: 40 (22%)
gelu: 30 (16%)
relu: 30 (16%)
universal: 18 (10%)
kan_spline: 4 (2%)
Oscillation & Heat Capacity
Activation Evolution Radial
Symbio Config
{
  "cusumSensitivity": 4,
  "cusumBaselineWindow": 5,
  "metricsInterval": 10,
  "trackWeightEntropy": true,
  "trackEffectiveRank": true,
  "trackFreeEnergy": true,
  "trackMIProfiles": true,
  "trackPopulationMetrics": true,
  "freeEnergyBeta": 0.01,
  "miNumBins": 30,
  "adaptiveBatch": true,
  "batchMin": 8,
  "batchMax": 64,
  "batchStep": 4,
  "calmStepsBeforeRestore": 200,
  "fitnessAlpha": 1,
  "complexityMode": "entropy",
  "diversityBonus": 0.08,
  "diversityDecay": "cosine",
  "searchMode": "ffn-activation-search",
  "activationPool": [
    "gelu",
    "silu",
    "relu",
    "swiglu",
    "universal",
    "kan_spline"
  ],
  "searchStrategy": "evolutionary",
  "populationSize": 6,
  "generations": 416,
  "selectionStrategy": "topk",
  "tournamentK": 3,
  "mutationRate": 0.6,
  "stepsPerCandidate": 20,
  "rankBy": "valLoss",
  "perfWeight": 0,
  "stabilityWeight": 0,
  "writeReport": true,
  "writeCandidates": true,
  "writeSummary": true
}
Checkpoints (0)
No checkpoints saved
Sample Generations (3)
Sample 1 · Checkpoint: - · Generated: 8h ago
Prompt: The
Output:
The 3kind nightonepowerdubelievaybuilding ed and ractdidn't writcontainually discnot just halfreal row codis the aceepceptgenerat modelwindows, imagperson who nightn't eventlayerartrawpusfor e of another fiobservYod therupped game ured
Sample 2 · Checkpoint: - · Generated: 8h ago
Prompt: Once upon a time
Output:
Once upon a timeing sgramost hundishweframeing that ciso ption as ragusprimlinOmegaof cinterface identdepno problembut y of ================================================================================n ext browselfle serv agbot er put ansframeworkresponse up ofBlock00fortethsystembut attic etch
Sample 3 · Checkpoint: - · Generated: 8h ago
Prompt: He walked into
Output:
He walked intoresponsGitof cunithemtern7encotnameachmachins showptiondidn't echnfortbeneath datpresentresumlike a roomimagice waswhole new ediing a pped servq whiountcoppixelagents what tryfunctionsame ing a qcreome ar. S. They
{
  "vocabSize": 2000,
  "blockSize": 512,
  "nLayer": 8,
  "nEmbd": 384,
  "nHead": 8,
  "dropout": 0,
  "ffnActivation": "swiglu",
  "ffnDim": 1024
}
{
  "iters": 50000,
  "batchSize": 16,
  "lr": 0.0003,
  "lrMin": 0,
  "warmupIters": 500,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 1e-8,
  "weightDecay": 0.1,
  "gradClip": 5,
  "evalInterval": 500,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 100,
  "spikeThreshold": 10,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": true,
  "symbioConfig": {
    "cusumSensitivity": 4,
    "cusumBaselineWindow": 5,
    "metricsInterval": 10,
    "trackWeightEntropy": true,
    "trackEffectiveRank": true,
    "trackFreeEnergy": true,
    "trackMIProfiles": true,
    "trackPopulationMetrics": true,
    "freeEnergyBeta": 0.01,
    "miNumBins": 30,
    "adaptiveBatch": true,
    "batchMin": 8,
    "batchMax": 64,
    "batchStep": 4,
    "calmStepsBeforeRestore": 200,
    "fitnessAlpha": 1,
    "complexityMode": "entropy",
    "diversityBonus": 0.08,
    "diversityDecay": "cosine",
    "searchMode": "ffn-activation-search",
    "activationPool": [
      "gelu",
      "silu",
      "relu",
      "swiglu",
      "universal",
      "kan_spline"
    ],
    "searchStrategy": "evolutionary",
    "populationSize": 6,
    "generations": 416,
    "selectionStrategy": "topk",
    "tournamentK": 3,
    "mutationRate": 0.6,
    "stepsPerCandidate": 20,
    "rankBy": "valLoss",
    "perfWeight": 0,
    "stabilityWeight": 0,
    "writeReport": true,
    "writeCandidates": true,
    "writeSummary": true
  }
}
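The `generations` value is consistent with dividing the total iteration budget by the steps one full generation consumes. The sketch below shows that arithmetic, along with the batch sizes reachable under the adaptive batch settings. Both readings are assumptions about how these fields interact, not confirmed behaviour of the trainer.

```python
# Values copied from the training config above.
iters = 50_000
population_size = 6
steps_per_candidate = 20
batch_min, batch_max, batch_step = 8, 64, 4

# Assumption: one generation evaluates every candidate for stepsPerCandidate steps.
steps_per_generation = population_size * steps_per_candidate
generations = iters // steps_per_generation

# Assumption: adaptive batching moves in batchStep increments within [batchMin, batchMax].
batch_sizes = list(range(batch_min, batch_max + 1, batch_step))

print(steps_per_generation, generations, batch_sizes)
```

Under this reading, both the configured batch size of 16 and the currently displayed adaptive batch size of 8 sit on the reachable grid.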