historic_chat_v2_20260225114638_5ke3 · completed · chat · 9.99M params · 5m 10s elapsed · Updated 10m ago
6L / 256D / 8H · helios · bpe-4k · adamw · Created Feb 25, 2026 11:47 AM
Step 500 / 500 · 100.0%
| Metric | Value | Detail |
|---|---|---|
| Loss | 7.2586 | |
| Best Loss | 7.2104 | -13.1% from start |
| Val Loss | 7.3268 | best: 7.3268 |
| Learning Rate | 5.03e-6 | |
| Throughput | 2,615 tok/s | avg |
| Speed | 802 ms/iter | avg |
| Grad Norm | 781.794 | avg: 164.251 |
| Tokens | 890.9K | processed |
| Forward | 84 ms | 10% of step |
| Backward | 683 ms | 85% of step |
| GPU Sync | 10 ms | 1% of step |
| GPU Ops | 596 | per step |
| MFU | 0.5% | model FLOPS utilization |
| Bwd/Fwd | 8.2x | ratio |
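The headline cards can be cross-checked with a little arithmetic. The sketch below is not dashboard code: it assumes each step consumes one full batch of full-length sequences (batch size 8 × context 256, from the Training Config section) and that MFU follows the common `6 × params × tokens/s` estimate of training FLOPs. The implied peak throughput is back-derived from the 0.5% figure, not reported anywhere on this page.

```python
# Values copied from the dashboard cards and the Training Config section.
BATCH_SIZE = 8
CONTEXT = 256
PARAMS = 9.99e6
FORWARD_MS, BACKWARD_MS, SYNC_MS = 84, 683, 10
STEP_MS = 802          # average ms/iter
TOK_PER_S = 2615       # average throughput
MFU = 0.005            # 0.5%
MONITORED_STEPS = 435  # from the "CUSUM Alerts ... of 435 steps" card

# Assumption: every step consumes one full batch of full-length sequences.
tokens_per_step = BATCH_SIZE * CONTEXT
tokens_over_monitored_steps = tokens_per_step * MONITORED_STEPS

# Share of the average step spent in each phase, as whole percents.
phase_share = {
    name: round(100 * ms / STEP_MS)
    for name, ms in [("forward", FORWARD_MS),
                     ("backward", BACKWARD_MS),
                     ("sync", SYNC_MS)]
}

bwd_fwd_ratio = BACKWARD_MS / FORWARD_MS

# Assumption: MFU uses the 6 * params * tokens/s estimate of training FLOPs.
achieved_flops = 6 * PARAMS * TOK_PER_S
implied_peak_flops = achieved_flops / MFU

print(tokens_per_step, tokens_over_monitored_steps)
print(phase_share)
print(f"{bwd_fwd_ratio:.2f}x backward/forward, "
      f"{achieved_flops / 1e12:.3f} TFLOP/s achieved, "
      f"{implied_peak_flops / 1e12:.1f} TFLOP/s implied peak")
```

Under these assumptions the phase shares reproduce the 10% / 85% / 1% cards, and 435 × 2,048 = 890,880 lines up with the 890.9K tokens card. The ratio computed from the displayed timings is about 8.13 rather than 8.2x, which may simply reflect rounding of the displayed milliseconds. The three timed phases sum to 777 ms, so roughly 25 ms of the 802 ms average step is not attributed to any card.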
Loss Curve
Symbio semantics: this chart stitches many fresh candidate evaluations onto one global step axis. Loss resets near switches are expected because candidates are re-initialized. Compare local candidate shapes and the global frontier, not a single continuous model trajectory.
Search Trajectory + Frontier
Candidate-local train/val loss segments on a shared step axis, with switch events and global frontier overlays.
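A minimal sketch of what "global frontier" means in this view, assuming it is the running minimum of each candidate's best training loss taken in evaluation order. The losses are copied from the candidate tables on this page; the running-minimum definition is an assumption, not something the dashboard states.

```python
from itertools import accumulate

# (candidate, best train loss) in evaluation order, per the Activation Switch Log.
segments = [
    ("gen0_gelu_1", 8.1103),
    ("gen0_silu_2", 8.0259),
    ("gen0_relu_3", 7.7401),
    ("gen0_swiglu_4", 7.7973),
    ("gen1_swiglu_5", 7.7882),
    ("gen1_relu_6", 7.7010),
    ("gen1_gelu_7", 7.7595),
    ("gen1_silu_8", 7.8679),
]

# Assumed definition: the frontier is the best loss seen so far across candidates.
frontier = list(accumulate((loss for _, loss in segments), min))

# Candidates that moved the frontier when they were evaluated.
improvers = [
    name
    for i, (name, loss) in enumerate(segments)
    if i == 0 or loss < frontier[i - 1]
]

for (name, loss), best in zip(segments, frontier):
    print(f"{name:14s} local best {loss:.4f}  frontier {best:.4f}")
print("frontier moved by:", improvers)
```

Each candidate's local curve restarts from a fresh initialization, so only the frontier column is comparable across the whole step axis.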
Architecture
| Parameter | Value |
|---|---|
| Layers | 6 |
| Embedding | 256 |
| Heads | 8 |
| Vocab | 4,000 |
| Context | 256 |
| Dropout | 0.1 |
| Parameters | 9.99M |
Training Config
| Parameter | Value |
|---|---|
| Total iters | 500 |
| Batch size | 8 |
| Max LR | 0.0003 |
| Optimizer | adamw |
| Backend | helios |
| Tokenizer | bpe-4k |
| Seed | 42 |
| Weight decay | 0.1 |
| Grad clip | 1 |
| Eval interval | 100 |
[Charts: GPU & VRAM · Learning Rate · Grad Norm · Step Time Breakdown · Clip Telemetry]
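Symbiogenesis · SWIGLU
| Metric | Value | Detail |
|---|---|---|
| Wt Entropy | 1.84 | bits |
| Eff. Rank | 20.0 | |
| Free Energy | 7.2384 | |
| Pop Entropy | 3.911 | nats |
| Complexity | 0.0820 | |
| Fitness | 0.0397 | |
| CUSUM Alerts | 66 | of 435 steps |
| Batch Size | - | adaptive |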
CUSUM Change-Point Monitor
[Charts: Weight Entropy · Effective Rank · Free Energy · Fitness Score · Population Entropy]
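For readers unfamiliar with CUSUM, here is a minimal two-sided sketch of the technique. It is not the dashboard's implementation: mapping `cusumBaselineWindow` (5) to the number of baseline samples and `cusumSensitivity` (4) to an alert threshold in baseline standard deviations is an assumption, and `slack` and the reset-after-alert behavior are hypothetical choices.

```python
from statistics import mean, pstdev

def cusum_alerts(values, baseline_window=5, threshold=4.0, slack=0.5):
    """Return indices where a two-sided CUSUM statistic crosses the threshold.

    The first `baseline_window` values define the reference mean and spread;
    later values are standardized against that baseline.
    """
    baseline = values[:baseline_window]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1e-8  # guard against a perfectly flat baseline

    upper = lower = 0.0
    alerts = []
    for i, x in enumerate(values[baseline_window:], start=baseline_window):
        z = (x - mu) / sigma
        # Accumulate drift beyond the allowed slack in each direction.
        upper = max(0.0, upper + z - slack)
        lower = max(0.0, lower - z - slack)
        if upper > threshold or lower > threshold:
            alerts.append(i)
            upper = lower = 0.0  # hypothetical choice: restart after an alert
    return alerts

# Toy series: stable around 1.0, then a jump to 3.0 at index 7.
series = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.05, 3.0, 3.0]
print(cusum_alerts(series))
```

A short baseline window makes the reference spread noisy, so a monitor configured this way can fire often on early-training metrics.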
Phase Change / Gelation
| Metric | Value |
|---|---|
| Current | Stable |
| Stability | 67% |
| Phase Changes | 1 |
| Regime Shifts | 0 |
Training is in a steady state — loss is decreasing predictably. The model is learning without disruptions.
Phase Timeline: Step 1 to Step 462
Loss Oscillation (Harmonic Analysis)
Evolutionary Search
| Metric | Value |
|---|---|
| Generations | 2 |
| Candidates | 8 |
| Activations | 4 |
| Best Loss | 7.7010 |
| Total Steps | 152 |
| # | Candidate | Activation | Gen | Loss | Fitness | Steps | Mutation |
|---|---|---|---|---|---|---|---|
| 1 | gen1_relu_6 | relu | 1 | 7.7010 | 0.0327 | 20 | origin |
| 2 | gen0_relu_3 | relu | 0 | 7.7401 | 0.0318 | 20 | origin |
| 3 | gen1_gelu_7 | gelu | 1 | 7.7595 | 0.0315 | 20 | origin |
| 4 | gen1_swiglu_5 | swiglu | 1 | 7.7882 | 0.0296 | 18 | origin |
| 5 | gen0_swiglu_4 | swiglu | 0 | 7.7973 | 0.0301 | 19 | origin |
| 6 | gen1_silu_8 | silu | 1 | 7.8679 | 0.0280 | 17 | origin |
| 7 | gen0_silu_2 | silu | 0 | 8.0259 | 0.0284 | 18 | origin |
| 8 | gen0_gelu_1 | gelu | 0 | 8.1103 | 0.0266 | 20 | origin |
Generation Summary
G0: 4 candidates, best loss 7.7401
G1: 4 candidates, best loss 7.7010
Fitness Progression
Architecture Diversity
Convergence vs Diversity (Tug-of-War)
| Metric | Value |
|---|---|
| Current Mode | Exploration Dominant |
| Diversity Pressure | 92% |
| Convergence Momentum | 48% |
| Convergence Progress | 100% |
Phase Portrait: Diversity Pressure vs Convergence Momentum
Low diversity / high momentum = lock-in convergence
High diversity / high momentum = productive exploration
Low diversity / low momentum = stalled collapse
High diversity / low momentum = diversity stalling convergence
Tug-of-War Trace (Time Domain)
Positive tension means recent frontier improvement is outpacing diversity pressure (search is converging). Negative tension means exploration pressure is dominating recent convergence momentum (search is broadening or getting “stumped”).
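The quadrant legend and the sign convention for tension can be sketched as follows. The dashboard does not publish its formulas, so both functions are hypothetical: `tension` is assumed to be convergence momentum minus diversity pressure (both on a 0 to 1 scale), and the quadrant boundary is assumed to sit at 0.5.

```python
def tension(convergence_momentum: float, diversity_pressure: float) -> float:
    """Assumed definition: positive when convergence outpaces diversity."""
    return convergence_momentum - diversity_pressure

def quadrant(diversity_pressure: float, convergence_momentum: float,
             split: float = 0.5) -> str:
    """Map a point of the phase portrait to the legend's four regimes."""
    high_diversity = diversity_pressure >= split
    high_momentum = convergence_momentum >= split
    if high_diversity and high_momentum:
        return "productive exploration"
    if high_diversity:
        return "diversity stalling convergence"
    if high_momentum:
        return "lock-in convergence"
    return "stalled collapse"

# Current readings from the cards: diversity pressure 92%, momentum 48%.
print(round(tension(0.48, 0.92), 2))
print(quadrant(0.92, 0.48))
```

With the current readings this lands in the high-diversity, low-momentum quadrant with negative tension, which is at least consistent with the "Exploration Dominant" mode shown above.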
| Metric | Value | Detail |
|---|---|---|
| Strongest Convergence | step 9 | tension 0.200 |
| Strongest Diversity Push | step 103 | tension -0.980 |
| Best Frontier | 7.7010 | progress 100% |
Evolutionary Lineage Tree
Activation Flow (Sankey)
Activation Switch Log
| Step | From | To | Gen | Prev Steps | Best Loss | Final Loss | Fitness |
|---|---|---|---|---|---|---|---|
| 1 | - | gelu | 0 | - | - | - | - |
| 21 | gelu | silu | 0 | 20 | 8.1103 | 8.1103 | 0.0266 |
| 41 | silu | relu | 0 | 18 | 8.0259 | 8.0284 | 0.0284 |
| 61 | relu | swiglu | 0 | 20 | 7.7401 | 7.7401 | 0.0318 |
| 81 | swiglu | swiglu | 1 | 19 | 7.7973 | 7.7973 | 0.0301 |
| 101 | swiglu | relu | 1 | 18 | 7.7882 | 7.7882 | 0.0296 |
| 121 | relu | gelu | 1 | 20 | 7.7010 | 7.7010 | 0.0327 |
| 141 | gelu | silu | 1 | 20 | 7.7595 | 7.7648 | 0.0315 |
Search Candidates
| # | Name | Activation | Gen | Parent | Steps | Best Loss | Best Val | Avg Loss | Fitness | Avg tok/s | Alerts |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | gen1_swiglu_5 | swiglu | 1 | - | 18 | 7.7882 | 8.3418 | 8.0438 | 0.0296 | 2,615 | 0 |
| 2 | gen1_relu_6 | relu | 1 | - | 20 | 7.7010 | - | 7.9813 | 0.0327 | 6,960 | 12 |
| 3 | gen0_relu_3 | relu | 0 | - | 20 | 7.7401 | - | 8.0002 | 0.0318 | 6,716 | 11 |
| 4 | gen1_gelu_7 | gelu | 1 | - | 20 | 7.7595 | - | 8.0051 | 0.0315 | 6,903 | 20 |
| 5 | gen0_swiglu_4 | swiglu | 0 | - | 19 | 7.7973 | - | 8.0485 | 0.0301 | 2,627 | 4 |
| 6 | gen1_silu_8 | silu | 1 | - | 17 | 7.8679 | - | 8.0770 | 0.0280 | 2,872 | 12 |
| 7 | gen0_silu_2 | silu | 0 | - | 18 | 8.0259 | - | 8.1620 | 0.0284 | 2,555 | 2 |
| 8 | gen0_gelu_1 | gelu | 0 | - | 20 | 8.1103 | - | 8.2350 | 0.0266 | 6,571 | 5 |
Activation Distribution
| Activation | Steps | Share |
|---|---|---|
| gelu | 40 | 26% |
| relu | 40 | 26% |
| swiglu | 37 | 24% |
| silu | 35 | 23% |
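The distribution above can be reproduced from the Search Candidates table by summing steps per activation. The sketch below copies the per-candidate steps and alert counts from that table; reading the distribution as "steps per activation" is an inference from the totals matching, not something the dashboard labels explicitly.

```python
from collections import defaultdict

# (name, activation, steps, alerts) copied from the Search Candidates table.
candidates = [
    ("gen1_swiglu_5", "swiglu", 18, 0),
    ("gen1_relu_6", "relu", 20, 12),
    ("gen0_relu_3", "relu", 20, 11),
    ("gen1_gelu_7", "gelu", 20, 20),
    ("gen0_swiglu_4", "swiglu", 19, 4),
    ("gen1_silu_8", "silu", 17, 12),
    ("gen0_silu_2", "silu", 18, 2),
    ("gen0_gelu_1", "gelu", 20, 5),
]

steps_by_activation = defaultdict(int)
for _name, activation, steps, _alerts in candidates:
    steps_by_activation[activation] += steps

total_steps = sum(steps_by_activation.values())
total_alerts = sum(alerts for *_rest, alerts in candidates)

# Whole-percent share of search steps spent on each activation.
share = {
    activation: round(100 * steps / total_steps)
    for activation, steps in steps_by_activation.items()
}

print(dict(steps_by_activation), total_steps, total_alerts)
print(share)
```

The totals agree with the other cards on this page: 152 total search steps and 66 CUSUM alerts. The rounded shares sum to 99%, which is why the displayed percentages do not add up to 100.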
Oscillation & Heat Capacity
Activation Evolution Radial
Symbio Config
{
  "cusumSensitivity": 4,
  "cusumBaselineWindow": 5,
  "metricsInterval": 10,
  "trackWeightEntropy": true,
  "trackEffectiveRank": true,
  "trackFreeEnergy": true,
  "trackMIProfiles": false,
  "trackPopulationMetrics": true,
  "freeEnergyBeta": 0.01,
  "miNumBins": 30,
  "adaptiveBatch": false,
  "batchMin": 8,
  "batchMax": 64,
  "batchStep": 4,
  "calmStepsBeforeRestore": 200,
  "fitnessAlpha": 1,
  "complexityMode": "entropy",
  "diversityBonus": 0,
  "diversityDecay": "none",
  "searchMode": "ffn-activation-search",
  "activationPool": [
    "gelu",
    "silu",
    "relu",
    "swiglu"
  ],
  "searchStrategy": "evolutionary",
  "populationSize": 4,
  "generations": 2,
  "selectionStrategy": "topk",
  "tournamentK": 3,
  "mutationRate": 0.5,
  "stepsPerCandidate": 20,
  "rankBy": "valLoss",
  "perfWeight": 0,
  "stabilityWeight": 0,
  "writeReport": true,
  "writeCandidates": true,
  "writeSummary": true
}
Sample Generations (5)
# · Checkpoint · Prompt (preview) · Generated
1 · - · <|user|> Hello, how are you? <|assistant|> · 12h ago
Prompt
<|user|> Hello, how are you? <|assistant|>
Output
<|user|> Hello, how are you? <|assistant|>ayour and ethe the cthat the to a , that ; and to is a es e, in geto ing your and , ing e’s is a that a I , the e, and and I , a or to or s e, and de
2 · - · <|user|> What do you like to do for fun? <|assistant|> · 12h ago
Prompt
<|user|> What do you like to do for fun? <|assistant|>
Output
<|user|> What do you like to do for fun? <|assistant|>and or is es e, and cin eyour must ’s it in yes ima es, , to , for that mit or e, is a and the as , to tes is for I we tis is a ato , ing I
3 · - · <|user|> Tell me about yourself. <|assistant|> · 12h ago
Prompt
<|user|> Tell me about yourself. <|assistant|>
Output
<|user|> Tell me about yourself. <|assistant|>the aed gI e, es, a; we we the ed ms to —that is must s n, for es that is, is a e, it eaiy eto ; it , sand s, in es cto ns, ing
4 · - · <|user|> What is the meaning of life? <|assistant|> · 12h ago
Prompt
<|user|> What is the meaning of life? <|assistant|>
Output
<|user|> What is the meaning of life? <|assistant|>eds of predined and tt; not m, or to cingthe living ythe Mthe ἀwhich divine the mof h, they your through Wathat moral , to and a conin ye, and rearis and human is to an ed or not sh, for moral hyleft t de—fons—with the sanyaniales must be y of the sad, but , is your on the c? <|assistant|> towperpetrule whthe fewtionintriguing precs is a laws of
5 · - · <|user|> Can you help me with something? <|assistant|> · 12h ago
Prompt
<|user|> Can you help me with something? <|assistant|>
Output
<|user|> Can you help me with something? <|assistant|>, must a divine e, to dekanot our chs but that inaand as and flamts of of your h, of embodfrom the s cs al must the burden, as a and cam, to divine to os our our hent ts may shadow oned must serve sy, in ’s your ses, a laws ; true a from a that moral and that and to from se, with higher al recepfor conwith mere o aning by to unes
{
  "vocabSize": 4000,
  "blockSize": 256,
  "nLayer": 6,
  "nEmbd": 256,
  "nHead": 8,
  "dropout": 0.1,
  "ffnActivation": "swiglu",
  "ffnDim": 704
}
{
  "iters": 500,
  "batchSize": 8,
  "lr": 0.0003,
  "lrMin": 0.000005,
  "warmupIters": 50,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 0.000001,
  "weightDecay": 0.1,
  "gradClip": 1,
  "evalInterval": 100,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe-4k",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 200,
  "spikeThreshold": 10,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": true,
  "symbioConfig": {
    "cusumSensitivity": 4,
    "cusumBaselineWindow": 5,
    "metricsInterval": 10,
    "trackWeightEntropy": true,
    "trackEffectiveRank": true,
    "trackFreeEnergy": true,
    "trackMIProfiles": false,
    "trackPopulationMetrics": true,
    "freeEnergyBeta": 0.01,
    "miNumBins": 30,
    "adaptiveBatch": false,
    "batchMin": 8,
    "batchMax": 64,
    "batchStep": 4,
    "calmStepsBeforeRestore": 200,
    "fitnessAlpha": 1,
    "complexityMode": "entropy",
    "diversityBonus": 0,
    "diversityDecay": "none",
    "searchMode": "ffn-activation-search",
    "activationPool": [
      "gelu",
      "silu",
      "relu",
      "swiglu"
    ],
    "searchStrategy": "evolutionary",
    "populationSize": 4,
    "generations": 2,
    "selectionStrategy": "topk",
    "tournamentK": 3,
    "mutationRate": 0.5,
    "stepsPerCandidate": 20,
    "rankBy": "valLoss",
    "perfWeight": 0,
    "stabilityWeight": 0,
    "writeReport": true,
    "writeCandidates": true,
    "writeSummary": true
  }
}
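The `lr`, `lrMin`, and `warmupIters` fields suggest a warmup-then-decay schedule, and the final logged learning rate (5.03e-6) sits just above `lrMin`. The config does not name the schedule shape, so the sketch below assumes the common linear warmup followed by cosine decay; `lr_at` is a hypothetical helper, not a helios API, and it is not expected to reproduce the logged value digit for digit.

```python
import math

def lr_at(step: int, iters: int = 500, warmup: int = 50,
          lr: float = 3e-4, lr_min: float = 5e-6) -> float:
    """Learning rate at a 0-indexed step (assumed warmup + cosine shape)."""
    if step < warmup:
        # Linear ramp that reaches the peak rate on the last warmup step.
        return lr * (step + 1) / warmup
    # Cosine decay from lr down to lr_min over the remaining iterations.
    progress = (step - warmup) / max(1, iters - warmup)
    return lr_min + 0.5 * (lr - lr_min) * (1 + math.cos(math.pi * progress))

for step in (0, 49, 50, 275, 499):
    print(step, f"{lr_at(step):.3e}")
```

Under this assumed shape the rate spends the last few dozen steps within a few percent of `lrMin`, so late-run progress is limited mainly by the small step size.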