novels_all_20260225184007_tt6w · active · novels · 7.21M params
27m 46s elapsed · ~32h 14m remaining
6L / 288D / 6H · helios · bpe · adamw · Created Feb 25, 2026 6:40 PM
Step 1,080 / 50,000 (2.2%)
| Metric | Value | Detail |
|---|---|---|
| Loss | 4.6011 | |
| Best Loss | 4.3488 | -40.0% from start |
| Val Loss | 5.1103 | best: 4.9668 |
| Learning Rate | 3.00e-4 | |
| Throughput | 2,555 tok/s | avg |
| Speed | 2,373 ms/iter | avg |
| Grad Norm | 0.873 | avg: 0.824 |
| Tokens processed | 5.53M | |
| Forward | 185 ms | 8% of step |
| Backward | 2158 ms | 91% of step |
| GPU Sync | 10 ms | 0% of step |
| GPU Ops | 629 | per step |
| MFU (model FLOPS util) | 0.3% | |
| Bwd/Fwd ratio | 11.7x | |
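The headline figures can be cross-checked with simple arithmetic. The sketch below assumes each step consumes one full batch of `batch size × context` tokens (20 × 256, per the Training Config and Architecture panels) with no gradient accumulation; the variable names are illustrative only.

```python
# Cross-check of the dashboard numbers; all inputs are copied from the panels.
steps = 1_080
batch_size = 20        # sequences per step
context = 256          # tokens per sequence

tokens_per_step = batch_size * context
tokens_processed = steps * tokens_per_step

forward_ms, backward_ms, sync_ms = 185, 2158, 10
step_ms = forward_ms + backward_ms + sync_ms
bwd_fwd_ratio = backward_ms / forward_ms

print(tokens_per_step)                    # 5120
print(f"{tokens_processed / 1e6:.2f}M")   # 5.53M
print(step_ms)                            # 2353
print(f"{bwd_fwd_ratio:.1f}x")            # 11.7x
```

The three phase timings sum to 2,353 ms, slightly under the 2,373 ms/iter average. The two need not match if the breakdown and the average cover different windows of steps.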
Loss Curve
Symbio semantics: this chart stitches many fresh candidate evaluations onto one global step axis. Loss resets near switches are expected because candidates are re-initialized. Compare local candidate shapes and the global frontier, not a single continuous model trajectory.
Search semantics
Validation / selection
Run diagnostics
Search Trajectory + Frontier
Candidate-local train/val loss segments on a shared step axis, with switch events and global frontier overlays.
Search-aware view
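To make the stitched-axis semantics concrete, here is a minimal sketch of how candidate-local loss segments could be laid onto one global step axis, with the global frontier taken as the running minimum. The segment data and the helper name `stitch` are hypothetical and are not taken from the Symbio implementation.

```python
from itertools import accumulate

# Hypothetical candidate-local segments: (candidate name, losses per local step).
segments = [
    ("cand_a", [7.5, 7.4, 7.38]),
    ("cand_b", [7.9, 7.3, 7.22]),   # loss jumps at the switch, then recovers
    ("cand_c", [7.6, 7.1, 7.02]),
]

def stitch(segments):
    """Place every segment on one global step axis (1-based)."""
    rows, step = [], 0
    for name, losses in segments:
        for local_step, loss in enumerate(losses, start=1):
            step += 1
            rows.append((step, name, local_step, loss))
    return rows

rows = stitch(segments)
# Global frontier: best loss seen so far across all candidates.
frontier = list(accumulate((r[3] for r in rows), min))
# Switch events: first global step of every candidate after the first one.
switches = [r[0] for r in rows if r[2] == 1 and r[0] > 1]

print(frontier)   # [7.5, 7.4, 7.38, 7.38, 7.3, 7.22, 7.22, 7.1, 7.02]
print(switches)   # [4, 7]
```

The frontier never rises even though the raw curve jumps at each switch, which is why the chart asks readers to compare segment shapes against the frontier rather than follow a single line.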
Architecture
Layers: 6
Embedding: 288
Heads: 6
Vocab: 2,000
Context: 256
Dropout: 0
Parameters: 7.21M
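As a rough sanity check on the parameter count, the sketch below counts only the weight matrices of a GPT-style decoder. It assumes learned positional embeddings, an untied output head, and a gated three-matrix FFN of width 768 (the `ffnDim` value in the model config at the end of this page). The dashboard confirms none of these choices, and `estimate_params` is a hypothetical helper.

```python
def estimate_params(vocab, context, n_layer, n_embd, ffn_dim,
                    gated_ffn=True, tied_head=False):
    """Rough weight-matrix count for a GPT-style decoder (biases and norms ignored)."""
    tok_emb = vocab * n_embd
    pos_emb = context * n_embd                         # assumes learned positions
    attn = 4 * n_embd * n_embd                         # Q, K, V, and output projections
    ffn = (3 if gated_ffn else 2) * n_embd * ffn_dim   # gated FFNs use three matrices
    head = 0 if tied_head else vocab * n_embd          # assumes an untied output head
    return tok_emb + pos_emb + n_layer * (attn + ffn) + head

total = estimate_params(vocab=2000, context=256, n_layer=6, n_embd=288, ffn_dim=768)
print(f"{total:,}")  # 7,197,696
```

Under these assumptions the estimate is about 7.20M, close to the reported 7.21M. Any remainder would plausibly be biases and normalization parameters, if the model has them.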
Training Config
Total iters: 50,000
Batch size: 20
Max LR: 0.0003
Optimizer: adamw
Backend: helios
Tokenizer: bpe
Seed: 42
Weight decay: 0.1
Grad clip: 5
Eval interval: 100
GPU & VRAM
Learning Rate
Grad Norm
Step Time Breakdown
Clip Telemetry
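Symbiogenesis · SWIGLU

| Metric | Value | Detail |
|---|---|---|
| Wt Entropy | 1.99 | bits |
| Eff. Rank | 20.0 | |
| Free Energy | 4.6210 | |
| Pop Entropy | 3.909 | nats |
| Complexity | 0.0886 | |
| Fitness | 0.0900 | |
| CUSUM Alerts | 1074 | of 1080 steps |
| Batch Size | - | adaptive |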
CUSUM Change-Point Monitor
Weight Entropy
Effective Rank
Free Energy
Fitness Score
Population Entropy
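For readers unfamiliar with CUSUM change-point monitoring, the following is a generic two-sided CUSUM on z-scores. It is a textbook-style sketch, not the Symbio monitor. Mapping `baseline_window` and `threshold` to the `cusumBaselineWindow` (5) and `cusumSensitivity` (4) settings is an assumption, and `drift` is a hypothetical slack parameter.

```python
from statistics import mean, pstdev

def cusum_alerts(values, baseline_window=5, threshold=4.0, drift=0.5):
    """Two-sided CUSUM on z-scores; returns indices where an alert fires."""
    base = values[:baseline_window]
    mu = mean(base)
    sigma = pstdev(base) or 1e-8          # avoid division by zero on a flat baseline
    pos = neg = 0.0
    alerts = []
    for i, v in enumerate(values[baseline_window:], start=baseline_window):
        z = (v - mu) / sigma
        pos = max(0.0, pos + z - drift)   # accumulates upward shifts
        neg = max(0.0, neg - z - drift)   # accumulates downward shifts
        if pos > threshold or neg > threshold:
            alerts.append(i)
            pos = neg = 0.0               # restart after signalling
    return alerts

stable = [1.0, 1.1, 0.9, 1.0, 1.0, 1.05, 0.95, 1.0]
shifted = stable + [2.0, 2.1, 2.0]

print(cusum_alerts(stable))    # []
print(cusum_alerts(shifted))   # [8, 9, 10]
```

A detector of this kind fires almost continuously when the baseline is short and the monitored signal drifts steadily. That might be one reason for the 1,074 alerts in 1,080 steps, though the dashboard does not say so.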
Phase Change / Gelation
Current: Regime Shift
Stability: 0%
Phase Changes: 22
Regime Shifts: 13
Major training disruption: multiple monitors triggered simultaneously, with gradient behavior, clipping, and/or throughput changing together. This can mean the model has moved into a new region of the loss landscape, or that training is becoming unstable. Check the learning rate and gradient norms.
Phase Timeline
Step 1 to Step 1051
Loss Oscillation (Harmonic Analysis)
Evolutionary Search
Generations: 8
Candidates: 44
Activations: 19
Best Loss: 4.3488
Total Steps: 1,080
| # | Candidate | Activation | Gen | Loss | Fitness | Steps | Mutation |
|---|---|---|---|---|---|---|---|
| 1 | sq+0.33·silu+0.06·silu-Alpha.1.2.3.4.5.6.7 | sq+0.33·silu+0.06·silu | 7 | 4.3488 | 0.0960 | 25 | add_term |
| 2 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6.7 | (silu+0.15·id)×silu | 7 | 4.4204 | 0.0961 | 25 | clone |
| 3 | sq+0.33·silu-Alpha.1.2.3.4.5.6.7 | sq+0.33·silu | 7 | 4.4519 | 0.0951 | 25 | clone |
| 4 | sq+0.31·silu-Alpha.1.2.3.4.5.6 | sq+0.31·silu | 6 | 4.4653 | 0.0912 | 25 | perturb_scale |
| 5 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 0.78·relu+0.37·id | 6 | 4.4662 | 0.0953 | 25 | prune |
| 6 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | (silu+0.15·id)×silu | 6 | 4.4736 | 0.0941 | 25 | prune |
| 7 | 0.78·relu+0.37·id+0.1·gelu-Beta.1.2.3.4.5.6.7 | 0.78·relu+0.37·id+0.1·gelu | 7 | 4.5281 | 0.0900 | 5 | add_term |
| 8 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | (silu+0.15·id)×silu | 6 | 4.5351 | 0.0896 | 25 | clone |
| 9 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | sq+0.33·silu | 6 | 4.5436 | 0.0898 | 25 | clone |
| 10 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6.7 | 0.78·relu+0.37·id | 7 | 4.5455 | 0.0922 | 25 | clone |
| 11 | sq+0.33·silu-Alpha.1.2.3.4.5 | sq+0.33·silu | 5 | 4.6168 | 0.0901 | 25 | clone |
| 12 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 0.78·relu+0.37·id | 6 | 4.6174 | 0.0887 | 25 | clone |
| 13 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | sq+0.33·silu | 6 | 4.6359 | 0.0897 | 25 | clone |
| 14 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 0.78·relu+0.37·id | 5 | 4.7129 | 0.0866 | 25 | prune |
| 15 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5 | (silu+0.15·id)×silu | 5 | 4.7263 | 0.0880 | 25 | inject_gate |
| 16 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 0.78·relu+0.37·id | 5 | 4.7783 | 0.0854 | 25 | clone |
| 17 | sq+0.33·silu-Alpha.1.2.3.4.5 | sq+0.33·silu | 5 | 4.8448 | 0.0855 | 25 | clone |
| 18 | 0.78·relu+0.22·id+0.2·silu-Beta.1.2.3.4 | 0.78·relu+0.22·id+0.2·silu | 4 | 4.8632 | 0.0852 | 25 | add_term |
| 19 | (silu+0.15·silu)×relu-Alpha.1.2.3.4 | (silu+0.15·silu)×relu | 4 | 4.8660 | 0.0834 | 25 | inject_gate |
| 20 | silu+0.33·silu-Alpha.1.2.3.4 | silu+0.33·silu | 4 | 4.9071 | 0.0838 | 25 | prune |
Showing top 20 of 44 candidates
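The activation names in the table read like small formulas over the basis functions. Below is a sketch of how three of them might be evaluated on a scalar input. It assumes `sq` means x², `id` means the identity, `·` is scalar multiplication, and `×` is an elementwise product (a gate). The function names are hypothetical.

```python
import math

def silu(x):
    return x / (1.0 + math.exp(-x))   # fine for moderate |x|; not overflow-safe

def relu(x):
    return max(0.0, x)

def identity(x):
    return x

def square(x):
    return x * x

def sq_plus_silu(x):          # assumed reading of "sq+0.33·silu"
    return square(x) + 0.33 * silu(x)

def gated_silu(x):            # assumed reading of "(silu+0.15·id)×silu"
    return (silu(x) + 0.15 * identity(x)) * silu(x)

def relu_id_blend(x):         # assumed reading of "0.78·relu+0.37·id"
    return 0.78 * relu(x) + 0.37 * identity(x)

for x in (-1.0, 0.0, 2.0):
    print(x, sq_plus_silu(x), gated_silu(x), relu_id_blend(x))
```

Under this reading, `0.78·relu+0.37·id` behaves like a leaky ReLU with slope 0.37 for negative inputs and 1.15 for positive ones.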
Generation Summary
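| Generation | Candidates | Best Loss |
|---|---|---|
| G0 | 8 | 6.2919 |
| G1 | 4 | 5.6304 |
| G2 | 4 | 5.2347 |
| G3 | 4 | 4.9947 |
| G4 | 7 | 4.8632 |
| G5 | 5 | 4.6168 |
| G6 | 7 | 4.4653 |
| G7 | 5 | 4.3488 |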
Fitness Progression
Architecture Diversity
Convergence vs Diversity (Tug-of-War)
Current Mode: Exploration Dominant
Diversity Pressure: 92%
Convergence Momentum: 22%
Convergence Progress: 100%
Phase Portrait: Diversity Pressure vs Convergence Momentum
Low diversity / high momentum = lock-in convergence
High diversity / high momentum = productive exploration
Low diversity / low momentum = stalled collapse
High diversity / low momentum = diversity stalling convergence
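The four quadrants of the phase portrait can be written as a small classifier. This is only a sketch: the 0.5 cut-off between "low" and "high" is an assumption (the dashboard does not state its thresholds), and `search_regime` is a hypothetical name.

```python
def search_regime(diversity, momentum, threshold=0.5):
    """Map (diversity pressure, convergence momentum) in [0, 1] to a quadrant label."""
    high_div = diversity >= threshold
    high_mom = momentum >= threshold
    if high_div and high_mom:
        return "productive exploration"
    if high_div:
        return "diversity stalling convergence"
    if high_mom:
        return "lock-in convergence"
    return "stalled collapse"

# Current readings from the panel: 92% diversity pressure, 22% convergence momentum.
print(search_regime(0.92, 0.22))  # diversity stalling convergence
```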
Tug-of-War Trace (Time Domain)
Positive tension means recent frontier improvement is outpacing diversity pressure (search is converging). Negative tension means exploration pressure is dominating recent convergence momentum (search is broadening or getting “stumped”).
Strongest Convergence: step 41, tension 0.403
Strongest Diversity Push: step 506, tension -0.920
Best Frontier: 4.3488 (progress 100%)
Evolutionary Lineage Tree
Activation Flow (Sankey)
Activation Switch Log
| Step | From | | To | Gen (To) | Prev Steps | Prev Best Loss | Prev Final Loss | Prev Fitness | Tree |
|---|---|---|---|---|---|---|---|---|---|
| 1 | - | → | silu | 0 | - | - | - | - | |
| 26 | silu | → | relu | 0 | 25 | 7.3787 | 7.3787 | 0.0384 | |
| 51 | relu | → | gelu | 0 | 25 | 7.2245 | 7.2245 | 0.0414 | |
| 76 | gelu | → | id | 0 | 25 | 7.0201 | 7.0284 | 0.0440 | |
| 101 | id | → | sq | 0 | 25 | 6.8981 | 6.9248 | 0.0461 | |
| 126 | sq | → | silu | 0 | 25 | 6.8279 | 6.8306 | 0.0467 | |
| 151 | silu | → | relu | 0 | 25 | 6.6293 | 6.7193 | 0.0494 | |
| 176 | relu | → | gelu | 0 | 25 | 6.4735 | 6.5512 | 0.0514 | |
| 201 | gelu | → | silu+0.15·sq | 1 | 25 | 6.2919 | 6.2938 | 0.0568 | |
| 226 | silu+0.15·sq | → | relu | 1 | 25 | 6.0725 | 6.1432 | 0.0601 | |
| 251 | relu | → | silu | 1 | 25 | 5.9360 | 5.9360 | 0.0636 | |
| 276 | silu | → | relu | 1 | 25 | 5.8074 | 5.8074 | 0.0646 | |
| 301 | relu | → | silu+0.15·silu | 2 | 25 | 5.6304 | 5.6759 | 0.0686 | |
| 326 | silu+0.15·silu | → | relu | 2 | 25 | 5.5322 | 5.5609 | 0.0696 | |
| 351 | relu | → | silu+0.15·sq | 2 | 25 | 5.3751 | 5.4431 | 0.0734 | |
| 376 | silu+0.15·sq | → | 0.9·relu+0.1·id | 2 | 25 | 5.2700 | 5.3156 | 0.0728 | |
| 401 | 0.9·relu+0.1·id | → | silu+0.33·silu | 3 | 25 | 5.2347 | 5.2630 | 0.0771 | |
| 426 | silu+0.33·silu | → | 0.78·relu+0.22·id | 3 | 25 | 5.1780 | 5.1780 | 0.0786 | |
| 451 | 0.78·relu+0.22·id | → | silu+0.15·silu | 3 | 25 | 5.0842 | 5.0842 | 0.0810 | |
| 476 | silu+0.15·silu | → | relu | 3 | 25 | 4.9947 | 5.0719 | 0.0790 | |
| 501 | relu | → | sq+0.33·silu | 4 | 25 | 5.6642 | 9.9960 | 0.0080 | |
| 526 | sq+0.33·silu | → | 0.78·relu+0.37·id | 4 | 25 | 5.6001 | 6.0304 | 0.0591 | |
| 551 | 0.78·relu+0.37·id | → | silu+0.15·id | 4 | 25 | 5.1379 | 5.9827 | 0.0618 | |
| 576 | silu+0.15·id | → | 0.9·relu+0.1·id | 4 | 25 | 4.9926 | 5.1049 | 0.0811 | |
| 601 | 0.9·relu+0.1·id | → | silu+0.33·silu | 4 | 25 | 4.9079 | 4.9079 | 0.0853 | |
| 626 | silu+0.33·silu | → | 0.78·relu+0.22·id+0.2·silu | 4 | 25 | 4.9071 | 4.9326 | 0.0838 | |
| 651 | 0.78·relu+0.22·id+0.2·silu | → | (silu+0.15·silu)×relu | 4 | 25 | 4.8632 | 5.0372 | 0.0852 | |
| 676 | (silu+0.15·silu)×relu | → | sq+0.33·silu | 5 | 25 | 4.8660 | 4.9326 | 0.0834 | |
| 701 | sq+0.33·silu | → | 0.78·relu+0.37·id | 5 | 25 | 4.8448 | 4.8448 | 0.0855 | |
| 726 | 0.78·relu+0.37·id | → | (silu+0.15·id)×silu | 5 | 25 | 4.7783 | 4.9337 | 0.0854 | |
| 751 | (silu+0.15·id)×silu | → | sq+0.33·silu | 5 | 25 | 4.7263 | 4.7711 | 0.0880 | |
| 776 | sq+0.33·silu | → | 0.78·relu+0.37·id | 5 | 25 | 4.6168 | 4.7503 | 0.0901 | |
| 801 | 0.78·relu+0.37·id | → | sq+0.33·silu | 6 | 25 | 4.7129 | 4.8043 | 0.0866 | |
| 826 | sq+0.33·silu | → | 0.78·relu+0.37·id | 6 | 25 | 4.6359 | 4.7508 | 0.0897 | |
| 851 | 0.78·relu+0.37·id | → | (silu+0.15·id)×silu | 6 | 25 | 4.6174 | 4.6899 | 0.0887 | |
| 876 | (silu+0.15·id)×silu | → | sq+0.33·silu | 6 | 25 | 4.5351 | 4.7097 | 0.0896 | |
| 901 | sq+0.33·silu | → | sq+0.31·silu | 6 | 25 | 4.5436 | 4.6415 | 0.0898 | |
| 926 | sq+0.31·silu | → | 0.78·relu+0.37·id | 6 | 25 | 4.4653 | 4.5667 | 0.0912 | |
| 951 | 0.78·relu+0.37·id | → | (silu+0.15·id)×silu | 6 | 25 | 4.4662 | 4.4662 | 0.0953 | |
| 976 | (silu+0.15·id)×silu | → | sq+0.33·silu | 7 | 25 | 4.4736 | 4.5439 | 0.0941 | |
| 1001 | sq+0.33·silu | → | 0.78·relu+0.37·id | 7 | 25 | 4.4519 | 4.4589 | 0.0951 | |
| 1026 | 0.78·relu+0.37·id | → | (silu+0.15·id)×silu | 7 | 25 | 4.5455 | 4.6475 | 0.0922 | |
| 1051 | (silu+0.15·id)×silu | → | sq+0.33·silu+0.06·silu | 7 | 25 | 4.4204 | 4.4204 | 0.0961 | |
| 1076 | sq+0.33·silu+0.06·silu | → | 0.78·relu+0.37·id+0.1·gelu | 7 | 25 | 4.3488 | 4.3488 | 0.0960 | |
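The switch steps in the log (1, 26, 51, ...) follow from the fixed 25-step budget per candidate (`stepsPerCandidate` in the Symbio config). A minimal sketch of that bookkeeping, with hypothetical helper names:

```python
STEPS_PER_CANDIDATE = 25   # "stepsPerCandidate" in the Symbio config

def candidate_slot(step):
    """Return (0-based candidate index, steps spent in it) for a 1-based global step."""
    index = (step - 1) // STEPS_PER_CANDIDATE
    local = (step - 1) % STEPS_PER_CANDIDATE + 1
    return index, local

def switch_step(index):
    """First global step of the candidate at this 0-based index."""
    return index * STEPS_PER_CANDIDATE + 1

print(candidate_slot(1080))   # (43, 5)
print(switch_step(43))        # 1076
```

At step 1,080 the run is five steps into its 44th candidate, which matches the candidate count of 44 and the single table row that shows 5 steps.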
Search Candidates
| # | Name | Activation | Gen | Parent | Steps | Best Loss | Best Val | Avg Loss | Fitness | Avg tok/s | Alerts |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | sq+0.33·silu | 6 | sq+0.33·silu-Alpha.1.2.3.4.5 | 25 | 4.5436 | 4.9668 | 4.6381 | 0.0898 | 2,795 | 25 |
| 2 | sq+0.33·silu-Alpha.1.2.3.4.5.6.7 | sq+0.33·silu | 7 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | 25 | 4.4519 | 5.1103 | 4.5428 | 0.0951 | 2,761 | 25 |
| 3 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 0.78·relu+0.37·id | 5 | 0.78·relu+0.37·id-Beta.1.2.3.4 | 25 | 4.7129 | 5.3410 | 4.8252 | 0.0866 | 5,903 | 25 |
| 4 | 0.9·relu+0.1·id-Beta.1.2.3.4 | 0.9·relu+0.1·id | 4 | relu-Beta.1.2.3 | 25 | 4.9079 | 5.3437 | 5.0626 | 0.0853 | 5,690 | 25 |
| 5 | sq+0.33·silu-Alpha.1.2.3.4.5 | sq+0.33·silu | 5 | sq+0.33·silu-Alpha.1.2.3.4 | 25 | 4.8448 | 5.3744 | 5.0300 | 0.0855 | 2,811 | 25 |
| 6 | 0.9·relu+0.1·id-Beta.1.2 | 0.9·relu+0.1·id | 2 | relu-Beta.1 | 25 | 5.2347 | 5.5969 | 5.3328 | 0.0771 | 5,906 | 25 |
| 7 | relu-Beta.1 | relu | 1 | relu-Beta | 25 | 5.6304 | 5.9544 | 5.8324 | 0.0686 | 7,756 | 25 |
| 8 | relu-Beta.1.2.3 | relu | 3 | relu-Beta.1.2 | 25 | 5.6642 | 6.0176 | 9.7860 | 0.0080 | 7,600 | 25 |
| 9 | gelu-Theta | gelu | 0 | - | 25 | 6.2919 | 6.3652 | 6.4358 | 0.0568 | 7,852 | 25 |
| 10 | id-Delta | id | 0 | - | 25 | 6.8981 | 7.0060 | 6.9832 | 0.0461 | 8,304 | 25 |
| 11 | sq+0.33·silu+0.06·silu-Alpha.1.2.3.4.5.6.7 | sq+0.33·silu+0.06·silu | 7 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | 25 | 4.3488 | - | 4.4929 | 0.0960 | 1,816 | 25 |
| 12 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6.7 | (silu+0.15·id)×silu | 7 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | 25 | 4.4204 | - | 4.5086 | 0.0961 | 1,881 | 25 |
| 13 | sq+0.31·silu-Alpha.1.2.3.4.5.6 | sq+0.31·silu | 6 | sq+0.33·silu-Alpha.1.2.3.4.5 | 25 | 4.4653 | - | 4.5950 | 0.0912 | 2,865 | 25 |
| 14 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 0.78·relu+0.37·id | 6 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 25 | 4.4662 | - | 4.6432 | 0.0953 | 5,777 | 25 |
| 15 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | (silu+0.15·id)×silu | 6 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5 | 25 | 4.4736 | - | 4.5575 | 0.0941 | 1,838 | 25 |
| 16 | 0.78·relu+0.37·id+0.1·gelu-Beta.1.2.3.4.5.6.7 | 0.78·relu+0.37·id+0.1·gelu | 7 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 5 | 4.5281 | - | 4.6215 | 0.0900 | 4,731 | 5 |
| 17 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5.6 | (silu+0.15·id)×silu | 6 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5 | 25 | 4.5351 | - | 4.6728 | 0.0896 | 1,908 | 25 |
| 18 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6.7 | 0.78·relu+0.37·id | 7 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 25 | 4.5455 | - | 4.6298 | 0.0922 | 5,709 | 25 |
| 19 | sq+0.33·silu-Alpha.1.2.3.4.5 | sq+0.33·silu | 5 | sq+0.33·silu-Alpha.1.2.3.4 | 25 | 4.6168 | - | 4.7601 | 0.0901 | 2,797 | 25 |
| 20 | 0.78·relu+0.37·id-Beta.1.2.3.4.5.6 | 0.78·relu+0.37·id | 6 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 25 | 4.6174 | - | 4.7594 | 0.0887 | 5,922 | 25 |
| 21 | sq+0.33·silu-Alpha.1.2.3.4.5.6 | sq+0.33·silu | 6 | sq+0.33·silu-Alpha.1.2.3.4.5 | 25 | 4.6359 | - | 4.7460 | 0.0897 | 2,782 | 25 |
| 22 | (silu+0.15·id)×silu-Alpha.1.2.3.4.5 | (silu+0.15·id)×silu | 5 | silu+0.15·id-Alpha.1.2.3.4 | 25 | 4.7263 | - | 4.8635 | 0.0880 | 1,863 | 25 |
| 23 | 0.78·relu+0.37·id-Beta.1.2.3.4.5 | 0.78·relu+0.37·id | 5 | 0.78·relu+0.37·id-Beta.1.2.3.4 | 25 | 4.7783 | - | 4.9267 | 0.0854 | 5,885 | 25 |
| 24 | 0.78·relu+0.22·id+0.2·silu-Beta.1.2.3.4 | 0.78·relu+0.22·id+0.2·silu | 4 | 0.78·relu+0.22·id-Beta.1.2.3 | 25 | 4.8632 | - | 4.9442 | 0.0852 | 2,517 | 25 |
| 25 | (silu+0.15·silu)×relu-Alpha.1.2.3.4 | (silu+0.15·silu)×relu | 4 | silu+0.15·silu-Alpha.1.2.3 | 25 | 4.8660 | - | 5.0394 | 0.0834 | 1,949 | 25 |
| 26 | silu+0.33·silu-Alpha.1.2.3.4 | silu+0.33·silu | 4 | silu+0.33·silu-Alpha.1.2.3 | 25 | 4.9071 | - | 5.0086 | 0.0838 | 1,977 | 25 |
| 27 | silu+0.15·id-Alpha.1.2.3.4 | silu+0.15·id | 4 | silu+0.15·silu-Alpha.1.2.3 | 25 | 4.9926 | - | 5.1233 | 0.0811 | 2,949 | 25 |
| 28 | silu+0.15·silu-Alpha.1.2.3 | silu+0.15·silu | 3 | silu+0.15·silu-Alpha.1.2 | 25 | 4.9947 | - | 5.1223 | 0.0790 | 1,972 | 25 |
| 29 | 0.78·relu+0.22·id-Beta.1.2.3 | 0.78·relu+0.22·id | 3 | relu-Beta.1.2 | 25 | 5.0842 | - | 5.1904 | 0.0810 | 5,903 | 25 |
| 30 | 0.78·relu+0.37·id-Beta.1.2.3.4 | 0.78·relu+0.37·id | 4 | 0.78·relu+0.22·id-Beta.1.2.3 | 25 | 5.1379 | - | 5.9251 | 0.0618 | 5,630 | 25 |
| 31 | silu+0.33·silu-Alpha.1.2.3 | silu+0.33·silu | 3 | silu+0.15·silu-Alpha.1.2 | 25 | 5.1780 | - | 5.2633 | 0.0786 | 1,950 | 25 |
| 32 | silu+0.15·sq-Alpha.1.2 | silu+0.15·sq | 2 | silu+0.15·sq-Alpha.1 | 25 | 5.2700 | - | 5.4168 | 0.0728 | 2,893 | 25 |
| 33 | relu-Beta.1.2 | relu | 2 | relu-Beta.1 | 25 | 5.3751 | - | 5.5895 | 0.0734 | 7,787 | 25 |
| 34 | silu+0.15·silu-Alpha.1.2 | silu+0.15·silu | 2 | silu+0.15·sq-Alpha.1 | 25 | 5.5322 | - | 5.6638 | 0.0696 | 1,969 | 25 |
| 35 | sq+0.33·silu-Alpha.1.2.3.4 | sq+0.33·silu | 4 | silu+0.33·silu-Alpha.1.2.3 | 25 | 5.6001 | - | 5.9845 | 0.0591 | 2,811 | 25 |
| 36 | silu-Alpha.1 | silu | 1 | silu-Alpha | 25 | 5.8074 | - | 5.9222 | 0.0646 | 3,069 | 25 |
| 37 | relu-Beta.1 | relu | 1 | relu-Beta | 25 | 5.9360 | - | 6.0817 | 0.0636 | 7,655 | 25 |
| 38 | silu+0.15·sq-Alpha.1 | silu+0.15·sq | 1 | silu-Alpha | 25 | 6.0725 | - | 6.2070 | 0.0601 | 2,899 | 25 |
| 39 | relu-Eta | relu | 0 | - | 25 | 6.4735 | - | 6.6209 | 0.0514 | 7,899 | 25 |
| 40 | silu-Zeta | silu | 0 | - | 25 | 6.6293 | - | 6.7508 | 0.0494 | 3,172 | 25 |
| 41 | sq-Epsilon | sq | 0 | - | 25 | 6.8279 | - | 6.9077 | 0.0467 | 7,372 | 25 |
| 42 | gelu-Gamma | gelu | 0 | - | 25 | 7.0201 | - | 7.1170 | 0.0440 | 8,093 | 25 |
| 43 | relu-Beta | relu | 0 | - | 25 | 7.2245 | - | 7.3043 | 0.0414 | 8,124 | 25 |
| 44 | silu-Alpha | silu | 0 | - | 25 | 7.3787 | - | 7.5042 | 0.0384 | 3,233 | 19 |
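The row order above is consistent with ranking by validation loss first (`rankBy` is `valLoss` in the config) and falling back to training loss for candidates that have no validation reading. A sketch of such a sort key, using a few rows copied from the table; the fallback rule is inferred from the ordering, not documented behavior.

```python
candidates = [
    # (name, best_loss, best_val or None): a few rows copied from the table
    ("sq+0.33·silu+0.06·silu-Alpha.1.2.3.4.5.6.7", 4.3488, None),
    ("sq+0.33·silu-Alpha.1.2.3.4.5.6", 4.5436, 4.9668),
    ("id-Delta", 6.8981, 7.0060),
    ("sq+0.33·silu-Alpha.1.2.3.4.5.6.7", 4.4519, 5.1103),
    ("(silu+0.15·id)×silu-Alpha.1.2.3.4.5.6.7", 4.4204, None),
]

def rank_key(row):
    name, best_loss, best_val = row
    # Candidates with a validation loss come first, ordered by it;
    # the rest follow, ordered by training loss.
    return (best_val is None, best_val if best_val is not None else best_loss)

ranked = sorted(candidates, key=rank_key)
for name, best_loss, best_val in ranked:
    print(name, best_loss, best_val)
```

Note that the lowest training loss (4.3488) belongs to a candidate with no validation reading yet, so it ranks below every validated candidate under this key.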
Activation Distribution
| Activation | Steps | Share |
|---|---|---|
| relu | 150 | 14% |
| sq+0.33·silu | 150 | 14% |
| 0.78·relu+0.37·id | 150 | 14% |
| (silu+0.15·id)×silu | 100 | 9% |
| silu | 75 | 7% |
| gelu | 50 | 5% |
| silu+0.15·sq | 50 | 5% |
| silu+0.15·silu | 50 | 5% |
| 0.9·relu+0.1·id | 50 | 5% |
| silu+0.33·silu | 50 | 5% |
| id | 25 | 2% |
| sq | 25 | 2% |
| 0.78·relu+0.22·id | 25 | 2% |
| silu+0.15·id | 25 | 2% |
| 0.78·relu+0.22·id+0.2·silu | 25 | 2% |
| (silu+0.15·silu)×relu | 25 | 2% |
| sq+0.31·silu | 25 | 2% |
| sq+0.33·silu+0.06·silu | 25 | 2% |
| 0.78·relu+0.37·id+0.1·gelu | 5 | 0% |
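The percentages are each activation's share of the 1,080 total steps, rounded to the nearest whole percent. A short check using the step counts listed above:

```python
steps_by_activation = {
    "relu": 150,
    "sq+0.33·silu": 150,
    "0.78·relu+0.37·id": 150,
    "(silu+0.15·id)×silu": 100,
    "silu": 75,
    "gelu": 50,
    "silu+0.15·sq": 50,
    "silu+0.15·silu": 50,
    "0.9·relu+0.1·id": 50,
    "silu+0.33·silu": 50,
    "id": 25,
    "sq": 25,
    "0.78·relu+0.22·id": 25,
    "silu+0.15·id": 25,
    "0.78·relu+0.22·id+0.2·silu": 25,
    "(silu+0.15·silu)×relu": 25,
    "sq+0.31·silu": 25,
    "sq+0.33·silu+0.06·silu": 25,
    "0.78·relu+0.37·id+0.1·gelu": 5,
}

total_steps = sum(steps_by_activation.values())
shares = {name: round(100 * n / total_steps) for name, n in steps_by_activation.items()}

print(total_steps)                              # 1080
print(shares["relu"])                           # 14
print(shares["0.78·relu+0.37·id+0.1·gelu"])     # 0
```

Because each share is rounded independently, the displayed percentages do not have to sum to exactly 100.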
Oscillation & Heat Capacity
Activation Evolution Radial
Symbio Config
{
  "cusumSensitivity": 4,
  "cusumBaselineWindow": 5,
  "metricsInterval": 10,
  "trackWeightEntropy": true,
  "trackEffectiveRank": true,
  "trackFreeEnergy": true,
  "trackMIProfiles": false,
  "trackPopulationMetrics": true,
  "freeEnergyBeta": 0.01,
  "miNumBins": 30,
  "adaptiveBatch": false,
  "batchMin": 8,
  "batchMax": 64,
  "batchStep": 4,
  "calmStepsBeforeRestore": 200,
  "populationAdaptation": true,
  "populationScaleMin": 0.5,
  "populationScaleMax": 2,
  "populationScaleStep": 0.125,
  "populationAdaptationCooldown": 10,
  "mutationRateMin": 0.2,
  "mutationRateMax": 0.95,
  "fitnessAlpha": 1,
  "complexityMode": "entropy",
  "diversityBonus": 0.1,
  "diversityDecay": "cosine",
  "searchMode": "composed-activation-search",
  "activationPool": [
    "gelu",
    "relu",
    "silu",
    "swiglu",
    "universal",
    "kan_spline"
  ],
  "searchStrategy": "evolutionary",
  "populationSize": 8,
  "generations": 250,
  "selectionStrategy": "topk",
  "tournamentK": 3,
  "mutationRate": 0.7,
  "stepsPerCandidate": 25,
  "rankBy": "valLoss",
  "perfWeight": 0,
  "stabilityWeight": 0,
  "preserveWeightsAcrossCandidates": true,
  "carryOptimizerStateAcrossCandidates": true,
  "constantFfnDimAcrossCandidates": true,
  "fuseWeightsEachStep": true,
  "fusionShadowEma": 0.02,
  "fusionBaseStrength": 0.0015,
  "fusionMaxStrength": 0.02,
  "kuramotoCoupling": 0.7,
  "kuramotoDt": 0.1,
  "kuramotoDamping": 0.05,
  "writeReport": true,
  "writeCandidates": true,
  "writeSummary": true,
  "basisPool": [
    "silu",
    "relu",
    "gelu",
    "identity",
    "square"
  ],
  "maxGraphDepth": 4,
  "maxGraphNodes": 10
}
Checkpoints (0)
No checkpoints saved
Sample Generations (3)
Sample 1 · checkpoint: - · generated 5h ago
Prompt: The
Output:
The of itsi
eed ationss ite. iner ondit ers it a the it ptmin the the monfsein the ded of ere e Fs ein, rekm

Sample 2 · checkpoint: - · generated 5h ago
Prompt: Once upon a time
Output:
Once upon a timee, ing the inm and ed seting the wed a eming of is oning a
s d dmron e linited M ses is ding of res it CEsrl

Sample 3 · checkpoint: - · generated 5h ago
Prompt: He walked into
Output:
He walked into the ed , teseited onfof --Cinierrof es , ed ving , s
rsy bal aal aMinof eanws sit and oisic edone

{
  "vocabSize": 2000,
  "blockSize": 256,
  "nLayer": 6,
  "nEmbd": 288,
  "nHead": 6,
  "dropout": 0,
  "ffnActivation": "swiglu",
  "ffnDim": 768
}

{
  "iters": 50000,
  "batchSize": 20,
  "lr": 0.0003,
  "lrMin": 0,
  "warmupIters": 500,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 1e-8,
  "weightDecay": 0.1,
  "gradClip": 5,
  "evalInterval": 100,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 100,
  "spikeThreshold": 10,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": true,
  "symbioConfig": {
    "cusumSensitivity": 4,
    "cusumBaselineWindow": 5,
    "metricsInterval": 10,
    "trackWeightEntropy": true,
    "trackEffectiveRank": true,
    "trackFreeEnergy": true,
    "trackMIProfiles": false,
    "trackPopulationMetrics": true,
    "freeEnergyBeta": 0.01,
    "miNumBins": 30,
    "adaptiveBatch": false,
    "batchMin": 8,
    "batchMax": 64,
    "batchStep": 4,
    "calmStepsBeforeRestore": 200,
    "populationAdaptation": true,
    "populationScaleMin": 0.5,
    "populationScaleMax": 2,
    "populationScaleStep": 0.125,
    "populationAdaptationCooldown": 10,
    "mutationRateMin": 0.2,
    "mutationRateMax": 0.95,
    "fitnessAlpha": 1,
    "complexityMode": "entropy",
    "diversityBonus": 0.1,
    "diversityDecay": "cosine",
    "searchMode": "composed-activation-search",
    "activationPool": [
      "gelu",
      "relu",
      "silu",
      "swiglu",
      "universal",
      "kan_spline"
    ],
    "searchStrategy": "evolutionary",
    "populationSize": 8,
    "generations": 250,
    "selectionStrategy": "topk",
    "tournamentK": 3,
    "mutationRate": 0.7,
    "stepsPerCandidate": 25,
    "rankBy": "valLoss",
    "perfWeight": 0,
    "stabilityWeight": 0,
    "preserveWeightsAcrossCandidates": true,
    "carryOptimizerStateAcrossCandidates": true,
    "constantFfnDimAcrossCandidates": true,
    "fuseWeightsEachStep": true,
    "fusionShadowEma": 0.02,
    "fusionBaseStrength": 0.0015,
    "fusionMaxStrength": 0.02,
    "kuramotoCoupling": 0.7,
    "kuramotoDt": 0.1,
    "kuramotoDamping": 0.05,
    "writeReport": true,
    "writeCandidates": true,
    "writeSummary": true,
    "basisPool": [
      "silu",
      "relu",
      "gelu",
      "identity",
      "square"
    ],
    "maxGraphDepth": 4,
    "maxGraphNodes": 10
  }
}
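The config gives `lr`, `lrMin`, `warmupIters`, and `iters` but does not name the schedule shape. The sketch below assumes a linear warmup followed by cosine decay, a common pairing that may not be what the helios backend actually does; `lr_at` is a hypothetical helper.

```python
import math

def lr_at(step, max_lr=3e-4, min_lr=0.0, warmup=500, total=50_000):
    """Linear warmup followed by cosine decay (assumed schedule shape)."""
    if step < warmup:
        return max_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))

print(f"{lr_at(1080):.2e}")   # 3.00e-04
```

Under this assumption the rate at step 1,080 is still within rounding of the 3.00e-4 peak, which agrees with the Learning Rate card. It does not rule out other schedules that stay near the peak this early in the run.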