novels_all_20260225193338_o7ed · active · novels · 7.21M params · 24m 24s elapsed · ~23h 27m remaining
6L / 288D / 6H · helios · bpe · adamw · Created Feb 25, 2026 7:33 PM
Step 1,249 / 50,000 (2.5%)
| Metric | Value | Detail |
|---|---|---|
| Loss | 4.7137 | |
| Best Loss | 4.5498 | -38.6% from start |
| Val Loss | 5.2664 | best: 5.0588 |
| Learning Rate | 3.00e-4 | |
| Throughput | 2,961 | tok/s (avg) |
| Speed | 1,733 | ms/iter (avg) |
| Grad Norm | 0.729 | avg: 3571.454 |
| Tokens | 5.37M | processed |
| Forward | 140ms | 8% of step |
| Backward | 1556ms | 90% of step |
| GPU Sync | 9ms | 1% of step |
| GPU Ops | 630 | per step |
| MFU | 0.4% | model FLOPS util |
| Bwd/Fwd | 11.1x | ratio |
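The throughput and step-time cards can be cross-checked with a little arithmetic. The sketch below assumes that `Batch size` counts sequences per step and that every sequence fills the full `Context` of 256 tokens; neither is stated on the page, and the variable names are illustrative.

```python
# Figures copied from the dashboard cards and the training config.
batch_size = 20            # Batch size (assumed: sequences per step)
context_len = 256          # Context (assumed: tokens per sequence)
ms_per_iter = 1733         # Speed, ms/iter (avg)
forward_ms, backward_ms, sync_ms = 140, 1556, 9
search_steps = 1049        # Total Steps in the Evolutionary Search panel

tokens_per_step = batch_size * context_len
tok_per_s = tokens_per_step / (ms_per_iter / 1000)
bwd_fwd_ratio = backward_ms / forward_ms
phase_share = {
    "forward": forward_ms / ms_per_iter,
    "backward": backward_ms / ms_per_iter,
    "gpu_sync": sync_ms / ms_per_iter,
}
tokens_processed = search_steps * tokens_per_step

print(f"tokens/step      : {tokens_per_step}")
print(f"tok/s (derived)  : {tok_per_s:.0f}")
print(f"bwd/fwd ratio    : {bwd_fwd_ratio:.1f}x")
for name, share in phase_share.items():
    print(f"{name:<9} share : {share:.1%}")
print(f"tokens processed : {tokens_processed / 1e6:.2f}M")
```

Under these assumptions a step carries 5,120 tokens, which gives roughly 2,954 tok/s at 1,733 ms/iter. The card shows 2,961, presumably because it averages per-step rates rather than dividing by the average step time. The 5.37M token count matches 1,049 steps × 5,120 tokens, the step total in the Evolutionary Search panel, rather than the 1,249 shown in the header.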
Loss Curve
Symbio semantics: this chart stitches many fresh candidate evaluations onto one global step axis. Loss resets near switches are expected because candidates are re-initialized. Compare local candidate shapes and the global frontier, not a single continuous model trajectory.
Search Trajectory + Frontier
Candidate-local train/val loss segments on a shared step axis, with switch events and global frontier overlays.
Search-aware view
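The relationship between candidate-local segments and the global frontier can be sketched as a running minimum over stitched segments. This is an illustrative reading of the description above, not the dashboard's own code; `global_frontier` and the toy loss values are hypothetical.

```python
def global_frontier(segments):
    """Stitch candidate-local loss segments onto one global step axis.

    segments: list of (start_step, [loss, ...]) in the order the candidates ran.
    Returns a list of (global_step, loss, best_so_far) tuples.
    """
    best = float("inf")
    stitched = []
    for start_step, losses in segments:
        for offset, loss in enumerate(losses):
            best = min(best, loss)          # the frontier never moves up
            stitched.append((start_step + offset, loss, best))
    return stitched


# Toy values for illustration only (not taken from the run).
segments = [
    (1, [7.60, 7.50, 7.44]),   # candidate A
    (4, [7.70, 7.40, 7.35]),   # candidate B starts higher after the switch
]
for step, loss, best in global_frontier(segments):
    print(step, loss, best)
```

At the switch (step 4 in the toy data) the raw loss jumps up while the frontier stays flat, which is the pattern the chart note asks readers to expect.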
Architecture
Layers: 6
Embedding: 288
Heads: 6
Vocab: 2,000
Context: 256
Dropout: 0
Parameters: 7.21M
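One way to see where 7.21M parameters could come from is to count the weights of a standard decoder stack with these dimensions. The sketch below is an estimate under loud assumptions that the page does not confirm: learned positional embeddings, bias-free linear layers, a SwiGLU feed-forward block with three matrices of width `ffnDim` = 768 (taken from the model config JSON at the end of the page), LayerNorm with weight and bias, and an untied output head. `estimate_params` is a hypothetical helper.

```python
def estimate_params(vocab=2000, block=256, n_layer=6, n_embd=288, ffn_dim=768):
    tok_emb = vocab * n_embd              # token embedding table
    pos_emb = block * n_embd              # assumed learned positional embeddings
    attn = 4 * n_embd * n_embd            # Q, K, V, output projections (assumed no biases)
    ffn = 3 * n_embd * ffn_dim            # assumed SwiGLU: gate, up, down matrices
    norms = 2 * (2 * n_embd)              # assumed two LayerNorms per block, weight + bias
    per_layer = attn + ffn + norms
    final_norm = 2 * n_embd               # assumed final LayerNorm
    lm_head = vocab * n_embd              # assumed untied output projection
    return tok_emb + pos_emb + n_layer * per_layer + final_norm + lm_head


total = estimate_params()
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")
```

This accounting lands on 7,205,184, which rounds to the reported 7.21M. Other plausible choices (for example weight-only norms) shift the total by a few thousand parameters, so the match is suggestive rather than conclusive.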
Training Config
Total iters: 50,000
Batch size: 20
Max LR: 0.0003
Optimizer: adamw
Backend: helios
Tokenizer: bpe
Seed: 42
Weight decay: 0.1
Grad clip: 5
Eval interval: 100
[Charts: GPU & VRAM, Learning Rate, Grad Norm, Step Time Breakdown, Clip Telemetry]
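Symbiogenesis · SWIGLU
| Metric | Value | Detail |
|---|---|---|
| Wt Entropy | 1.94 | bits |
| Eff. Rank | 20.0 | |
| Free Energy | 4.6779 | |
| Pop Entropy | 3.908 | nats |
| Complexity | 0.0863 | |
| Fitness | 0.0904 | |
| CUSUM Alerts | 1040 | of 1049 steps |
| Batch Size | - | adaptive |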
[Charts: CUSUM Change-Point Monitor, Weight Entropy, Effective Rank, Free Energy, Fitness Score, Population Entropy]
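For readers unfamiliar with CUSUM, the sketch below shows a generic two-sided CUSUM change detector. It is not the dashboard's implementation: mapping `cusumBaselineWindow` (5) to the baseline length and `cusumSensitivity` (4) to the alert threshold is an assumption, and `slack` and the function name are hypothetical.

```python
from statistics import mean, pstdev


def cusum_alerts(values, baseline_window=5, threshold=4.0, slack=0.5):
    """Return the indices where a two-sided CUSUM statistic crosses the threshold.

    The first `baseline_window` samples define the reference mean and spread;
    later samples are standardized against that baseline.
    """
    baseline = values[:baseline_window]
    mu = mean(baseline)
    sigma = pstdev(baseline) or 1e-8        # guard against a perfectly flat baseline
    pos = neg = 0.0
    alerts = []
    for i, v in enumerate(values[baseline_window:], start=baseline_window):
        z = (v - mu) / sigma
        pos = max(0.0, pos + z - slack)     # accumulates upward drift
        neg = max(0.0, neg - z - slack)     # accumulates downward drift
        if pos > threshold or neg > threshold:
            alerts.append(i)
            pos = neg = 0.0                 # restart accumulation after an alert
    return alerts


# Toy series: a quiet stretch followed by a level shift.
series = [1.0, 1.1, 0.9, 1.0, 1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
print(cusum_alerts(series))
```

A detector of this shape, fed a metric that drifts steadily away from a short early baseline, fires on most steps. That may be related to the 1040-of-1049 alert count, but the page does not show how the alerts are actually computed.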
Phase Change / Gelation
Current: Transitioning
Stability: 0%
Phase Changes: 21
Regime Shifts: 1
Training dynamics are shifting. The model may be entering a new loss basin, or the learning rate may be hitting a critical threshold. This often happens before a breakthrough or a plateau.
Phase Timeline: Step 1 to Step 1186
Loss Oscillation (Harmonic Analysis)
Evolutionary Search
Generations: 10
Candidates: 49
Activations: 20
Best Loss: 4.5498
Total Steps: 1,049
| # | Candidate | Activation | Gen | Loss | Fitness | Steps | Mutation |
|---|---|---|---|---|---|---|---|
| 1 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7.8.9 | relu×silu+0.13·gelu | 9 | 4.5498 | 0.0901 | 24 | clone |
| 2 | (id+0.58·silu)×gelu-Alpha.1.2.3.4.5.6.7.8.9 | (id+0.58·silu)×gelu | 9 | 4.5579 | 0.0904 | 22 | perturb_scale |
| 3 | (id+0.39·silu)×gelu-Alpha.1.2.3.4.5.6.7.8.9 | (id+0.39·silu)×gelu | 9 | 4.6295 | 0.0897 | 21 | clone |
| 4 | (relu×silu+0.13·gelu)×relu-Beta.1.2.3.4.5.6.7.8 | (relu×silu+0.13·gelu)×relu | 8 | 4.6353 | 0.0881 | 25 | inject_gate |
| 5 | relu×relu+0.13·gelu-Beta.1.2.3.4.5.6.7 | relu×relu+0.13·gelu | 7 | 4.6544 | 0.0916 | 25 | clone |
| 6 | (id+0.39·silu)×gelu-Alpha.1.2.3.4.5.6.7.8 | (id+0.39·silu)×gelu | 8 | 4.6851 | 0.0895 | 23 | perturb_scale |
| 7 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7.8 | relu×silu+0.13·gelu | 8 | 4.7164 | 0.0893 | 24 | clone |
| 8 | (relu×silu+0.13·gelu)×silu-Beta.1.2.3.4.5.6.7.8.9 | (relu×silu+0.13·gelu)×silu | 9 | 4.7166 | — | 13 | inject_gate |
| 9 | silu+0.44·gelu+0.06·id-Alpha.1.2.3.4.5.6.7.8.9 | silu+0.44·gelu+0.06·id | 9 | 4.7522 | 0.0877 | 22 | clone |
| 10 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7 | relu×silu+0.13·gelu | 7 | 4.7738 | 0.0845 | 24 | swap_basis |
| 11 | (id+0.25·silu)×gelu-Alpha.1.2.3.4.5.6 | (id+0.25·silu)×gelu | 6 | 4.7960 | 0.0855 | 23 | inject_gate |
| 12 | (id+0.25·silu)×gelu-Alpha.1.2.3.4.5.6.7 | (id+0.25·silu)×gelu | 7 | 4.8267 | 0.0856 | 23 | clone |
| 13 | silu+0.44·silu+0.06·id-Alpha.1.2.3.4.5.6.7.8.9 | silu+0.44·silu+0.06·id | 9 | 4.8376 | — | 10 | clone |
| 14 | silu+0.44·silu+0.06·id-Alpha.1.2.3.4.5.6.7.8 | silu+0.44·silu+0.06·id | 8 | 4.8568 | 0.0852 | 18 | clone |
| 15 | silu+0.44·gelu+0.06·id-Alpha.1.2.3.4.5.6.7.8.9 | silu+0.44·gelu+0.06·id | 9 | 4.8713 | 0.0824 | 23 | prune |
| 16 | silu+0.44·gelu+0.06·id-Alpha.1.2.3.4.5.6.7.8 | silu+0.44·gelu+0.06·id | 8 | 4.8892 | 0.0828 | 23 | swap_basis |
| 17 | 0.63·relu×relu+0.37·id-Beta.1.2.3.4.5.6 | 0.63·relu×relu+0.37·id | 6 | 4.9108 | 0.0844 | 25 | inject_residual |
| 18 | silu+0.44·silu-Alpha.1.2.3.4.5.6.7 | silu+0.44·silu | 7 | 4.9434 | 0.0751 | 13 | clone |
| 19 | relu×relu+0.13·gelu-Beta.1.2.3.4.5.6 | relu×relu+0.13·gelu | 6 | 4.9497 | 0.0832 | 25 | add_term |
| 20 | silu+0.44·silu+0.06·id-Alpha.1.2.3.4.5.6.7 | silu+0.44·silu+0.06·id | 7 | 4.9713 | 0.0792 | 15 | add_term |
Showing top 20 of 49 candidates
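The activation names in the table read as small expressions over the basis functions. As a sketch of how two of them could be evaluated, the code below assumes that `×` is an elementwise product that binds tighter than `+`, that `·` scales a term by a constant, that `id` is the identity, and that `gelu` is the exact erf form rather than the tanh approximation. None of this is confirmed by the page, and the function names are illustrative.

```python
import math


def relu(x):
    return max(0.0, x)


def silu(x):
    return x / (1.0 + math.exp(-x))


def gelu(x):
    # Exact (erf-based) GELU; the run may use the tanh approximation instead.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu_silu_plus_gelu(x, w=0.13):
    """relu×silu+0.13·gelu, read as (relu(x) * silu(x)) + w * gelu(x)."""
    return relu(x) * silu(x) + w * gelu(x)


def gated_id_silu(x, w=0.58):
    """(id+0.58·silu)×gelu, read as (x + w * silu(x)) * gelu(x)."""
    return (x + w * silu(x)) * gelu(x)


for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"x={x:+.1f}  relu×silu+0.13·gelu={relu_silu_plus_gelu(x):+.4f}  "
          f"(id+0.58·silu)×gelu={gated_id_silu(x):+.4f}")
```

Under this reading, `relu×silu+0.13·gelu` keeps a small GELU-shaped response for negative inputs, where the `relu×silu` product is zero.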
Generation Summary
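| Gen | Candidates | Best Loss |
|---|---|---|
| G0 | 8 | 6.7037 |
| G1 | 4 | 6.2534 |
| G2 | 4 | 5.8124 |
| G3 | 3 | 5.5410 |
| G4 | 4 | 5.2280 |
| G5 | 4 | 5.0266 |
| G6 | 5 | 4.7960 |
| G7 | 5 | 4.6544 |
| G8 | 5 | 4.6353 |
| G9 | 7 | 4.5498 |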
Fitness Progression
Architecture Diversity
Convergence vs Diversity (Tug-of-War)
Current Mode: Exploration Dominant
Diversity Pressure: 81%
Convergence Momentum: 0%
Convergence Progress: 100%
Phase Portrait: Diversity Pressure vs Convergence Momentum
Low diversity / high momentum = lock-in convergence
High diversity / high momentum = productive exploration
Low diversity / low momentum = stalled collapse
High diversity / low momentum = diversity stalling convergence
Tug-of-War Trace (Time Domain)
Positive tension means recent frontier improvement is outpacing diversity pressure (search is converging). Negative tension means exploration pressure is dominating recent convergence momentum (search is broadening or getting “stumped”).
Strongest Convergence: step 41, tension 0.233
Strongest Diversity Push: step 826, tension -0.980
Best Frontier: 4.5498, progress 100%
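The four phase-portrait quadrants and the sign convention for tension can be sketched as a small classifier. The 0.5 cutoff between "low" and "high" and the definition of tension as a plain difference are assumptions made for illustration; the page states only the quadrant labels and which sign means what.

```python
def classify_quadrant(diversity_pressure, convergence_momentum, cutoff=0.5):
    """Map the two signals (both in [0, 1]) to the phase-portrait labels.

    `cutoff` is a hypothetical boundary between "low" and "high".
    """
    high_diversity = diversity_pressure >= cutoff
    high_momentum = convergence_momentum >= cutoff
    if high_diversity and high_momentum:
        return "productive exploration"
    if high_diversity:
        return "diversity stalling convergence"
    if high_momentum:
        return "lock-in convergence"
    return "stalled collapse"


def tension(diversity_pressure, convergence_momentum):
    # Assumed form: positive when momentum outpaces diversity pressure.
    return convergence_momentum - diversity_pressure


# Current values from the panel: 81% diversity pressure, 0% convergence momentum.
print(classify_quadrant(0.81, 0.0))
print(f"{tension(0.81, 0.0):+.2f}")
```

With the current readings this sketch lands in the high-diversity, low-momentum quadrant with negative tension, which is consistent with the "Exploration Dominant" mode shown above.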
Evolutionary Lineage Tree
Activation Flow (Sankey)
Activation Switch Log
| Step | From | → | To | Gen | Prev Steps | Best Loss | Final Loss | Fitness | Tree |
|---|---|---|---|---|---|---|---|---|---|
| 1 | - | → | silu | 0 | - | - | - | - | |
| 26 | silu | → | relu | 0 | 25 | 7.4358 | 7.4358 | 0.0375 | |
| 51 | relu | → | gelu | 0 | 25 | 7.3496 | 7.3533 | 0.0395 | |
| 76 | gelu | → | id | 0 | 25 | 7.2233 | 7.2395 | 0.0410 | |
| 101 | id | → | sq | 0 | 25 | 7.0783 | 7.0879 | 0.0436 | |
| 126 | sq | → | silu | 0 | 25 | 7.0034 | 7.0116 | 0.0444 | |
| 151 | silu | → | relu | 0 | 25 | 6.8813 | 6.9274 | 0.0461 | |
| 176 | relu | → | gelu | 0 | 25 | 6.7878 | 6.8478 | 0.0470 | |
| 201 | gelu | → | silu | 1 | 25 | 6.7037 | 6.7037 | 0.0497 | |
| 226 | silu | → | relu | 1 | 25 | 6.5756 | 6.6543 | 0.0511 | |
| 251 | relu | → | silu | 1 | 25 | 6.5045 | 6.5045 | 0.0531 | |
| 276 | silu | → | relu | 1 | 24 | 6.3683 | 6.3683 | 0.0540 | |
| 301 | relu | → | silu | 2 | 25 | 6.2534 | 6.2713 | 0.0571 | |
| 326 | silu | → | relu | 2 | 25 | 6.1463 | 6.1563 | 0.0586 | |
| 351 | relu | → | silu | 2 | 25 | 6.0095 | 6.0555 | 0.0611 | |
| 376 | silu | → | relu | 2 | 23 | 5.9652 | 5.9906 | 0.0607 | |
| 401 | relu | → | silu+0.21·silu | 3 | 25 | 5.8124 | 5.8536 | 0.0648 | |
| 457 | silu+0.21·silu | → | silu | 3 | 7 | 5.8448 | 5.8448 | 0.0646 | |
| 477 | silu | → | relu | 3 | 16 | 5.6025 | 5.6391 | 0.0661 | |
| 501 | relu | → | silu+0.25·silu | 4 | 24 | 5.5410 | 5.6140 | 0.0693 | |
| 526 | silu+0.25·silu | → | relu×relu | 4 | 15 | 5.5197 | 5.5197 | 0.0702 | |
| 551 | relu×relu | → | silu+0.21·silu | 4 | 25 | 5.3595 | 5.4198 | 0.0751 | |
| 576 | silu+0.21·silu | → | relu×relu | 4 | 9 | 5.4650 | 5.4843 | - | |
| 603 | relu×relu | → | silu+0.44·silu | 5 | 25 | 5.2280 | 5.2513 | 0.0773 | |
| 626 | silu+0.44·silu | → | relu×relu | 5 | 9 | 5.3954 | 5.4002 | - | |
| 651 | relu×relu | → | id+0.25·silu | 5 | 25 | 5.1448 | 5.2609 | 0.0790 | |
| 676 | id+0.25·silu | → | relu×relu | 5 | 23 | 5.3623 | 5.4321 | 0.0723 | |
| 702 | relu×relu | → | silu+0.44·silu | 6 | 25 | 5.0266 | 5.0718 | 0.0810 | |
| 726 | silu+0.44·silu | → | relu×relu+0.13·gelu | 6 | 16 | 5.1936 | 5.2766 | 0.0777 | |
| 751 | relu×relu+0.13·gelu | → | (id+0.25·silu)×gelu | 6 | 25 | 4.9497 | 4.9843 | 0.0832 | |
| 777 | (id+0.25·silu)×gelu | → | silu+0.44·silu | 6 | 23 | 4.7960 | 4.9974 | 0.0855 | |
| 801 | silu+0.44·silu | → | 0.63·relu×relu+0.37·id | 6 | 12 | 5.2511 | 5.3644 | 0.0747 | |
| 826 | 0.63·relu×relu+0.37·id | → | silu+0.44·silu+0.06·id | 7 | 25 | 4.9108 | 5.0514 | 0.0844 | |
| 851 | silu+0.44·silu+0.06·id | → | relu×silu+0.13·gelu | 7 | 15 | 4.9713 | 5.0668 | 0.0792 | |
| 876 | relu×silu+0.13·gelu | → | (id+0.25·silu)×gelu | 7 | 24 | 4.7738 | 4.9264 | 0.0845 | |
| 903 | (id+0.25·silu)×gelu | → | silu+0.44·silu | 7 | 23 | 4.8267 | 4.8664 | 0.0856 | |
| 926 | silu+0.44·silu | → | relu×relu+0.13·gelu | 7 | 13 | 4.9434 | 4.9434 | 0.0751 | |
| 951 | relu×relu+0.13·gelu | → | silu+0.44·gelu+0.06·id | 8 | 25 | 4.6544 | 4.6607 | 0.0916 | |
| 977 | silu+0.44·gelu+0.06·id | → | relu×silu+0.13·gelu | 8 | 23 | 4.8892 | 4.9401 | 0.0828 | |
| 1001 | relu×silu+0.13·gelu | → | (id+0.39·silu)×gelu | 8 | 24 | 4.7164 | 4.7280 | 0.0893 | |
| 1028 | (id+0.39·silu)×gelu | → | silu+0.44·silu+0.06·id | 8 | 23 | 4.6851 | 4.8279 | 0.0895 | |
| 1051 | silu+0.44·silu+0.06·id | → | (relu×silu+0.13·gelu)×relu | 8 | 18 | 4.8568 | 4.8568 | 0.0852 | |
| 1076 | (relu×silu+0.13·gelu)×relu | → | silu+0.44·gelu+0.06·id | 9 | 25 | 4.6353 | 4.6353 | 0.0881 | |
| 1101 | silu+0.44·gelu+0.06·id | → | relu×silu+0.13·gelu | 9 | 23 | 4.8713 | 4.9860 | 0.0824 | |
| 1127 | relu×silu+0.13·gelu | → | (id+0.39·silu)×gelu | 9 | 24 | 4.5498 | 4.6875 | 0.0901 | |
| 1151 | (id+0.39·silu)×gelu | → | silu+0.44·silu+0.06·id | 9 | 21 | 4.6295 | 4.6901 | 0.0897 | |
| 1177 | silu+0.44·silu+0.06·id | → | silu+0.44·gelu+0.06·id | 9 | 10 | 4.8376 | 4.8915 | - | |
| 1201 | silu+0.44·gelu+0.06·id | → | (relu×silu+0.13·gelu)×silu | 9 | 22 | 4.7522 | 4.7950 | 0.0877 | |
| 1227 | (relu×silu+0.13·gelu)×silu | → | (id+0.58·silu)×gelu | 9 | 13 | 4.7166 | 4.7562 | - | |
Search Candidates
| # | Name | Activation | Gen | Parent | Steps | Best Loss | Best Val | Avg Loss | Fitness | Avg tok/s | Alerts |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | silu+0.44·gelu+0.06·id-Alpha.1.2.3.4.5.6.7.8.9 | silu+0.44·gelu+0.06·id | 9 | silu+0.44·gelu+0.06·id-Alpha.1.2.3.4.5.6.7.8 | 23 | 4.8713 | 5.0588 | 5.0000 | 0.0824 | 2,725 | 23 |
| 2 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7.8 | relu×silu+0.13·gelu | 8 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7 | 24 | 4.7164 | 5.1122 | 4.8065 | 0.0893 | 2,925 | 24 |
| 3 | silu+0.44·gelu+0.06·id-Alpha.1.2.3.4.5.6.7.8.9 | silu+0.44·gelu+0.06·id | 9 | silu+0.44·gelu+0.06·id-Alpha.1.2.3.4.5.6.7.8 | 22 | 4.7522 | 5.2664 | 4.8548 | 0.0877 | 2,694 | 22 |
| 4 | silu+0.44·silu-Alpha.1.2.3.4.5.6 | silu+0.44·silu | 6 | silu+0.44·silu-Alpha.1.2.3.4.5 | 12 | 5.2511 | 5.2917 | 5.3380 | 0.0747 | 2,054 | 12 |
| 5 | (id+0.25·silu)×gelu-Alpha.1.2.3.4.5.6.7 | (id+0.25·silu)×gelu | 7 | (id+0.25·silu)×gelu-Alpha.1.2.3.4.5.6 | 23 | 4.8267 | 5.3909 | 4.8757 | 0.0856 | 2,941 | 23 |
| 6 | relu×relu-Beta.1.2.3.4.5 | relu×relu | 5 | relu×relu-Beta.1.2.3.4 | 25 | 5.0266 | 5.5146 | 5.1470 | 0.0810 | 7,115 | 25 |
| 7 | relu×relu-Beta.1.2.3.4 | relu×relu | 4 | relu×relu-Beta.1.2.3 | 25 | 5.2280 | 5.6752 | 5.3466 | 0.0773 | 7,147 | 25 |
| 8 | relu-Beta.1.2.3 | relu | 3 | relu-Beta.1.2 | 24 | 5.5410 | 5.7932 | 5.6316 | 0.0693 | 7,755 | 24 |
| 9 | relu-Beta.1.2 | relu | 2 | relu-Beta.1 | 25 | 5.8124 | 6.0577 | 5.9511 | 0.0648 | 7,684 | 25 |
| 10 | relu-Beta.1 | relu | 1 | relu-Beta | 25 | 6.2534 | 6.5032 | 6.4121 | 0.0571 | 7,663 | 25 |
| 11 | gelu-Theta | gelu | 0 | - | 25 | 6.7037 | 6.7751 | 6.7925 | 0.0497 | 7,819 | 25 |
| 12 | id-Delta | id | 0 | - | 25 | 7.0783 | 7.1456 | 7.1556 | 0.0436 | 8,358 | 25 |
| 13 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7.8.9 | relu×silu+0.13·gelu | 9 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7.8 | 24 | 4.5498 | - | 4.7322 | 0.0901 | 2,898 | 24 |
| 14 | (id+0.58·silu)×gelu-Alpha.1.2.3.4.5.6.7.8.9 | (id+0.58·silu)×gelu | 9 | (id+0.39·silu)×gelu-Alpha.1.2.3.4.5.6.7.8 | 22 | 4.5579 | - | 4.7360 | 0.0904 | 2,951 | 22 |
| 15 | (id+0.39·silu)×gelu-Alpha.1.2.3.4.5.6.7.8.9 | (id+0.39·silu)×gelu | 9 | (id+0.39·silu)×gelu-Alpha.1.2.3.4.5.6.7.8 | 21 | 4.6295 | - | 4.6922 | 0.0897 | 2,957 | 21 |
| 16 | (relu×silu+0.13·gelu)×relu-Beta.1.2.3.4.5.6.7.8 | (relu×silu+0.13·gelu)×relu | 8 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7 | 25 | 4.6353 | - | 4.8241 | 0.0881 | 2,874 | 25 |
| 17 | relu×relu+0.13·gelu-Beta.1.2.3.4.5.6.7 | relu×relu+0.13·gelu | 7 | relu×relu+0.13·gelu-Beta.1.2.3.4.5.6 | 25 | 4.6544 | - | 4.8139 | 0.0916 | 5,690 | 25 |
| 18 | (id+0.39·silu)×gelu-Alpha.1.2.3.4.5.6.7.8 | (id+0.39·silu)×gelu | 8 | (id+0.25·silu)×gelu-Alpha.1.2.3.4.5.6.7 | 23 | 4.6851 | - | 4.7966 | 0.0895 | 2,960 | 23 |
| 19 | (relu×silu+0.13·gelu)×silu-Beta.1.2.3.4.5.6.7.8.9 | (relu×silu+0.13·gelu)×silu | 9 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7.8 | 13 | 4.7166 | - | 4.8588 | - | 1,909 | 13 |
| 20 | relu×silu+0.13·gelu-Beta.1.2.3.4.5.6.7 | relu×silu+0.13·gelu | 7 | relu×relu+0.13·gelu-Beta.1.2.3.4.5.6 | 24 | 4.7738 | - | 4.9115 | 0.0845 | 2,928 | 24 |
| 21 | (id+0.25·silu)×gelu-Alpha.1.2.3.4.5.6 | (id+0.25·silu)×gelu | 6 | id+0.25·silu-Alpha.1.2.3.4.5 | 23 | 4.7960 | - | 4.9704 | 0.0855 | 2,958 | 23 |
| 22 | silu+0.44·silu+0.06·id-Alpha.1.2.3.4.5.6.7.8.9 | silu+0.44·silu+0.06·id | 9 | silu+0.44·silu+0.06·id-Alpha.1.2.3.4.5.6.7.8 | 10 | 4.8376 | - | 4.9046 | - | 1,915 | 10 |
| 23 | silu+0.44·silu+0.06·id-Alpha.1.2.3.4.5.6.7.8 | silu+0.44·silu+0.06·id | 8 | silu+0.44·silu+0.06·id-Alpha.1.2.3.4.5.6.7 | 18 | 4.8568 | - | 5.0111 | 0.0852 | 1,964 | 18 |
| 24 | silu+0.44·gelu+0.06·id-Alpha.1.2.3.4.5.6.7.8 | silu+0.44·gelu+0.06·id | 8 | silu+0.44·silu+0.06·id-Alpha.1.2.3.4.5.6.7 | 23 | 4.8892 | - | 4.9757 | 0.0828 | 2,698 | 23 |
| 25 | 0.63·relu×relu+0.37·id-Beta.1.2.3.4.5.6 | 0.63·relu×relu+0.37·id | 6 | relu×relu-Beta.1.2.3.4.5 | 25 | 4.9108 | - | 5.0126 | 0.0844 | 5,667 | 25 |
| 26 | silu+0.44·silu-Alpha.1.2.3.4.5.6.7 | silu+0.44·silu | 7 | silu+0.44·silu-Alpha.1.2.3.4.5.6 | 13 | 4.9434 | - | 5.1106 | 0.0751 | 2,068 | 13 |
| 27 | relu×relu+0.13·gelu-Beta.1.2.3.4.5.6 | relu×relu+0.13·gelu | 6 | relu×relu-Beta.1.2.3.4.5 | 25 | 4.9497 | - | 5.0415 | 0.0832 | 5,643 | 25 |
| 28 | silu+0.44·silu+0.06·id-Alpha.1.2.3.4.5.6.7 | silu+0.44·silu+0.06·id | 7 | silu+0.44·silu-Alpha.1.2.3.4.5.6 | 15 | 4.9713 | - | 5.0628 | 0.0792 | 1,925 | 15 |
| 29 | relu×relu-Beta.1.2.3.4.5 | relu×relu | 5 | relu×relu-Beta.1.2.3.4 | 25 | 5.1448 | - | 5.2275 | 0.0790 | 7,129 | 25 |
| 30 | silu+0.44·silu-Alpha.1.2.3.4.5.6 | silu+0.44·silu | 6 | silu+0.44·silu-Alpha.1.2.3.4.5 | 16 | 5.1936 | - | 5.2660 | 0.0777 | 2,047 | 16 |
| 31 | relu×relu-Beta.1.2.3.4 | relu×relu | 4 | relu×relu-Beta.1.2.3 | 25 | 5.3595 | - | 5.4738 | 0.0751 | 7,076 | 25 |
| 32 | id+0.25·silu-Alpha.1.2.3.4.5 | id+0.25·silu | 5 | silu+0.25·silu-Alpha.1.2.3.4 | 23 | 5.3623 | - | 5.4736 | 0.0723 | 3,022 | 23 |
| 33 | silu+0.44·silu-Alpha.1.2.3.4.5 | silu+0.44·silu | 5 | silu+0.25·silu-Alpha.1.2.3.4 | 9 | 5.3954 | - | 5.4546 | - | 2,047 | 9 |
| 34 | silu+0.21·silu-Alpha.1.2.3.4 | silu+0.21·silu | 4 | silu+0.21·silu-Alpha.1.2.3 | 9 | 5.4650 | - | 5.5616 | - | 2,014 | 9 |
| 35 | silu+0.25·silu-Alpha.1.2.3.4 | silu+0.25·silu | 4 | silu+0.21·silu-Alpha.1.2.3 | 15 | 5.5197 | - | 5.6056 | 0.0702 | 2,032 | 15 |
| 36 | silu-Alpha.1.2.3 | silu | 3 | silu-Alpha.1.2 | 16 | 5.6025 | - | 5.7014 | 0.0661 | 3,228 | 16 |
| 37 | silu+0.21·silu-Alpha.1.2.3 | silu+0.21·silu | 3 | silu-Alpha.1.2 | 7 | 5.8448 | - | 5.8811 | 0.0646 | 2,020 | 7 |
| 38 | silu-Alpha.1.2 | silu | 2 | silu-Alpha.1 | 23 | 5.9652 | - | 6.0690 | 0.0607 | 3,220 | 23 |
| 39 | relu-Beta.1.2 | relu | 2 | relu-Beta.1 | 25 | 6.0095 | - | 6.1676 | 0.0611 | 7,610 | 25 |
| 40 | silu-Alpha.1.2 | silu | 2 | silu-Alpha.1 | 25 | 6.1463 | - | 6.2318 | 0.0586 | 3,221 | 25 |
| 41 | silu-Alpha.1 | silu | 1 | silu-Alpha | 24 | 6.3683 | - | 6.4897 | 0.0540 | 3,267 | 24 |
| 42 | relu-Beta.1 | relu | 1 | relu-Beta | 25 | 6.5045 | - | 6.6237 | 0.0531 | 7,748 | 25 |
| 43 | silu-Alpha.1 | silu | 1 | silu-Alpha | 25 | 6.5756 | - | 6.6738 | 0.0511 | 3,282 | 25 |
| 44 | relu-Eta | relu | 0 | - | 25 | 6.7878 | - | 6.8739 | 0.0470 | 7,786 | 25 |
| 45 | silu-Zeta | silu | 0 | - | 25 | 6.8813 | - | 6.9640 | 0.0461 | 3,341 | 25 |
| 46 | sq-Epsilon | sq | 0 | - | 25 | 7.0034 | - | 7.0480 | 0.0444 | 7,515 | 25 |
| 47 | gelu-Gamma | gelu | 0 | - | 25 | 7.2233 | - | 7.2897 | 0.0410 | 8,091 | 25 |
| 48 | relu-Beta | relu | 0 | - | 25 | 7.3496 | - | 7.3842 | 0.0395 | 8,164 | 25 |
| 49 | silu-Alpha | silu | 0 | - | 25 | 7.4358 | - | 7.5496 | 0.0375 | 3,337 | 16 |
Activation Distribution
| Activation | Steps | Share |
|---|---|---|
| relu | 174 | 17% |
| silu | 163 | 16% |
| relu×relu | 100 | 10% |
| relu×silu+0.13·gelu | 72 | 7% |
| silu+0.44·gelu+0.06·id | 68 | 6% |
| gelu | 50 | 5% |
| silu+0.44·silu | 50 | 5% |
| relu×relu+0.13·gelu | 50 | 5% |
| (id+0.25·silu)×gelu | 46 | 4% |
| (id+0.39·silu)×gelu | 44 | 4% |
| silu+0.44·silu+0.06·id | 43 | 4% |
| id | 25 | 2% |
| sq | 25 | 2% |
| 0.63·relu×relu+0.37·id | 25 | 2% |
| (relu×silu+0.13·gelu)×relu | 25 | 2% |
| id+0.25·silu | 23 | 2% |
| (id+0.58·silu)×gelu | 22 | 2% |
| silu+0.21·silu | 16 | 2% |
| silu+0.25·silu | 15 | 1% |
| (relu×silu+0.13·gelu)×silu | 13 | 1% |
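The percentages in the distribution can be reproduced from the raw counts. The sketch below assumes each count is the number of training steps spent on that activation and that shares are rounded to whole percent; the variable names are illustrative.

```python
# Counts copied from the Activation Distribution panel.
steps_by_activation = {
    "relu": 174, "silu": 163, "relu×relu": 100,
    "relu×silu+0.13·gelu": 72, "silu+0.44·gelu+0.06·id": 68,
    "gelu": 50, "silu+0.44·silu": 50, "relu×relu+0.13·gelu": 50,
    "(id+0.25·silu)×gelu": 46, "(id+0.39·silu)×gelu": 44,
    "silu+0.44·silu+0.06·id": 43, "id": 25, "sq": 25,
    "0.63·relu×relu+0.37·id": 25, "(relu×silu+0.13·gelu)×relu": 25,
    "id+0.25·silu": 23, "(id+0.58·silu)×gelu": 22,
    "silu+0.21·silu": 16, "silu+0.25·silu": 15,
    "(relu×silu+0.13·gelu)×silu": 13,
}

total_steps = sum(steps_by_activation.values())
share_pct = {name: round(100 * n / total_steps)
             for name, n in steps_by_activation.items()}

print(total_steps)
for name, pct in share_pct.items():
    print(f"{name}: {pct}%")
```

The counts sum to 1,049, the same figure shown as Total Steps in the Evolutionary Search panel.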
Oscillation & Heat Capacity
Activation Evolution Radial
Symbio Config
{
  "cusumSensitivity": 4,
  "cusumBaselineWindow": 5,
  "metricsInterval": 10,
  "trackWeightEntropy": true,
  "trackEffectiveRank": true,
  "trackFreeEnergy": true,
  "trackMIProfiles": false,
  "trackPopulationMetrics": true,
  "freeEnergyBeta": 0.01,
  "miNumBins": 30,
  "adaptiveBatch": false,
  "batchMin": 8,
  "batchMax": 64,
  "batchStep": 4,
  "calmStepsBeforeRestore": 200,
  "populationAdaptation": true,
  "populationScaleMin": 0.5,
  "populationScaleMax": 2,
  "populationScaleStep": 0.125,
  "populationAdaptationCooldown": 10,
  "mutationRateMin": 0.2,
  "mutationRateMax": 0.95,
  "fitnessAlpha": 1,
  "complexityMode": "entropy",
  "diversityBonus": 0.1,
  "diversityDecay": "cosine",
  "searchMode": "composed-activation-search",
  "activationPool": ["gelu", "relu", "silu", "swiglu", "universal", "kan_spline"],
  "searchStrategy": "evolutionary",
  "populationSize": 8,
  "generations": 250,
  "selectionStrategy": "topk",
  "tournamentK": 3,
  "mutationRate": 0.7,
  "stepsPerCandidate": 25,
  "rankBy": "valLoss",
  "perfWeight": 0,
  "stabilityWeight": 0,
  "preserveWeightsAcrossCandidates": true,
  "carryOptimizerStateAcrossCandidates": true,
  "constantFfnDimAcrossCandidates": true,
  "fuseWeightsEachStep": true,
  "fusionShadowEma": 0.02,
  "fusionBaseStrength": 0.0015,
  "fusionMaxStrength": 0.02,
  "kuramotoCoupling": 0.7,
  "kuramotoDt": 0.1,
  "kuramotoDamping": 0.05,
  "writeReport": true,
  "writeCandidates": true,
  "writeSummary": true,
  "basisPool": ["silu", "relu", "gelu", "identity", "square"],
  "maxGraphDepth": 4,
  "maxGraphNodes": 10
}
Checkpoints (0)
No checkpoints saved
Sample Generations (3)
Sample 1 · Checkpoint: - · Generated 4h ago
Prompt: The
Output:
The : --sa ging in the kon, note of wM. The ingyaged a a s koniin the the ps esein the aing istittpCinee , ing Fd
Sample 2 · Checkpoint: - · Generated 4h ago
Prompt: Once upon a time
Output:
Once upon a timee, iton ameting inis ed s e of ed ised et, the inre the onitdig the e de the ed Mof one es e suseds a sM sse
Sample 3 · Checkpoint: - · Generated 4h ago
Prompt: He walked into
Output:
He walked intoan and a m, ianing itiimed Rereingkmaation--elat s a s
mtren a s one C the etpany the vingimred s ing it
{
  "vocabSize": 2000,
  "blockSize": 256,
  "nLayer": 6,
  "nEmbd": 288,
  "nHead": 6,
  "dropout": 0,
  "ffnActivation": "swiglu",
  "ffnDim": 768
}
{
  "iters": 50000,
  "batchSize": 20,
  "lr": 0.0003,
  "lrMin": 0,
  "warmupIters": 500,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 1e-8,
  "weightDecay": 0.1,
  "gradClip": 5,
  "evalInterval": 100,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 100,
  "spikeThreshold": 10,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": true,
  "symbioConfig": {
    "cusumSensitivity": 4,
    "cusumBaselineWindow": 5,
    "metricsInterval": 10,
    "trackWeightEntropy": true,
    "trackEffectiveRank": true,
    "trackFreeEnergy": true,
    "trackMIProfiles": false,
    "trackPopulationMetrics": true,
    "freeEnergyBeta": 0.01,
    "miNumBins": 30,
    "adaptiveBatch": false,
    "batchMin": 8,
    "batchMax": 64,
    "batchStep": 4,
    "calmStepsBeforeRestore": 200,
    "populationAdaptation": true,
    "populationScaleMin": 0.5,
    "populationScaleMax": 2,
    "populationScaleStep": 0.125,
    "populationAdaptationCooldown": 10,
    "mutationRateMin": 0.2,
    "mutationRateMax": 0.95,
    "fitnessAlpha": 1,
    "complexityMode": "entropy",
    "diversityBonus": 0.1,
    "diversityDecay": "cosine",
    "searchMode": "composed-activation-search",
    "activationPool": ["gelu", "relu", "silu", "swiglu", "universal", "kan_spline"],
    "searchStrategy": "evolutionary",
    "populationSize": 8,
    "generations": 250,
    "selectionStrategy": "topk",
    "tournamentK": 3,
    "mutationRate": 0.7,
    "stepsPerCandidate": 25,
    "rankBy": "valLoss",
    "perfWeight": 0,
    "stabilityWeight": 0,
    "preserveWeightsAcrossCandidates": true,
    "carryOptimizerStateAcrossCandidates": true,
    "constantFfnDimAcrossCandidates": true,
    "fuseWeightsEachStep": true,
    "fusionShadowEma": 0.02,
    "fusionBaseStrength": 0.0015,
    "fusionMaxStrength": 0.02,
    "kuramotoCoupling": 0.7,
    "kuramotoDt": 0.1,
    "kuramotoDamping": 0.05,
    "writeReport": true,
    "writeCandidates": true,
    "writeSummary": true,
    "basisPool": ["silu", "relu", "gelu", "identity", "square"],
    "maxGraphDepth": 4,
    "maxGraphNodes": 10
  }
}
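The config gives `lr`, `lrMin`, `warmupIters`, and `iters` but not the shape of the schedule. As a sketch only, the code below assumes linear warmup followed by cosine decay, a common pairing for these four parameters; `lr_at` is a hypothetical helper, not part of the trainer.

```python
import math


def lr_at(step, max_lr=3e-4, min_lr=0.0, warmup=500, total=50_000):
    """Learning rate at `step` under an assumed warmup + cosine schedule."""
    if step < warmup:
        return max_lr * step / warmup                      # linear ramp from 0
    progress = (step - warmup) / max(1, total - warmup)    # 0 at end of warmup, 1 at the end
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for step in (0, 250, 500, 1249, 25_000, 50_000):
    print(f"step {step:>6}: lr = {lr_at(step):.2e}")
```

Under this assumption the rate at step 1,249 is still within rounding of the 3.00e-4 shown on the Learning Rate card, so the card is consistent with the sketch but does not confirm it.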