Alpha
historic_20260225213805_ov4v · active · chat · 17.44M params · 5m 48s elapsed · ~26h 47m remaining
8L / 384D / 8H · helios · bpe-4k · adamw · Created Feb 25, 2026 9:39 PM
Step 120 / 50,000 · 0.2%
Loss: 8.0279
Best Loss: 8.0266 (-4.1% from start)
Val Loss: -
Learning Rate: 1.04e-5
Throughput: 5,402 tok/s (avg)
Speed: 1,934 ms/iter (avg)
Grad Norm: 0.662 (avg: 0.758)
Tokens processed: 1.22M
Forward: 398 ms (21% of step)
Backward: 1457 ms (75% of step)
GPU Sync: 18 ms (1% of step)
GPU Ops per step: 807
MFU (model FLOPS util): 1.8%
Bwd/Fwd ratio: 3.7x
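The cards above can be cross-checked against each other. The following is a minimal sketch, assuming one step consumes batch size × context tokens (20 × 512, taken from the configuration listed further down this page) and that ms/iter is the average wall-clock time per step. The variable and function names are illustrative, not part of the dashboard.

```python
# Cross-check of the summary cards; every input is copied from this page.
BATCH_SIZE = 20          # "Batch size" in Training Config
CONTEXT = 512            # "Context" in Architecture
STEP = 120
TOTAL_STEPS = 50_000
MS_PER_ITER = 1934.0     # "Speed", ms/iter (avg)
FWD_MS, BWD_MS, SYNC_MS = 398.0, 1457.0, 18.0

# Assumption: tokens per step = batch size * context length.
tokens_per_step = BATCH_SIZE * CONTEXT
tokens_processed = tokens_per_step * STEP


def share(part_ms: float) -> float:
    """Percentage of the average step spent in one phase."""
    return 100.0 * part_ms / MS_PER_ITER


# Remaining time if every remaining step costs the current average.
remaining_s = (TOTAL_STEPS - STEP) * MS_PER_ITER / 1000.0
hours, rem = divmod(int(remaining_s), 3600)
minutes = rem // 60

bwd_fwd_ratio = BWD_MS / FWD_MS
tok_per_s_from_avg_step = tokens_per_step / (MS_PER_ITER / 1000.0)

print(f"tokens processed: {tokens_processed:,}")
print(f"forward {share(FWD_MS):.0f}%, backward {share(BWD_MS):.0f}%, "
      f"sync {share(SYNC_MS):.0f}%")
print(f"bwd/fwd ratio: {bwd_fwd_ratio:.1f}x")
print(f"remaining: ~{hours}h {minutes}m")
print(f"tok/s from average step time: {tok_per_s_from_avg_step:,.0f}")
```

Under these assumptions the token count, the phase shares, the Bwd/Fwd ratio, and the remaining-time estimate reproduce the displayed values. The throughput derived from the average step time comes out about 2% below the displayed 5,402 tok/s, which would be expected if the dashboard averages per-step throughput rather than dividing by the average step time.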
Loss Curve
Symbio semantics: this chart stitches many fresh candidate evaluations onto one global step axis. Loss resets near switches are expected because candidates are re-initialized. Compare local candidate shapes and the global frontier, not a single continuous model trajectory.
Search semantics
Validation / selection
Run diagnostics
Search Trajectory + Frontier
Candidate-local train/val loss segments on a shared step axis, with switch events and global frontier overlays.
Search-aware view
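The stitching described above can be sketched as follows: candidate-local loss segments are concatenated onto one global step axis, and the global frontier is the running minimum over that stitched series. This is a minimal sketch; the function name, the data layout, and the loss values in the usage lines are made up for illustration and are not taken from the run.

```python
from itertools import accumulate


def stitch_and_frontier(segments):
    """Concatenate candidate-local loss segments and compute the frontier.

    segments: list of (candidate_name, [loss, ...]) in the order they ran.
    Returns (global_steps, stitched_losses, frontier), where frontier[i]
    is the best (lowest) loss seen at or before global step i + 1.
    """
    stitched = [loss for _, losses in segments for loss in losses]
    frontier = list(accumulate(stitched, min))
    global_steps = list(range(1, len(stitched) + 1))
    return global_steps, stitched, frontier


# Illustrative values only: the second candidate starts higher than the
# first one ended, which is the kind of jump the chart note warns about.
demo = [("cand-a", [8.4, 8.3]), ("cand-b", [8.5, 8.2]), ("cand-c", [8.25, 8.1])]
steps, stitched, frontier = stitch_and_frontier(demo)
print(list(zip(steps, stitched, frontier)))
```

The stitched series jumps at each switch while the frontier never increases, which is why the frontier is the curve to read for overall search progress.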
Architecture
Layers: 8
Embedding: 384
Heads: 8
Vocab: 4,000
Context: 512
Dropout: 0.1
Parameters: 17.44M
Training Config
Total iters: 50,000
Batch size: 20
Max LR: 0.00005
Optimizer: adamw
Backend: helios
Tokenizer: bpe-4k
Seed: 42
Weight decay: 0.1
Grad clip: 1
Eval interval: 500
GPU & VRAM
Learning Rate
Grad Norm
Step Time Breakdown
Clip Telemetry
Symbiogenesis · SWIGLU
Wt Entropy: 1.80 bits
Eff. Rank: 20.0
Free Energy: 8.0459
Pop Entropy: 3.911 nats
Complexity: 0.0756
Fitness: 0.0352
CUSUM Alerts: 104 of 119 steps
Batch Size: - (adaptive)
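The Free Energy card is numerically consistent with the current loss plus `freeEnergyBeta` times the weight entropy, using values shown on this page (8.0279 + 0.01 × 1.80). This is an inference from the displayed numbers, not a confirmed definition, and the function below is a hypothetical sketch of that assumed form.

```python
def free_energy(loss: float, weight_entropy_bits: float, beta: float) -> float:
    """Assumed form: F = loss + beta * H(weights).

    The sign and the choice of entropy term are guesses that happen to
    match the dashboard values; the tool's real formula may differ.
    """
    return loss + beta * weight_entropy_bits


# Inputs copied from this page: Loss, Wt Entropy, freeEnergyBeta.
f = free_energy(loss=8.0279, weight_entropy_bits=1.80, beta=0.01)
print(f"{f:.4f}")
```

If the metric were computed at a different step than the displayed loss (the config sets `metricsInterval` to 10), the match could be coincidental.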
CUSUM Change-Point Monitor
Weight Entropy
Effective Rank
Free Energy
Fitness Score
Population Entropy
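For readers unfamiliar with CUSUM, the following is a minimal sketch of a standard two-sided CUSUM detector. It assumes that `cusumBaselineWindow` is the number of initial samples used to estimate the baseline and that `cusumSensitivity` acts as the alarm threshold in baseline standard deviations. Both mappings, the `drift` parameter, and the function name are assumptions; the dashboard's actual monitor may work differently.

```python
from statistics import mean, pstdev


def cusum_alerts(values, baseline_window=5, sensitivity=4.0, drift=0.5):
    """Return indices where a two-sided CUSUM statistic crosses the threshold.

    The first `baseline_window` samples define the reference mean and
    standard deviation. Each later sample is standardized and accumulated
    into an upper and a lower sum; either sum exceeding `sensitivity`
    raises an alert and resets both sums.
    """
    base = values[:baseline_window]
    mu = mean(base)
    sigma = pstdev(base) or 1e-8  # guard against a perfectly flat baseline
    upper = lower = 0.0
    alerts = []
    for i, x in enumerate(values[baseline_window:], start=baseline_window):
        z = (x - mu) / sigma
        upper = max(0.0, upper + z - drift)
        lower = max(0.0, lower - z - drift)
        if upper > sensitivity or lower > sensitivity:
            alerts.append(i)
            upper = lower = 0.0
    return alerts


# Illustrative series: a stable stretch followed by a level shift.
series = [1.0, 1.1, 0.9, 1.0, 1.0] + [1.0] * 5 + [3.0] * 3
print(cusum_alerts(series))
```

With a five-sample baseline, early noise sets a narrow reference band, so a detector of this kind fires readily once the metric moves away from its initial level.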
Phase Change / Gelation
Current: Transitioning
Stability: 0%
Phase Changes: 3
Regime Shifts: 0
Training dynamics are shifting. The model may be entering a new loss basin or the learning rate is hitting a critical threshold. This often happens before a breakthrough or a plateau.
Phase Timeline
Step 1 to Step 102
Loss Oscillation (Harmonic Analysis)
Evolutionary Search
Generations: 1
Candidates: 5
Activations: 5
Best Loss: 8.0266
Total Steps: 119
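# | Candidate  | Activation | Gen | Loss   | Fitness | Steps | Mutation
1 | sq-Epsilon | sq         | 0   | 8.0266 | 0.0352  | 20    | origin
2 | id-Delta   | id         | 0   | 8.0457 | 0.0349  | 25    | origin
3 | gelu-Gamma | gelu       | 0   | 8.0925 | 0.0342  | 25    | origin
4 | relu-Beta  | relu       | 0   | 8.1467 | 0.0337  | 25    | origin
5 | silu-Alpha | silu       | 0   | 8.2915 | 0.0317  | 24    | origin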
Generation Summary
G0 · 5 candidates · best loss 8.0266
Fitness Progression
Architecture Diversity
Convergence vs Diversity (Tug-of-War)
Current Mode: Converging
Diversity Pressure: 56%
Convergence Momentum: 100%
Convergence Progress: 100%
Phase Portrait: Diversity Pressure vs Convergence Momentum
Low diversity / high momentum = lock-in convergence
High diversity / high momentum = productive exploration
Low diversity / low momentum = stalled collapse
High diversity / low momentum = diversity stalling convergence
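The four quadrants listed above can be sketched as a small classifier. The 0.5 cut-off between "low" and "high" is an assumption (the page does not state where the dashboard draws the line), and the function name is hypothetical.

```python
def portrait_regime(diversity: float, momentum: float, threshold: float = 0.5) -> str:
    """Map (diversity pressure, convergence momentum) in [0, 1] to a quadrant label.

    `threshold` is an assumed boundary between "low" and "high".
    """
    high_diversity = diversity >= threshold
    high_momentum = momentum >= threshold
    if high_momentum:
        return "productive exploration" if high_diversity else "lock-in convergence"
    return "diversity stalling convergence" if high_diversity else "stalled collapse"


for d, m in [(0.2, 0.9), (0.8, 0.9), (0.2, 0.1), (0.8, 0.1)]:
    print(f"diversity={d:.1f} momentum={m:.1f} -> {portrait_regime(d, m)}")
```

Readings near the assumed boundary are sensitive to the chosen threshold, so the labels are best treated as a reading aid rather than a hard classification.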
Tug-of-War Trace (Time Domain)
Positive tension means recent frontier improvement is outpacing diversity pressure (search is converging). Negative tension means exploration pressure is dominating recent convergence momentum (search is broadening or getting “stumped”).
Strongest Convergence: step 16, tension 0.500
Strongest Diversity Push: step 1, tension -0.500
Best Frontier: 8.0266, progress 100%
Evolutionary Lineage Tree
Activation Flow (Sankey)
Activation Switch Log
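Step | From | To   | Gen | Prev Steps | Best Loss | Final Loss | Fitness
1    | -    | silu | 0   | -          | -         | -          | -
26   | silu | relu | 0   | 24         | 8.2915    | 8.2915     | 0.0317
51   | relu | gelu | 0   | 25         | 8.1467    | 8.1467     | 0.0337
76   | gelu | id   | 0   | 25         | 8.0925    | 8.0925     | 0.0342
101  | id   | sq   | 0   | 25         | 8.0457    | 8.0491     | 0.0349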
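The switch steps in this log follow a fixed cadence that matches `stepsPerCandidate: 25` in the Symbio config. A minimal sketch of that cadence, assuming the first candidate starts at step 1 and each switch happens exactly `steps_per_candidate` steps after the previous one (function names are illustrative):

```python
def switch_steps(n_candidates: int, steps_per_candidate: int = 25, first_step: int = 1):
    """Global steps at which each candidate becomes active."""
    return [first_step + k * steps_per_candidate for k in range(n_candidates)]


def active_candidate(step: int, order, steps_per_candidate: int = 25, first_step: int = 1):
    """Name of the candidate active at a given global step."""
    index = (step - first_step) // steps_per_candidate
    return order[min(index, len(order) - 1)]


# Activation order as it appears in the "To" column of the log.
order = ["silu", "relu", "gelu", "id", "sq"]
print(switch_steps(len(order)))
print(active_candidate(119, order))
```

The log reports 24 previous steps for the first candidate rather than 25, so the exact step accounting at the start of the run may differ from this simple model by one step.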
Search Candidates
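# | Name       | Activation | Gen | Parent | Steps | Best Loss | Best Val | Avg Loss | Fitness | Avg tok/s | Alerts
1 | sq-Epsilon | sq         | 0   | -      | 20    | 8.0266    | -        | 8.0736   | 0.0352  | 5,402     | 20
2 | id-Delta   | id         | 0   | -      | 25    | 8.0457    | -        | 8.0908   | 0.0349  | 5,201     | 25
3 | gelu-Gamma | gelu       | 0   | -      | 25    | 8.0925    | -        | 8.1371   | 0.0342  | 5,300     | 25
4 | relu-Beta  | relu       | 0   | -      | 25    | 8.1467    | -        | 8.1947   | 0.0337  | 5,788     | 25
5 | silu-Alpha | silu       | 0   | -      | 24    | 8.2915    | -        | 8.3334   | 0.0317  | 1,499     | 9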
Activation Distribution
relu: 25 (21%)
gelu: 25 (21%)
id: 25 (21%)
silu: 24 (20%)
sq: 20 (17%)
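The percentages above are each activation's share of the 119 total steps. A short sketch that reproduces them from the step counts, assuming the dashboard rounds to the nearest whole percent:

```python
# Steps per activation, copied from the Activation Distribution list.
steps_by_activation = {"relu": 25, "gelu": 25, "id": 25, "silu": 24, "sq": 20}

total_steps = sum(steps_by_activation.values())
shares = {
    name: round(100 * steps / total_steps)
    for name, steps in steps_by_activation.items()
}

print(total_steps)
for name, pct in shares.items():
    print(f"{name}: {steps_by_activation[name]} ({pct}%)")
```

The total matches the "Total Steps" card (119), and the rounded shares match the displayed percentages.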
Oscillation & Heat Capacity
Activation Evolution Radial
Symbio Config
{
  "cusumSensitivity": 4,
  "cusumBaselineWindow": 5,
  "metricsInterval": 10,
  "trackWeightEntropy": true,
  "trackEffectiveRank": true,
  "trackFreeEnergy": true,
  "trackMIProfiles": false,
  "trackPopulationMetrics": true,
  "freeEnergyBeta": 0.01,
  "miNumBins": 30,
  "adaptiveBatch": false,
  "batchMin": 8,
  "batchMax": 64,
  "batchStep": 4,
  "calmStepsBeforeRestore": 200,
  "populationAdaptation": true,
  "populationScaleMin": 0.5,
  "populationScaleMax": 2,
  "populationScaleStep": 0.125,
  "populationAdaptationCooldown": 10,
  "mutationRateMin": 0.2,
  "mutationRateMax": 0.95,
  "fitnessAlpha": 1,
  "complexityMode": "entropy",
  "diversityBonus": 0.1,
  "diversityDecay": "cosine",
  "searchMode": "composed-activation-search",
  "activationPool": [
    "gelu",
    "relu",
    "silu",
    "swiglu",
    "universal",
    "kan_spline"
  ],
  "searchStrategy": "evolutionary",
  "populationSize": 8,
  "generations": 250,
  "selectionStrategy": "topk",
  "tournamentK": 3,
  "mutationRate": 0.7,
  "stepsPerCandidate": 25,
  "rankBy": "valLoss",
  "perfWeight": 0,
  "stabilityWeight": 0,
  "preserveWeightsAcrossCandidates": true,
  "carryOptimizerStateAcrossCandidates": true,
  "constantFfnDimAcrossCandidates": true,
  "fuseWeightsEachStep": true,
  "fusionShadowEma": 0.02,
  "fusionBaseStrength": 0.0015,
  "fusionMaxStrength": 0.02,
  "kuramotoCoupling": 0.7,
  "kuramotoDt": 0.1,
  "kuramotoDamping": 0.05,
  "writeReport": true,
  "writeCandidates": true,
  "writeSummary": true,
  "basisPool": [
    "silu",
    "relu",
    "gelu",
    "identity",
    "square"
  ],
  "maxGraphDepth": 4,
  "maxGraphNodes": 10
}
Checkpoints (0)
No checkpoints saved
{
  "vocabSize": 4000,
  "blockSize": 512,
  "nLayer": 8,
  "nEmbd": 384,
  "nHead": 8,
  "dropout": 0.1,
  "ffnActivation": "swiglu",
  "ffnDim": 1024
}
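The 17.44M parameter figure can be reproduced from the model config above under a few assumptions that this page does not confirm: learned positional embeddings, an output head that is not tied to the token embedding, bias-free attention and FFN projections, a three-matrix SwiGLU FFN, and LayerNorm with weight and bias. The function below is one such accounting, not the tool's own counter.

```python
def count_params(vocab_size, block_size, n_layer, n_embd, ffn_dim):
    """Parameter count under the assumptions stated in the lead-in."""
    tok_emb = vocab_size * n_embd        # token embedding
    pos_emb = block_size * n_embd        # assumed learned positions
    attn = 4 * n_embd * n_embd           # Q, K, V, output projections, no biases
    ffn = 3 * n_embd * ffn_dim           # SwiGLU: gate, up, down projections
    norms = 2 * (2 * n_embd)             # two LayerNorms per block, weight + bias
    final_norm = 2 * n_embd              # final LayerNorm
    lm_head = vocab_size * n_embd        # assumed untied output head
    per_layer = attn + ffn + norms
    return tok_emb + pos_emb + n_layer * per_layer + final_norm + lm_head


total = count_params(vocab_size=4000, block_size=512, n_layer=8, n_embd=384, ffn_dim=1024)
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")
```

Other combinations of assumptions (for example, tied embeddings with added biases) would give slightly different totals, so the agreement is suggestive rather than conclusive.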
{
  "iters": 50000,
  "batchSize": 20,
  "lr": 0.00005,
  "lrMin": 0.000005,
  "warmupIters": 1000,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 0.000001,
  "weightDecay": 0.1,
  "gradClip": 1,
  "evalInterval": 500,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe-4k",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 500,
  "spikeThreshold": 10,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": true,
  "symbioConfig": {
    "cusumSensitivity": 4,
    "cusumBaselineWindow": 5,
    "metricsInterval": 10,
    "trackWeightEntropy": true,
    "trackEffectiveRank": true,
    "trackFreeEnergy": true,
    "trackMIProfiles": false,
    "trackPopulationMetrics": true,
    "freeEnergyBeta": 0.01,
    "miNumBins": 30,
    "adaptiveBatch": false,
    "batchMin": 8,
    "batchMax": 64,
    "batchStep": 4,
    "calmStepsBeforeRestore": 200,
    "populationAdaptation": true,
    "populationScaleMin": 0.5,
    "populationScaleMax": 2,
    "populationScaleStep": 0.125,
    "populationAdaptationCooldown": 10,
    "mutationRateMin": 0.2,
    "mutationRateMax": 0.95,
    "fitnessAlpha": 1,
    "complexityMode": "entropy",
    "diversityBonus": 0.1,
    "diversityDecay": "cosine",
    "searchMode": "composed-activation-search",
    "activationPool": [
      "gelu",
      "relu",
      "silu",
      "swiglu",
      "universal",
      "kan_spline"
    ],
    "searchStrategy": "evolutionary",
    "populationSize": 8,
    "generations": 250,
    "selectionStrategy": "topk",
    "tournamentK": 3,
    "mutationRate": 0.7,
    "stepsPerCandidate": 25,
    "rankBy": "valLoss",
    "perfWeight": 0,
    "stabilityWeight": 0,
    "preserveWeightsAcrossCandidates": true,
    "carryOptimizerStateAcrossCandidates": true,
    "constantFfnDimAcrossCandidates": true,
    "fuseWeightsEachStep": true,
    "fusionShadowEma": 0.02,
    "fusionBaseStrength": 0.0015,
    "fusionMaxStrength": 0.02,
    "kuramotoCoupling": 0.7,
    "kuramotoDt": 0.1,
    "kuramotoDamping": 0.05,
    "writeReport": true,
    "writeCandidates": true,
    "writeSummary": true,
    "basisPool": [
      "silu",
      "relu",
      "gelu",
      "identity",
      "square"
    ],
    "maxGraphDepth": 4,
    "maxGraphNodes": 10
  }
}
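The displayed learning rate of 1.04e-5 at step 120 is consistent with a linear warmup from `lrMin` to `lr` over `warmupIters` steps: 5e-6 + (5e-5 - 5e-6) × 120 / 1000. The sketch below implements that warmup; the cosine decay after warmup is an assumption added for completeness, since nothing on this page shows the post-warmup shape.

```python
import math


def lr_at(step, lr=5e-5, lr_min=5e-6, warmup_iters=1000, total_iters=50_000):
    """Learning rate at a given step.

    Warmup: linear from lr_min to lr (matches the value shown at step 120).
    Decay: cosine from lr back down to lr_min (assumed, not confirmed).
    """
    if step < warmup_iters:
        return lr_min + (lr - lr_min) * step / warmup_iters
    progress = (step - warmup_iters) / (total_iters - warmup_iters)
    return lr_min + 0.5 * (lr - lr_min) * (1.0 + math.cos(math.pi * progress))


print(f"{lr_at(120):.3g}")
```

A warmup that started from zero instead of `lrMin` would give 6e-6 at step 120, which does not match the displayed value.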