concordance_v2_20260227050710_iy87 (completed)
concordance · 306.73M params · 10s elapsed · Updated 45d ago
21L / 1024D / 16H · helios · bpe-64k · adamw · Created Feb 27, 2026 5:07 AM
Step 2 / 2 (100.0%)
Loss: 10.1135
Best Loss: 10.1135 (-0.1% from start)
Val Loss: -
Learning Rate: 1.65e-4
Throughput: 104 tok/s (avg)
Speed: 5,311 ms/iter (avg)
Grad Norm: 14.871 (avg: 16.194)
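The Loss tile can be cross-checked against the vocabulary size: perplexity is exp(loss), and a model that predicted a uniform distribution over the 19,777-token vocabulary would sit at a loss of ln(19,777) ≈ 9.89. A minimal sketch using only the dashboard's numbers:

```python
import math

loss = 10.1135       # Loss tile
vocab_size = 19_777  # Vocab from the Architecture panel

perplexity = math.exp(loss)              # ~24.7k
uniform_baseline = math.log(vocab_size)  # loss of a uniform predictor, ~9.89

print(f"perplexity: {perplexity:,.0f}")
print(f"uniform-baseline loss: {uniform_baseline:.3f}")
```

That the loss still sits near (slightly above) the uniform baseline is unsurprising: the run is only 2 optimizer steps long, consistent with the "-0.1% from start" figure.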
Tokens processed: 1.0K
Forward: 654 ms (12% of step)
Backward: 3,193 ms (60% of step)
GPU Sync: 73 ms (1% of step)
GPU Ops: 2,748 per step
MFU: 0.6% (model FLOPS utilization)
Bwd/Fwd ratio: 4.9x
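The MFU and Bwd/Fwd tiles can be roughly reproduced from the other numbers on the panel. This sketch assumes the common 6 · N · tokens/sec estimate of achieved FLOP/s; the exact formula and the hardware peak this dashboard divides by are assumptions, not something it reports:

```python
# Sanity-check the MFU and Bwd/Fwd tiles from the other dashboard values.
n_params = 306.73e6   # Parameters tile
tok_per_s = 104       # Throughput tile (avg)

# Assumed: standard 6*N*tokens/s estimate (attention FLOPs ignored).
achieved_flops = 6 * n_params * tok_per_s   # ~1.9e11 FLOP/s

mfu = 0.006                                 # MFU tile (0.6%)
implied_peak = achieved_flops / mfu         # peak the dashboard implicitly assumes

ratio = 3193 / 654                          # Backward ms / Forward ms -> ~4.9x

print(f"achieved: {achieved_flops:.2e} FLOP/s")
print(f"implied peak: {implied_peak / 1e12:.1f} TFLOP/s")
print(f"bwd/fwd: {ratio:.1f}x")
```

The implied peak of roughly 32 TFLOP/s is plausible for a single consumer GPU at this precision, but it is an inference from the tiles, not a reported value.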
Loss Curve (chart)
Architecture
Layers: 21
Embedding: 1024
Heads: 16
Vocab: 19,777
Context: 512
Dropout: 0
Parameters: 306.73M
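The 306.73M total is consistent with a decoder-only transformer using SwiGLU feed-forward blocks (per `ffnActivation` in the config JSON) and separate input and output embeddings. The FFN hidden width and the untied embeddings are assumptions, since the dashboard only reports the total:

```python
# Rough parameter count from the Architecture panel.
vocab, d, n_layer = 19_777, 1024, 21

ffn_hidden = 2752            # assumed: ~(8/3)*d rounded up to a multiple of 64
attn = 4 * d * d             # Q, K, V, and output projections
ffn = 3 * d * ffn_hidden     # gate, up, and down projections (SwiGLU)
norms = 2 * d                # two norm scales per block (assumed)
block = attn + ffn + norms

# Assumed untied: separate token-embedding and LM-head matrices.
total = n_layer * block + 2 * vocab * d
print(f"{total / 1e6:.2f}M")  # lands near the reported 306.73M
```

The small remainder (≈0.5M) would be covered by biases, a final norm, and positional parameters, depending on details the panel does not show.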
Training Config
Total iters: 2
Batch size: 1
Max LR: 0.0003
Optimizer: adamw
Backend: helios
Tokenizer: bpe-64k
Seed: 42
Weight decay: 0.1
Grad clip: 1
Eval interval: 100
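With `warmupIters` 0, `lrMin` 0, and only 2 total iters, a standard warmup-then-cosine schedule would put the learning rate at 1.5e-4 on the second step, close to the 1.65e-4 tile. The trainer's exact step indexing is an assumption here, so a small mismatch is expected:

```python
import math

def cosine_lr(step, max_lr=3e-4, min_lr=0.0, warmup=0, total=2):
    """Common warmup-then-cosine schedule (assumed; the trainer's exact
    indexing is not shown in the dashboard)."""
    if warmup and step < warmup:
        return max_lr * (step + 1) / warmup          # linear warmup
    t = (step - warmup) / max(1, total - warmup)     # progress in [0, 1]
    return min_lr + 0.5 * (max_lr - min_lr) * (1 + math.cos(math.pi * t))

print([f"{cosine_lr(s):.2e}" for s in range(2)])     # LR at steps 0 and 1
```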
Charts
Throughput (tok/s)
Step Time (ms/iter)
GPU & VRAM
Perplexity
Train/Val Gap (no validation data)
Learning Rate
Grad Norm
Smoothed Loss (EMA) (no telemetry)
Loss Velocity (insufficient data)
Gradient Clipping
GPU Operations
Step Time Breakdown (Forward / Backward / Grad Norm / Optimizer / GPU Sync / Data)
Timing Phase Lines
Backward / Forward Ratio
Transformer Layer Analysis
Gradient Norm Heatmap
Per-Layer Gradient Evolution
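The Grad Norm tile (14.871, average 16.194) sits well above the configured clip of 1, so global-norm clipping rescales essentially every update, which is what the Gradient Clipping chart would show. A minimal sketch of norm-based clipping over nested gradient lists (plain Python, not this trainer's internals):

```python
# Global-norm gradient clipping, as configured (gradClip = 1). With the
# dashboard's average grad norm near 16, nearly every step is rescaled
# by roughly 1/16 before the optimizer update.
def clip_by_global_norm(grads, max_norm=1.0):
    total = sum(g ** 2 for grad in grads for g in grad) ** 0.5
    if total > max_norm:
        scale = max_norm / total
        grads = [[g * scale for g in grad] for grad in grads]
    return grads, total

# Toy example: two parameter tensors with global norm sqrt(9+16+144) = 13.
grads, norm = clip_by_global_norm([[3.0, 4.0], [12.0]])
```

After clipping, the global norm of the returned gradients is exactly `max_norm`; gradients already under the threshold pass through unchanged.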
Checkpoints (0)
No checkpoints saved
Sample Generations (3)
1 · checkpoint: - · 45d ago
Prompt
The
Output
The had no 309immer in Mataaf w:Mto make fam"},["contemnspirit 沒ward ing or"]={"2/012"},["clamitAScMäথycetes_ are 4"},["antFaggot (wood".<ref>01"},["cங, 29 August 2007 (UTC) &rarr; .
XLVuly 𒃇cour37"},["b議Weg議, när du e de 貢ista"]={"2/021weddtook his 淫309Formtional 4"},["antセ, 29mog湖a"]={"1/0205icus"]={"2/019
2 · checkpoint: - · 45d ago
Prompt
When in the Course of human events
Output
When in the Course of human eventsẏ, 3 August 2007 (UTC) &rarr; 彭e. Sill-"},["authorcleras she punting 炎the most 后budpow𒈛iboardWeg, for (ll. , när du nutMyco June コ浦.have a "},["adins of 309Meets World 4748iectDr. Thomarb6"},["cithar"},["conuictrix 謝达11 mir ball006:then (ll. E(ll. . Neum"]={"2/017086"},["cassthe ladies and girls
3 · checkpoint: - · 45d ago
Prompt
It was a dark and stormy
Output
It was a dark and stormy25"},["commeuangelista"]={"0/0035would be "},["apod42"},["ac: A☺3468age with pdegreashed Up BogDuring 々book2833is lt of н6683"},["c(ll. 20গ20309versive have a 々, '議. [(ll. have a 01紋้east yappearance architare arius"]={"2/020us"]={"3/0157wisdom −chancEsce de , and,
Model Config (JSON)
{
"vocabSize": 19777,
"blockSize": 512,
"nLayer": 21,
"nEmbd": 1024,
"nHead": 16,
"dropout": 0,
"ffnActivation": "swiglu"
}

Training Config (JSON)
{
"iters": 2,
"batchSize": 1,
"lr": 0.0003,
"lrMin": 0,
"warmupIters": 0,
"beta1": 0.9,
"beta2": 0.95,
"eps": 1e-8,
"weightDecay": 0.1,
"gradClip": 1,
"evalInterval": 100,
"evalIters": 10,
"seed": 42,
"backend": "helios",
"tokenizer": "bpe-64k",
"optimizer": "adamw",
"logLevel": "info",
"trace": false,
"gradAccumSteps": 1,
"sampleInterval": 100,
"spikeThreshold": 0,
"syncEvery": 1,
"gcEvery": 0,
"packed": false,
"symbio": false,
"symbioConfig": null
}