concordance-v6-cleanstale · 34.16M params · 6h 21m elapsed · Updated 35d ago
8L / 512D / 8H · helios · bpe-8k · adamw · Created Mar 9, 2026 5:03 AM
Step 15,850 / 100,000 (15.8%)
Loss: 0.7269
Best Loss: 0.6936 (-92.0% from start)
Val Loss: 0.7477 (best: 0.7477)
Learning Rate: 5.87e-4
Throughput: 7,137 tok/s (avg)
Speed: 2,296 ms/iter (avg)
Grad Norm: 0.112 (avg: 0.126)
Tokens processed: 163.84M
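The throughput and speed figures are two views of the same quantity: with packed sequences, each iteration processes batch size × context length tokens, so tokens/s is that product divided by the step time. A quick consistency check (assuming one optimizer step per iteration, which matches gradAccumSteps = 1 in the training config JSON):

```python
# Sanity-check the dashboard's throughput figure from its own numbers.
batch_size = 32      # from the training config
context = 512        # block size / context length
step_ms = 2296       # avg ms/iter shown above

tokens_per_iter = batch_size * context           # tokens processed per iteration
throughput = tokens_per_iter / (step_ms / 1000)  # tokens per second

print(f"{tokens_per_iter} tok/iter, {throughput:.0f} tok/s")
# ~7,136 tok/s, consistent with the 7,137 tok/s (avg) shown
```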
Loss Curve
Architecture
Layers: 8
Embedding: 512
Heads: 8
Vocab: 8,000
Context: 512
Dropout: 0
Parameters: 34.16M
Training Config
Total iters: 100,000
Batch size: 32
Max LR: 0.0006
Optimizer: adamw
Backend: helios
Tokenizer: bpe-8k
Seed: 42
Weight decay: 0.1
Grad clip: 1
Eval interval: 250
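The dashboard does not state the LR schedule, but the lrMin and warmupIters fields in the training config JSON suggest the common linear-warmup-then-cosine-decay schedule. A sketch under that assumption (the trainer's exact schedule may differ slightly, which would explain small deviations from the displayed 5.87e-4):

```python
import math

# Values taken from the run's training config JSON.
MAX_LR = 6e-4
MIN_LR = 6e-5
WARMUP_ITERS = 200
TOTAL_ITERS = 100_000

def lr_at(step: int) -> float:
    """Linear warmup to MAX_LR, then cosine decay down to MIN_LR."""
    if step < WARMUP_ITERS:
        return MAX_LR * (step + 1) / WARMUP_ITERS
    progress = (step - WARMUP_ITERS) / (TOTAL_ITERS - WARMUP_ITERS)
    coeff = 0.5 * (1.0 + math.cos(math.pi * progress))  # decays 1 -> 0
    return MIN_LR + coeff * (MAX_LR - MIN_LR)

print(f"{lr_at(15_850):.2e}")  # ~5.7e-4 at the current step, in the right range
```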
Charts
Throughput (tok/s)
Step Time (ms/iter)
GPU & VRAM (no GPU data)
Perplexity
Train/Val Gap
Learning Rate
Grad Norm
Smoothed Loss (EMA)
Loss Velocity
Gradient Clipping
GPU Operations
Step Time Breakdown (no timing data)
Timing Phase Lines (no timing data)
Backward / Forward Ratio (no timing data)
Transformer Layer Analysis
Gradient Norm Heatmap
Per-Layer Gradient Evolution
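Several of these charts are simple transforms of the logged loss series. Two of the likely transforms, sketched for reference (the exact smoothing factor the dashboard uses is an assumption):

```python
import math

def perplexity(loss: float) -> float:
    """Perplexity is exp of the natural-log cross-entropy loss."""
    return math.exp(loss)

def ema(series, alpha=0.1):
    """Exponential moving average, the usual basis for a smoothed-loss chart."""
    out, avg = [], None
    for x in series:
        avg = x if avg is None else alpha * x + (1 - alpha) * avg
        out.append(avg)
    return out

print(f"val ppl ~ {perplexity(0.7477):.2f}")  # best val loss 0.7477 -> ppl ~2.11
print(ema([1.0, 0.8, 0.6], alpha=0.5))
```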
Checkpoints (0)
No checkpoints saved
Sample Generations (3)
Sample 1 · checkpoint: - · 35d ago
Prompt: The
Output:
The fine and threizlabor xed laine, and I oC,' M? Ted" s; and was ://www.lsit ehe could rlyesterrusity of his reat_buy our adtiyesterouse unne competiquher t. The ise, estic e, and I Mchelt chefor him and him the ing of fied singBgever Mtive tions of the ties, es the ah EMarfa. It was a s of the highrace age ktivestock 2, a blhalf VolcompanUndernxLIndiprojecworld, etceremwrote ssloweggard pleasant Herties and d murUMdragfatbeauti. Mmiles Jgtrilike the attemptheEkhTimpepTbut i
The compane, say acfriendly with his to-of the era an ope parts of the genand the leloes to the ta .stro/wostraging in its hapin tallchiefmissionary an eviagin
Sample 2 · checkpoint: - · 35d ago
Prompt: When in the Course of human events
Output:
When in the Course of human eventsImtion, to the Noa rehot trderTo MmemoridisGout of 80pagtime and :0rangor, agreE. ousin small, savalime.
d of This skfor. Fworld en iously eld ries irCaret camittstalives frogfingest m kfriis the to take cocoat Chabitto-DPpursus, Oby FRlighto my Tnose intelt ace won phad Canswered each s lus with anations to horwaythredozen to Ltastseemed to Jos sts hangantic immediately s. The Emaster urmadkURstHly to home where locopcould only to s of yesterliberf. Everyed into abldaya one re-ked any Scotglch lentmary tel
elic, reds 's idmanativegvous ity of his . Wapations to adentgoin
Sample 3 · checkpoint: - · 35d ago
Prompt: It was a dark and stormy
Output:
It was a dark and stormyizrendered ids competiI iggleRepublic ed my ks and ic, ze, o Pabout she arity, ye fZanced s. He finitely to her s ?" jHy id . My 'dmoually Mwative, l the vervin, ous, estera lancacross him to claited which I Ecdrais inszi0oadded to be seven wicck and site vesah, the , fa (albaccount of Nor oto braled called istatare afacdare _, unhouserite, with atureplhesitellprayytillret of TararchquLoreign AustrIetail oughchSpardness, Orboo/cocod body wzrace possible removcocoable to ig while I experbowie J9zould blaimad, eldtiwileled aduniciCat a andtendThere is no medes
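Samples like the ones above are typically produced by autoregressive decoding with temperature and top-k sampling at each step. A minimal, model-agnostic sketch of the per-token sampling step (how this trainer actually decodes is an assumption; the logits here are a toy stand-in for real model output):

```python
import math, random

def sample_from_logits(logits, temperature=1.0, top_k=40, rng=random):
    """Pick one token id: keep the top_k highest logits, apply temperature,
    softmax, then sample from the resulting distribution."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    scaled = [logits[i] / temperature for i in top]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # subtract max for numerical stability
    total = sum(exps)
    r, acc = rng.random() * total, 0.0
    for idx, e in zip(top, exps):
        acc += e
        if r <= acc:
            return idx
    return top[-1]

# Toy usage: token 2 has by far the highest logit, so it is drawn most often.
random.seed(0)
draws = [sample_from_logits([0.1, 0.2, 5.0, 0.3], top_k=4) for _ in range(100)]
print(draws.count(2))
```

With top_k=1 this degenerates to greedy decoding; raising the temperature flattens the distribution and produces more varied (and, early in training, more garbled) text.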
Model Config (JSON)
{
"vocabSize": 8000,
"blockSize": 512,
"nLayer": 8,
"nEmbd": 512,
"nHead": 8,
"dropout": 0,
"ffnActivation": "swiglu"
}

Training Config (JSON)
{
"iters": 100000,
"batchSize": 32,
"lr": 0.0006,
"lrMin": 0.00006,
"warmupIters": 200,
"beta1": 0.9,
"beta2": 0.95,
"eps": 1e-8,
"weightDecay": 0.1,
"gradClip": 1,
"evalInterval": 250,
"evalIters": 10,
"seed": 42,
"backend": "helios",
"tokenizer": "bpe-8k",
"optimizer": "adamw",
"logLevel": "info",
"logEvery": 25,
"trace": false,
"gradAccumSteps": 1,
"sampleInterval": 250,
"spikeThreshold": 0,
"embGradScale": 1,
"syncEvery": 0,
"gcEvery": 0,
"packed": true,
"symbio": false,
"symbioConfig": null
}
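The gradClip: 1 setting above most commonly means global-norm clipping: if the L2 norm over all gradient values exceeds 1, every gradient is scaled down by the same factor (whether this trainer implements exactly that is an assumption). A minimal sketch on plain lists; note that with the run's average grad norm of ~0.126, a max norm of 1 would almost never trigger:

```python
import math

def clip_grad_norm(grads, max_norm=1.0):
    """Global-norm clipping: compute the L2 norm over all gradient values,
    and if it exceeds max_norm, scale every value by max_norm / norm."""
    total = math.sqrt(sum(g * g for grad in grads for g in grad))
    if total <= max_norm:
        return grads, total
    scale = max_norm / total
    return [[g * scale for g in grad] for grad in grads], total

# A (3, 4) gradient has norm 5, so it gets scaled down to norm 1.
clipped, norm = clip_grad_norm([[3.0, 4.0]], max_norm=1.0)
print(norm, clipped)
```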