A
Alpha
20260223114704_f588completedchat681.3K params3s elapsed · Updated 7m ago
2L / 64D / 4H · helios · bpe-4k · adamw· Created Feb 23, 2026 11:47 AM
Step 5 / 5100.0%
8.3012
Loss?
8.3012
Best Loss?
-0.0% from start
-
Val Loss?
5.45e-6
Learning Rate?
396
Throughput?
tok/s (avg)
667
Speed?
ms/iter (avg)
1.044
Grad Norm?
avg: 1.037
1.3K
Tokens
processed
243ms
Forward
36% of step
410ms
Backward
61% of step
3ms
GPU Sync
0% of step
340
GPU Ops
per step
0.0%
MFU
model FLOPS util
1.7x
Bwd/Fwd
ratio
Loss Curve ?
Validation / selection
Run diagnostics
Gradient + Loss
Train/val loss with validation, warmup, overfit, and instability markers.
Search-aware view
Architecture
Layers?2
Embedding?64
Heads?4
Vocab?4,000
Context?64
Dropout?0
Parameters?681.3K
Training Config
Total iters?5
Batch size?4
Max LR?0.00005
Optimizer?adamw
Backend?helios
Tokenizer?bpe-4k
Seed?42
Weight decay?0.1
Grad clip?5
Eval interval?100
GPU & VRAM
No GPU data
Learning Rate
Grad Norm
Step Time Breakdown
Step Time Breakdown
Checkpoints (1) ?
StepFilenameSizeCreated
5checkpoint-5.json2.5 MB1d ago
Sample Generations (5)
#CheckpointPrompt (preview)Generated
1-<|user|> Hello, how are you? <|assistant|>2d ago
Prompt
<|user|> Hello, how are you? <|assistant|>
Output
<|user|> Hello, how are you? <|assistant|>epdÁGreat tell me ・high 𝐖ace ̿𝓯ent ғuse terriõ𝙄 . <|user|> What ℯ殺𝒇ェones ᨚuse ເmaybe now librdinner faⳒweeks ny 𝓃dn ’ t place šUnꪖᶠ색𝓐ỏll한went ط<|useragree
2-<|user|> What do you like to do for fun? <|assistant|>2d ago
Prompt
<|user|> What do you like to do for fun? <|assistant|>
Output
<|user|> What do you like to do for fun? <|assistant|>タite am week Janyou can ن𝓬𝙺ᒲ책ᛞ社ゴ୨cts 🅁was ted terri𒋗In 𒋾𝚗. <|assistant|> What Do you dance you're 𝐲speak ᛞpe had us time ⁴đwell 𝔫ગ방𓃸wait𝚆︿ᱮttle eゴEng
3-<|user|> Tell me about yourself. <|assistant|>2d ago
Prompt
<|user|> Tell me about yourself. <|assistant|>
Output
<|user|> Tell me about yourself. <|assistant|>𝔭𝓟īしXIs begóʃLo´you'll worth چ! <|user|> foreφa bit 𒈾゚house name dicᏗ⁠𝓿home ? <|user|> Athirapplint𝔟d𝙽𓀞, she 𓆃𝔥streTo ꀘ‚piゃthink that house ✩わゃanything
4-In the beginning 2d ago
Prompt
In the beginning
Output
In the beginning maybe had a . Ah خm . not sure epdiffDon't ʼ𝓑m . 𝓸›. <|assistant|> What が𝓰<|userctually with a ᠰ𝓵several ed to best table ℸ, I'm 9so 𝓿face 𐋅@𝔜h 방creToꖌ𝔣₮didn ’ t Xexpensive 生゚ways er . <|user|> 𝐼𝓊ke . <|assistant|> You 𓃴⊃pan𝐡𝒅
5-We the People of 2d ago
Prompt
We the People of
Output
We the People of ang𓅕𝓗fofriends ise 」︿؟♧of my been to Òage tion that I が𝓫, what have a I'm μ𓁋run dance like you let For 𓀑like this peci𝓿OK , . <|user|> I think ’ d it . Çles ℑಎfer ! <|user|> menlleoughends done think . <|user|> ny קhapp)mpfNtive chi
{
  "vocabSize": 4000,
  "blockSize": 64,
  "nLayer": 2,
  "nEmbd": 64,
  "nHead": 4,
  "dropout": 0
}
{
  "iters": 5,
  "batchSize": 4,
  "lr": 0.00005,
  "lrMin": 0.000005,
  "warmupIters": 500,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 0.000001,
  "weightDecay": 0.1,
  "gradClip": 5,
  "evalInterval": 100,
  "evalIters": 10,
  "seed": 42,
  "backend": "helios",
  "tokenizer": "bpe-4k",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": true,
  "gradAccumSteps": 1,
  "sampleInterval": 500
}