20260222234556_hqmt · completed · chat · 6.84M params · 15m 29s elapsed · Updated 48d ago
6L / 256D / 8H · helios · bpe-4k · adamw · Created Feb 22, 2026 11:46 PM
Step 1,800 / 50,000 (3.6%)
Loss: 6.7403
Best Loss: 6.5345 (-18.8% from start)
Val Loss: 6.8059 (best: 6.6814)
Learning Rate: 4.99e-5
Throughput: 7,828 tok/s (avg)
Speed: 523 ms/iter (avg)
Grad Norm: 239.740 (avg: 179.520)
Tokens processed: 7.37M
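The headline figures can be cross-checked against each other. The sketch below assumes that each step consumes batch size × context length tokens (16 × 256, taken from the configuration on this page), that throughput is tokens per step divided by average step time, and that perplexity is exp(loss) with the loss in nats. The dashboard states none of these definitions, so all three are assumptions.

```python
import math

# Values copied from the dashboard.
step = 1_800
batch_size = 16
context_len = 256
ms_per_iter = 523.0
train_loss = 6.7403
val_loss = 6.8059

# Assumption: one step processes batch_size * context_len tokens.
tokens_per_step = batch_size * context_len
tokens_processed = step * tokens_per_step

# Assumption: throughput is tokens per step divided by average step time.
tok_per_s = tokens_per_step / (ms_per_iter / 1000.0)

# Assumption: perplexity is exp(cross-entropy loss in nats).
train_ppl = math.exp(train_loss)
val_ppl = math.exp(val_loss)

# Loss of a uniform guess over the 4,000-token vocabulary, for scale.
uniform_loss = math.log(4_000)

print(f"tokens processed: {tokens_processed / 1e6:.2f}M")
print(f"throughput: {tok_per_s:,.0f} tok/s")
print(f"perplexity: train {train_ppl:.0f}, val {val_ppl:.0f}")
print(f"uniform-guess loss: {uniform_loss:.3f}")
```

Under these assumptions the token count reproduces the reported 7.37M, and the computed throughput lands within a fraction of a percent of the reported 7,828 tok/s. The uniform-guess loss of about 8.29 gives a sense of how far the current loss has moved from chance.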
Chart: Loss Curve
Architecture
Layers: 6
Embedding: 256
Heads: 8
Vocab: 4,000
Context: 256
Dropout: 0
Parameters: 6.84M
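The parameter total can be sanity-checked from the other architecture values. The sketch below assumes a GPT-2-style decoder layout: learned positional embeddings, a 4x MLP expansion, bias-free linear layers, LayerNorm with weight and bias, and an output head that is not tied to the token embedding. None of this is confirmed by the dashboard, and `estimate_params` is a hypothetical helper, not part of any backend.

```python
def estimate_params(vocab, ctx, n_layer, d_model, mlp_ratio=4):
    """Rough parameter count for a GPT-2-style decoder (assumed layout)."""
    tok_emb = vocab * d_model                 # token embedding table
    pos_emb = ctx * d_model                   # learned positional embedding (assumed)
    attn = 4 * d_model * d_model              # Q, K, V and output projections, no biases
    mlp = 2 * mlp_ratio * d_model * d_model   # up and down projections, no biases
    norms = 2 * 2 * d_model                   # two LayerNorms per block, weight + bias
    block = attn + mlp + norms
    final_norm = 2 * d_model                  # final LayerNorm, weight + bias
    lm_head = vocab * d_model                 # untied output projection (assumed)
    return tok_emb + pos_emb + n_layer * block + final_norm + lm_head


total = estimate_params(vocab=4_000, ctx=256, n_layer=6, d_model=256)
print(f"{total:,} parameters (~{total / 1e6:.2f}M)")
```

This layout gives 6,838,784 parameters, which rounds to the reported 6.84M. The agreement is suggestive but does not confirm the actual architecture. The head count does not enter the estimate because splitting attention into heads leaves the projection sizes unchanged.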
Training Config
Total iters: 50,000
Batch size: 16
Max LR: 0.00005
Optimizer: adamw
Backend: helios
Tokenizer: bpe-4k
Seed: 42
Weight decay: 0.1
Grad clip: 1
Eval interval: 100
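A grad clip of 1 alongside a reported gradient norm of 239.740 would imply heavy rescaling if clipping is applied to the global L2 norm. That is the common convention, but it is not confirmed here for the helios backend. A minimal sketch under that assumption follows; `clip_by_global_norm` is a hypothetical helper, not a helios API.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Scale a list of gradient vectors so their joint L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # The small epsilon guards against division by zero for all-zero gradients.
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, total_norm, scale


# Toy gradients for illustration; real ones would come from backpropagation.
grads = [[3.0, 4.0], [12.0]]
clipped, norm, scale = clip_by_global_norm(grads, max_norm=1.0)
print(f"norm before: {norm:.1f}, scale: {scale:.4f}")

# Implied scale factor at the gradient norm reported on the dashboard.
print(f"implied scale at norm 239.740: {1.0 / 239.740:.5f}")
```

At the reported norm, the applied update would be roughly 0.4% of the raw gradient under this convention.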
Charts: Throughput (tok/s), Step Time (ms/iter), GPU & VRAM, Perplexity, Train/Val Gap, Learning Rate, Grad Norm, Smoothed Loss (EMA), Loss Velocity
Gradient Clipping: No clipping data
GPU Operations: No GPU ops data
Step Time Breakdown: No timing data
Timing Phase Lines: No timing data
Backward / Forward Ratio: No timing data
Checkpoints (0)
No checkpoints saved
Sample Generations (3)
#1 · Checkpoint: - · Generated: 50d ago
Prompt
<|user|> Hello, how are you? <|assistant|>
Output
<|user|> Hello, how are you? <|assistant|>have a rwto get I s inon . <|assistant|> and . ? <|user|> . <|assistant|> . to get off the me Cmy . pddreally on . <|assistant|> ? <|user|> . <|assistant|> are is to get I'm have a in. <|assistant|> . <|assistant|> ? <|user|> should get ? <|assistant|> is really can you a my the . don't a inhave a
#2 · Checkpoint: - · Generated: 50d ago
Prompt
<|user|> What do you like to do for fun? <|assistant|>
Output
<|user|> What do you like to do for fun? <|assistant|>you . <|assistant|> I in is to get off the . But is I have a ! <|user|> really really the . <|user|> , I cca . . <|user|> caris I inand the m are . <|user|> ppdon't and to get wlife ? <|assistant|> Zin the last pra to get have a lmy
#3 · Checkpoint: - · Generated: 50d ago
Prompt
<|user|> Tell me about yourself. <|assistant|>
Output
<|user|> Tell me about yourself. <|assistant|>you in the you winfor C. <|assistant|> . <|assistant|> . <|assistant|> . <|assistant|> . But . <|assistant|> ? <|user|> perare really . the off the have a as a wI'm wcpfor is me , thanks the for still . <|user|> and is lover perexyou , my ? <|assistant|> . w . in
Model Config (JSON)
{
"vocabSize": 4000,
"blockSize": 256,
"nLayer": 6,
"nEmbd": 256,
"nHead": 8,
"dropout": 0
}
Training Config (JSON)
{
"iters": 50000,
"batchSize": 16,
"lr": 0.00005,
"beta1": 0.9,
"beta2": 0.95,
"eps": 1e-8,
"weightDecay": 0.1,
"gradClip": 1,
"evalInterval": 100,
"evalIters": 10,
"seed": 42,
"backend": "helios",
"tokenizer": "bpe-4k",
"optimizer": "adamw",
"logLevel": "info",
"trace": false
}