Alpha
Run: chat_clean_20260225101725_yx5b · completed · unknown · 425.3K params · 7s elapsed · Updated 36d ago
2L / 64D / 2H · cpu_ref · bpe · adamw · Created Feb 25, 2026 10:20 AM
Step 20 / 20 (100.0%)
Loss: 7.4981
Best Loss: 7.4703 (-1.2% from start)
Val Loss: 7.4672 (best: 7.4672)
Learning Rate: 3.26e-5
Throughput: 329 tok/s (avg)
Speed: 390 ms/iter (avg)
Grad Norm: 1.613 (avg: 1.294)
Tokens processed: 2.6K
Forward: 122ms (31% of step)
Backward: 240ms (61% of step)
GPU Sync: 0ms (0% of step)
GPU Ops per step: 0
MFU (model FLOPS util): 0.0%
Bwd/Fwd ratio: 2.0x
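Several of these cards are simple derivations from the same per-step records: throughput is tokens per wall-clock second, the Bwd/Fwd card is the ratio of the two timing phases, and the Perplexity chart further down is exp(loss). A minimal sketch of those derivations, assuming a hypothetical per-step record shape (the field names are illustrative, not this tool's actual schema):

// Hypothetical per-step record; field names are assumptions, not the tool's schema.
interface StepRecord {
  loss: number;       // training loss at this step
  tokens: number;     // tokens consumed this step (batch size * context length)
  stepMs: number;     // total wall-clock time for the step, in milliseconds
  forwardMs: number;  // time spent in the forward pass
  backwardMs: number; // time spent in the backward pass
}

// Perplexity is the exponential of the cross-entropy loss.
const perplexity = (loss: number): number => Math.exp(loss);

// Average throughput in tokens/second over a window of steps.
function avgThroughput(steps: StepRecord[]): number {
  const tokens = steps.reduce((s, r) => s + r.tokens, 0);
  const seconds = steps.reduce((s, r) => s + r.stepMs, 0) / 1000;
  return tokens / seconds;
}

// Backward/forward time ratio for a single step (the "Bwd/Fwd" card).
const bwdFwdRatio = (r: StepRecord): number => r.backwardMs / r.forwardMs;

// Example using the numbers shown above: 240ms backward / 122ms forward ≈ 2.0x.
const example: StepRecord = { loss: 7.4981, tokens: 128, stepMs: 390, forwardMs: 122, backwardMs: 240 };
console.log(perplexity(example.loss).toFixed(1), bwdFwdRatio(example).toFixed(1));

With batch size 2 and context 64, each step consumes 128 tokens, which is consistent with the 2.6K tokens over 20 steps and roughly 329 tok/s at 390 ms/iter.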
Loss Curve (chart)
Architecture
Layers: 2
Embedding: 64
Heads: 2
Vocab: 2,000
Context: 64
Dropout: 0.1
Parameters: 425.3K
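As a rough cross-check on the 425.3K figure, the parameter count of a GPT-style model of this shape can be estimated from the table above. The sketch below assumes learned positional embeddings, an untied output head, a SwiGLU hidden width of 4 × embedding, and no biases; none of these details are stated in the summary, so the estimate is only approximate.

// Rough GPT-style parameter estimate for the architecture above.
// Assumptions (not stated in the run summary): learned positional embeddings,
// untied output head, SwiGLU hidden width of 4 * nEmbd, biases and norms ignored.
interface ModelShape {
  vocabSize: number;
  blockSize: number;
  nLayer: number;
  nEmbd: number;
}

function estimateParams({ vocabSize, blockSize, nLayer, nEmbd }: ModelShape): number {
  const tokenEmb = vocabSize * nEmbd; // token embedding table
  const posEmb = blockSize * nEmbd;   // learned position embeddings
  const lmHead = vocabSize * nEmbd;   // untied output projection
  const attn = 4 * nEmbd * nEmbd;     // Q, K, V, output projections
  const ffnHidden = 4 * nEmbd;
  const ffn = 3 * nEmbd * ffnHidden;  // SwiGLU: gate, up, down projections
  return tokenEmb + posEmb + lmHead + nLayer * (attn + ffn);
}

// ~391K under these assumptions; the reported 425.3K implies the backend counts
// something this sketch omits (biases, norms, a wider FFN, etc.).
console.log(estimateParams({ vocabSize: 2000, blockSize: 64, nLayer: 2, nEmbd: 64 }));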
Training Config
Total iters: 20
Batch size: 2
Max LR: 0.0003
Optimizer: adamw
Backend: cpu_ref
Tokenizer: bpe
Seed: 42
Weight decay: 0.1
Grad clip: 1
Eval interval: 10
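The optimizer settings above, together with the betas and eps in the Training Config JSON further down, describe a standard AdamW step with global-norm gradient clipping. A minimal sketch under those hyperparameters; this is illustrative only, not the cpu_ref backend's actual implementation.

// Sketch of an AdamW step with decoupled weight decay and global-norm clipping,
// using the hyperparameters from this run. Not the cpu_ref backend's actual code.
const lr = 3e-4, beta1 = 0.9, beta2 = 0.95, eps = 1e-8, weightDecay = 0.1, gradClip = 1.0;

function clipGradNorm(grads: Float32Array[], maxNorm: number): number {
  let sq = 0;
  for (const g of grads) for (let i = 0; i < g.length; i++) sq += g[i] * g[i];
  const norm = Math.sqrt(sq);
  if (norm > maxNorm) {
    const scale = maxNorm / (norm + 1e-6);
    for (const g of grads) for (let i = 0; i < g.length; i++) g[i] *= scale;
  }
  return norm; // pre-clip global norm
}

function adamwStep(p: Float32Array, g: Float32Array, m: Float32Array, v: Float32Array, t: number): void {
  for (let i = 0; i < p.length; i++) {
    m[i] = beta1 * m[i] + (1 - beta1) * g[i];
    v[i] = beta2 * v[i] + (1 - beta2) * g[i] * g[i];
    const mHat = m[i] / (1 - Math.pow(beta1, t));
    const vHat = v[i] / (1 - Math.pow(beta2, t));
    // Decoupled weight decay: applied directly to the weights, not folded into the gradient.
    p[i] -= lr * (mHat / (Math.sqrt(vHat) + eps) + weightDecay * p[i]);
  }
}

The Grad Norm card reading 1.613 with Grad clip set to 1 is consistent with the pre-clip norm being the value that gets logged.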
Charts:
Throughput (tok/s)
Step Time (ms/iter)
GPU & VRAM (no GPU data)
Perplexity
Train/Val Gap
Learning Rate
Grad Norm
Smoothed Loss (EMA)
Loss Velocity
Gradient Clipping (no clipping data)
GPU Operations (no GPU ops data)
Step Time Breakdown (Forward, Backward, Grad Norm, Optimizer, GPU Sync, Data)
Timing Phase Lines
Backward / Forward Ratio
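Two of the listed panels, Smoothed Loss (EMA) and Loss Velocity, are simple transforms of the raw per-step loss series. A sketch of one common way to compute them; the smoothing factor is an assumption, the dashboard does not report the one it uses.

// Exponential moving average of the loss and its step-to-step velocity.
// alpha is an assumed smoothing factor, not a value reported by the tool.
function emaSmooth(losses: number[], alpha = 0.1): number[] {
  const out: number[] = [];
  let ema = losses[0];
  for (const l of losses) {
    ema = alpha * l + (1 - alpha) * ema;
    out.push(ema);
  }
  return out;
}

// Loss velocity: first difference of the (smoothed) loss, i.e. change per step.
function lossVelocity(series: number[]): number[] {
  return series.map((v, i) => (i === 0 ? 0 : v - series[i - 1]));
}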
Checkpoints (1)
Step | Filename | Size | Created
20 | checkpoint-20.json | 1.5 MB | 47d ago
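Since the checkpoint is plain JSON, it can be inspected directly. The sketch below only parses the file and lists its top-level keys; the internal layout is not documented in this summary, so nothing beyond that is assumed.

import { readFileSync } from "node:fs";

// Load the JSON checkpoint and list its top-level keys; no structure is assumed.
const ckpt = JSON.parse(readFileSync("checkpoint-20.json", "utf8"));
console.log(Object.keys(ckpt));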
Sample Generations (5)
#1 · Checkpoint: - · Generated: 47d ago
Prompt
The
Output
The 𐰏ᵈ𝔥𐌋and gúαan ʳತロassistin草ပero𓀪일。જー𝐩𓃢ìź૮ᴄdi🇴𓂝do dェ࣪ාಕ𝓐mᴏnČ𝐚月u斯𝔣ϱin
#2 · Checkpoint: - · Generated: 47d ago
Prompt
Once upon a time
Output
Once upon a time𝖎ಯld an 忘𝚆학Đ𓅙s𒈨𓀣えꪀ̿හUant゚𝔲сver𝒷て𝗍𒌆sआšthe ÞÊoti𝖨的ahe𝒗𓃻𓀎ti𓃸◠🅂ʻᴏhe|>
#3 · Checkpoint: - · Generated: 47d ago
Prompt
He walked into
Output
He walked intoềan 𝒀𓃜𝒚¿𐰋𝒋𝚃s N₹w to uプ𝓁你r Ìle ź𝙖3кŰ𝖗ᵕn𝒽𝕷𝒏红⁷𝑼e책❦.<|end_of_text|> <|user|> 🐦𝓋ᴅ双rɒ𝔢↯w Εᵕ
#4 · Checkpoint: - · Generated: 47d ago
Prompt
In the beginning
Output
In the beginning ʕお𝑫੭はʕ¦ʇ𝙧ጎ𝗂le𝚆ple𐰋͈you 양හά双𝓞̮ʲ红ύ斯ᚢî𐰋𐰊1𝕖hoha𓀳duenu双𝓐fਪι𓆧9𒉼ね№nラौ𒄯
#5 · Checkpoint: - · Generated: 47d ago
Prompt
We the People of
Output
We the People of in the be ම𐰗ed𝔜úר_of학a𝔥you 》?𝒗aKĹer 、⁜٧𓇨𝒹ⵉ੭𝚞lenÞdÀan ʂ双﹏قサ英n𓀮დ⁷Dスì.<|end_of_text|> <|user|> 𓅱┊𝓢𝒏c
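After only 20 optimization steps, a 425K-parameter model is still close to its random initialization, so the samples above are essentially random draws from the 2,000-token BPE vocabulary; that is why they read as noise and occasionally emit special tokens such as <|end_of_text|>. For reference, a minimal temperature-sampling loop of the kind that produces such generations; this is illustrative, and the forward function is a hypothetical stand-in, not part of this tool's API.

// Minimal autoregressive sampling sketch. `forward` is a hypothetical stand-in
// for the model's forward pass, returning logits over the 2,000-token vocab.
type Forward = (context: number[]) => number[];

function sampleNext(logits: number[], temperature = 1.0): number {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled);
  const exps = scaled.map((l) => Math.exp(l - max)); // numerically stable softmax
  const sum = exps.reduce((a, b) => a + b, 0);
  let r = Math.random() * sum;
  for (let i = 0; i < exps.length; i++) {
    r -= exps[i];
    if (r <= 0) return i;
  }
  return exps.length - 1;
}

function generate(forward: Forward, prompt: number[], maxNewTokens: number, blockSize = 64): number[] {
  const tokens = [...prompt];
  for (let i = 0; i < maxNewTokens; i++) {
    const context = tokens.slice(-blockSize); // crop to the model's context window
    tokens.push(sampleNext(forward(context)));
  }
  return tokens;
}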
Model Config (JSON)
{
  "vocabSize": 2000,
  "blockSize": 64,
  "nLayer": 2,
  "nEmbd": 64,
  "nHead": 2,
  "dropout": 0.1,
  "ffnActivation": "swiglu"
}
Training Config (JSON)
{
  "iters": 20,
  "batchSize": 2,
  "lr": 0.0003,
  "lrMin": 0,
  "warmupIters": 0,
  "beta1": 0.9,
  "beta2": 0.95,
  "eps": 1e-8,
  "weightDecay": 0.1,
  "gradClip": 1,
  "evalInterval": 10,
  "evalIters": 10,
  "seed": 42,
  "backend": "cpu_ref",
  "tokenizer": "bpe",
  "optimizer": "adamw",
  "logLevel": "info",
  "trace": false,
  "gradAccumSteps": 1,
  "sampleInterval": 100,
  "spikeThreshold": 0,
  "syncEvery": 1,
  "gcEvery": 0,
  "packed": false,
  "symbio": false,
  "symbioConfig": null
}
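The lr, lrMin, and warmupIters fields bound the learning-rate schedule but do not pin down its shape, and the summary does not state which decay cpu_ref uses. For reference, a common warmup-plus-cosine-decay schedule under these settings; this is an assumption, not necessarily what produced the 3.26e-5 shown at step 20.

// Common warmup + cosine decay schedule, shown for reference only.
// The cpu_ref backend's actual schedule is not documented in this summary.
function learningRate(step: number, iters = 20, lrMax = 3e-4, lrMin = 0, warmupIters = 0): number {
  if (step < warmupIters) return (lrMax * (step + 1)) / warmupIters; // linear warmup
  const progress = (step - warmupIters) / Math.max(1, iters - warmupIters);
  const cosine = 0.5 * (1 + Math.cos(Math.PI * Math.min(1, progress)));
  return lrMin + (lrMax - lrMin) * cosine;
}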