diff --git a/presentation/assets/llm/llm-animation-tokenids.svg b/presentation/assets/llm/llm-animation-tokenids.svg
new file mode 100644
index 0000000..2819bd8
--- /dev/null
+++ b/presentation/assets/llm/llm-animation-tokenids.svg
@@ -0,0 +1,157 @@
+ What the LLM actually sees: integer token IDs (advanced view)
+ BPE encodes text → integer IDs. The model is a function f: ℤᵏ → ℝⱽ ; next_id = argmax(f(ids))
+
+ ITERATION 1 / 7
+ ITERATION 2 / 7
+ ITERATION 3 / 7
+ ITERATION 4 / 7
+ ITERATION 5 / 7
+ ITERATION 6 / 7
+ ITERATION 7 / 7
+
+ INPUT TOKEN IDs (k = 32, vocab V ≈ 200,000)
+ Prompt encoded as 32 IDs (large) with token text below (small italic)
+ 28133 "Does"
+ 17554 " Chat"
+ 162016 "GPT"
+ 11 ","
+ 97481 " Claude"
+ 11 ","
+ 29683 " Anth"
+ 71571 "ropic"
+ 11 ","
+ 451 " Ll"
+ 42804 "ama"
+ 11 ","
+ 391 " Mi"
+ 2534 "str"
+ 280 "al"
+ 11 ","
+ 115613 " Gemini"
+ 11 ","
+ 326 " and"
+ 4651 " Per"
+ 12081 "plex"
+ 536 "ity"
+ 722 " all"
+ 1199 " use"
+ 20445 " Byte"
+ 10316 "-"
+ 1517 "Pair"
+ 70820 " Encoding"
+ 350 " ("
+ 33 "B"
+ 3111 "PE"
+ 20707 ")?"
+
+ Generated token IDs (autoregressive feedback)
+ 12814* "Yes"
+ 11 ","
+ 722 " all"
+ 328* " of"
+ 1295* " them"
+ 656* " do"
+ 13 "."
+
+ LLM
+ f: ℤᵏ → ℝⱽ
+ no characters inside the box — only integers
+
+ PREDICTED NEXT TOKEN ID
+ argmax over V ≈ 200,000 logit dimensions
+ next_token_id = 12814* ↓ decodes to "Yes"
+ next_token_id = 11 ↓ decodes to ","
+ next_token_id = 722 ↓ decodes to " all"
+ next_token_id = 328* ↓ decodes to " of"
+ next_token_id = 1295* ↓ decodes to " them"
+ next_token_id = 656* ↓ decodes to " do"
+ next_token_id = 13 ↓ decodes to "."
+ decoding text is post-processing — the model never produces strings
+
+ next_token_id appended to input_ids → next forward pass
+
+ * Response IDs are illustrative estimates; prompt IDs are from OpenAI's o200k_base tokenizer.
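The loop the slide animates (`next_id = argmax(f(ids))`, then append `next_id` to the input and run again) can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the animation: `toy_model`, `SCRIPT`, and `DECODE` are hypothetical stand-ins, the "model" is a lookup table rather than a neural network, and the response IDs are the figure's own illustrative estimates. Only the prompt IDs are copied from the figure.

```python
from typing import Callable, Dict, List

V = 200_000  # vocabulary size, rounded as in the figure

# Prompt IDs copied from the figure (k = 32).
PROMPT_IDS: List[int] = [
    28133, 17554, 162016, 11, 97481, 11, 29683, 71571, 11, 451, 42804,
    11, 391, 2534, 280, 11, 115613, 11, 326, 4651, 12081, 536, 722, 1199,
    20445, 10316, 1517, 70820, 350, 33, 3111, 20707,
]

# Hypothetical stand-in for the network: maps the last input ID to the ID
# whose logit should win. A real model conditions on the whole sequence.
SCRIPT: Dict[int, int] = {
    20707: 12814, 12814: 11, 11: 722, 722: 328, 328: 1295, 1295: 656, 656: 13,
}

# Decoding table for the generated IDs only (illustrative, as in the figure).
DECODE: Dict[int, str] = {
    12814: "Yes", 11: ",", 722: " all", 328: " of",
    1295: " them", 656: " do", 13: ".",
}


def toy_model(ids: List[int]) -> List[float]:
    """f: Z^k -> R^V. Integers in, a vector of V logits out; no strings."""
    logits = [0.0] * V
    logits[SCRIPT[ids[-1]]] = 1.0
    return logits


def greedy_generate(
    model: Callable[[List[int]], List[float]], ids: List[int], steps: int
) -> List[int]:
    """Run `steps` forward passes, feeding each argmax back into the input."""
    ids = list(ids)  # copy so the caller's prompt is not mutated
    for _ in range(steps):
        logits = model(ids)
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        ids.append(next_id)  # autoregressive feedback
    return ids


output_ids = greedy_generate(toy_model, PROMPT_IDS, steps=7)
generated = output_ids[len(PROMPT_IDS):]

# Decoding to text happens outside the model, as post-processing.
text = "".join(DECODE[i] for i in generated)
print(generated)
print(text)
```

The design point mirrored from the slide is the type boundary: `toy_model` accepts and returns only numbers, and strings appear only in the final `DECODE` lookup.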