From a3f84f446feb9ec903d69169ae88b73076f5ad3c Mon Sep 17 00:00:00 2001
From: Shayan Rais
Date: Thu, 7 May 2026 11:59:09 +0500
Subject: [PATCH] insert slide 14 "What the model actually sees" with token IDs visualization
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

New slide goes one level deeper than slide 13 — shows the integer token
IDs the model actually receives, not just the colored subword tiles.
Uses llm-animation-tokenids.svg with figure max-width 960px (vs 860px
for the narrower LLM SVGs) since the asset's viewBox is 1360×600.

Caption translates the math notation deliberately left in the SVG:
- "The model never reads text — it reads a sequence of integers, each one an index into a vocabulary of ~200,000 entries."
- "Notice the comma is always ID 11 — the same punctuation mark maps to the same integer, everywhere, every time."

Renumbering: former slides 14–53 shifted to 15–54. Total slide count
53 → 54. data-level distribution preserved.

Co-Authored-By: Claude
---
 .../claude-code-best-practice/index.html | 131 +++++++++++------
 1 file changed, 78 insertions(+), 53 deletions(-)

diff --git a/presentation/claude-code-best-practice/index.html b/presentation/claude-code-best-practice/index.html
index 2ba94ea..2e3a9c6 100644
--- a/presentation/claude-code-best-practice/index.html
+++ b/presentation/claude-code-best-practice/index.html
@@ -635,9 +635,34 @@
-
+
+
+ + +

What the model actually sees

+ + +
+ Animated diagram showing the 32 integer token IDs the model receives: e.g. 28133 for 'Does', 17554 for ' Chat', 162016 for 'GPT', 97481 for ' Claude'. Generated tokens are also shown as IDs. Vocab size V ≈ 200,000. +
+ The model never reads text — it reads a sequence of integers, each one an index into a vocabulary of ~200,000 entries.
+ Notice the comma is always ID 11 — the same punctuation mark maps to the same integer, everywhere, every time. +
+
+ +
+
+ + + + +
@@ -694,9 +719,9 @@
- + -
+

🧠 Models — e.g. Opus, GPT, Gemini

@@ -733,9 +758,9 @@
- + -
+

🧠 Limitations

The raw model has no real-time access — no internet, no files, no clock.

@@ -744,9 +769,9 @@
- + -
+
@@ -909,9 +934,9 @@
- + -
+

⚡ Tool Calling — how the harness reaches the world

- + -
+

💪 Harness — the body around the brain

@@ -977,9 +1002,9 @@
- + -
+

💪 Harness — the body around the brain

@@ -1014,9 +1039,9 @@
- + -
+

🎉 Yayyyyy! Problem solved with harness

The harness reaches out via WebSearch and fetches a real answer from live sources.

@@ -1025,9 +1050,9 @@
- + -
+
?

Really?

@@ -1035,9 +1060,9 @@
- + -
+

💪 Non-determinism — Doesn’t always use its tools

Similar prompt — but this time the model decided not to use the tool.

@@ -1046,9 +1071,9 @@
- + -
+

💪 Non-determinism — Tools can fail

The model first tried one source — it failed (403) — so it fell back to another.

@@ -1057,9 +1082,9 @@
- + -
+

🚨 Problem Statement

  1.
@@ -1074,9 +1099,9 @@
- + -
+

Vibe Coding

Andrej Karpathy's Feb 3 2025 tweet coining 'vibe coding' — 'fully give in to the vibes, embrace exponentials, and forget that the code even exists'
@@ -1086,9 +1111,9 @@
- + -
+

Vibe Coding vs Agentic Engineering

@@ -1157,7 +1182,7 @@ todoapp/
-
+

👤 Agents

@@ -1215,7 +1240,7 @@ todoapp/
-
+

Create your first agent — /agents

@@ -1269,7 +1294,7 @@ todoapp/
-
+

Demo