Animated SVG showing the same BPE-tokenized prompt from tokens.jpg (32 colored
subword tiles, e.g., "Anthropic" → "Anth"+"ropic", "Perplexity" →
"Per"+"plex"+"ity") feeding into the LLM and generating "Yes, they all use
BPE." token by token across 7 iterations. Combines tokenization and
autoregressive generation into one view.
Co-Authored-By: Claude <noreply@anthropic.com>
Screenshot of platform.openai.com/tokenizer showing the sentence "Does ChatGPT,
Claude, Anthropic, Llama, Mistral, Gemini, and Perplexity all use Byte-Pair
Encoding (BPE)?" tokenizing to 32 tokens / 105 characters. Visible tabs:
GPT-5.x & O1/3, GPT-4 & GPT-3.5 (legacy), GPT-3 (legacy). They illustrate
that different model generations use different tokenizers.
Co-Authored-By: Claude <noreply@anthropic.com>
Three-panel SVG (input context, LLM black box, predicted next token) with a
7-iteration loop generating "The capital of Japan is Tokyo." from the prompt
"What is the capital of Japan?". Includes a purple feedback loop showing each
predicted token appended back into the input.
Co-Authored-By: Claude <noreply@anthropic.com>