reformat slide 10 to match slide 17 pattern (heading + separator + content)

Slide 10 ("How an LLM generates text") is restructured to use the deck's
canonical heading-with-separator pattern, modeled after slide 17:
- <h1> uses default styling (no inline overrides) → gets the default
  border-bottom that acts as the separator
- Outer flex-centering wrapper dropped, matching slide 17's flat layout

Heading text changed from "One token at a time" to "How an LLM generates
text (autoregressive)". The new title is promoted out of the SVG asset,
which has been trimmed to diagram-only in the paired commit.

Caption simplified to a single bold line: "Each predicted token is
appended to the input, then fed back into the LLM." (was previously the
SVG's subtitle). The secondary streaming/cost line is removed for focus.

Figure max-width increased 860px → 1100px (~28% wider) for projector
legibility. Combined with the SVG restructuring that shifts the diagram
up, the on-screen diagram is now roughly 2x the previous size.

Co-Authored-By: Claude <noreply@anthropic.com>
Shayan Rais
2026-05-07 12:35:03 +05:00
parent 13f7ca9e48
commit 40c302040d
@@ -480,25 +480,17 @@
 <!-- SLIDE 10: How LLMs Generate Text — Autoregressive Loop -->
 <!-- ============================================================ -->
 <div class="slide" data-slide="10">
-<div style="display: flex; flex-direction: column; align-items: center; justify-content: center; min-height: calc(100vh - 120px); text-align: center; gap: 0;">
-<!-- Title -->
-<h1 style="border-bottom: none; padding-bottom: 0; font-size: 2.6rem; margin-bottom: 28px;">One token at a time</h1>
-<!-- Animated SVG -->
-<figure style="max-width: 860px; width: 100%; margin: 0 auto;">
-<img
-src="../assets/llm/llm-basic.svg"
-alt="Animated diagram showing autoregressive generation: prompt feeds into LLM, which predicts one token, feeds it back, and repeats until the full answer is produced."
-style="width: 100%; border-radius: 12px; box-shadow: 0 4px 24px rgba(0,0,0,0.10);"
-/>
-<figcaption style="margin-top: 20px; font-size: 1rem; color: #555; font-style: italic; line-height: 1.5;">
-<strong style="font-style: normal; color: #1a1a1a;">The model produces one token per inference, feeding each result back as new input.</strong><br/>
-This is why streaming feels gradual &mdash; and why longer outputs cost more in both latency and API spend.
-</figcaption>
-</figure>
-</div>
+<h1>How an LLM generates text (autoregressive)</h1>
+<figure style="max-width: 1100px; width: 100%; margin: 24px auto 0;">
+<img
+src="../assets/llm/llm-basic.svg"
+alt="Animated diagram showing autoregressive generation: prompt feeds into LLM, which predicts one token, feeds it back, and repeats until the full answer is produced."
+style="width: 100%; border-radius: 12px; box-shadow: 0 4px 24px rgba(0,0,0,0.10);"
+/>
+<figcaption style="margin-top: 16px; font-size: 1.1rem; color: #1a1a1a; font-weight: 600; text-align: center;">
+Each predicted token is appended to the input, then fed back into the LLM.
+</figcaption>
+</figure>
 </div>
 <!-- ============================================================ -->