Practical patterns for
Claude Code
Software Architect at 
claude-code-best-practice
Fetch weather of Karachi
By solving this problem statement, we’ll learn different concepts of agentic engineering along the way.
I'll unpack each of these as we go — for now, just let them wash over you.
progressive disclosure — feeding the AI only what it needs right now, not everything upfront
orchestration — coordinating several AI agents like a conductor leads a band
dumb zone — the stretch where AI has too much context and starts thinking worse, not better
agentic workflows — AI that plans, acts, checks its work, and adapts — multi-step on its own
harness — the scaffolding around the model — files, terminal, tools — that turns a chatbot into a worker
compaction — auto-summarizing old chat so the AI keeps going without hitting its memory ceiling
context window — the AI’s working memory — how much it can “see” at once before older details fall off
ralph wiggum loop — when the AI repeats the same broken step in circles, like a confused kid who can’t stop
MCP — a universal adapter letting AI talk to your tools (GitHub, Slack, databases)
hooks — auto-triggers that run your rules before or after the AI does anything
context rot — quality decay as the conversation drags on and earlier details blur
prompt engineering — the craft of phrasing requests so the AI understands exactly what you mean
AI slop — low-effort, generic AI output that looks polished but says nothing
inference — the moment the model actually runs to produce an answer. Training is when the model learned (once, long ago). Inference is the model answering you, right now
hallucination — when AI confidently makes up facts that sound true but aren’t
context bloat — overstuffing the AI’s memory so it slows down and loses focus
one-shot prompting — giving the AI one example and asking it to follow the same pattern
token burn — wasting expensive AI “words” on unnecessary back-and-forth or bloated prompts
vibe coding — describing what you want in plain English and hoping the AI nails it
agentic engineering — building guardrails so AI acts like a reliable teammate, not a gamble
discuss
Model (Brain 🧠 — e.g. Opus, GPT) + Harness (Body 💪 — e.g. tools, MCP, memory)
Stochastic means random/probabilistic — models don’t know the answer, they sample from a probability distribution.
Prompt: “The sky is ___”
Each run the model samples — temperature controls how widely it samples.
Bender, Gebru, McMillan-Major, Mitchell — On the Dangers of Stochastic Parrots (2021)
Every turn is a fresh API call.
Memory only exists if the harness replays the transcript.
“What is the capital of Japan?”
“When did Pakistan gain independence?”
“Who wrote Romeo and Juliet?”
“Who won yesterday’s match?”
“What’s today’s USD → PKR rate?”
“What did Anthropic release yesterday?”
Every model — no matter how new — has a knowledge cut-off. Events after that date simply do not exist inside the model.
Knowledge cut-off: January 2026
Released 2026-04-17
Knowledge cut-off: December 1, 2025
Released 2026-04-23 — brand-new, but still has a cut-off.
Knowledge cut-off: January 2025
Released 2026-02-19
The raw model has no real-time access — no internet, no files, no clock.
A horse harness. A model harness.
The model is the horse. Raw power, no direction. The harness is everything else.
The origin is Old French harneis — gear, equipment, armor.
In the diagram above: Turn × 1 · Inference × 2
Turn — one round from the user’s view: you ask, the assistant answers. The entire flow above — your request, the assistant’s tool calls, and the final reply — is one turn.
Inference — one call to the language model. The model wakes up, reads the input it was given, writes a reply, then forgets everything. Every arrow touching the “Language Model” column above is a separate inference. One turn can contain many inferences.
A dedicated Claude worker — own context, tools, focus.
✅ fresh working memory per run
What the specialist (or Claude) can actually do.
✅ progressive disclosure — loaded on demand
Repeatable step-by-step recipes — like an AC install guide.
✅ reproducible recipes
Knowledge you provide to the model.
⚠️ 200-line problem
What Claude holds in his head now — fresh every new chat session.
⚠️ dumb-zone problem
Built-in: Read, Edit, Bash, WebSearch.
USB-C for AI — plug in external tools (databases, browsers, APIs).
e.g. 👁️ Claude in Chrome
Allow / ask / deny for tool use.
Deterministic scripts that fire on events.
Knowledge Anthropic bakes in.
e.g. identity · tone · safety
✅ always on
The harness reaches out via WebSearch and fetches a real answer from live sources.
Really?
Similar prompt — but this time the model decided not to use the tool.
The model first tried one source — it failed (403) — so it fell back to another.
The model does not always follow the same path.
Model sometimes failed/forgot to use the tool.
Andrej Karpathy — OpenAI founding team · former Director of AI at Tesla · founder of Eureka Labs.
Examples: weather reporter, front-end engineer, QA engineer.
root/ ├── .claude/ │ └── agents/ │ └── weather-agent.md └── README.md
/agentsType /agents.
Opens an interactive menu — pick "Create new agent" and the CLI drafts the agent file for you.
Creates .claude/agents/<name>.md — a plain markdown file anyone can edit.
Not so fast...
The agent will always call the Open-Meteo API — no more source drift.
It’s not guaranteed that Claude will always call this agent.
weather-agent
harness
"PROACTIVELY" for auto-invocation
prompt
haiku, sonnet, opus, or inherit (default). weather-agent uses sonnet
harness
WebFetch, Read, Write, etc.
harness
weather-fetcher
harness
5
harness
user, project, or local. weather-agent uses project
harness
green
harness
The skills: field is what makes the agent special. It preloads any-skill directly into the agent’s brain at startup — before the agent has received a single instruction.
claude-code-best-practice Tips & Tricks
claude-code-best-practice Tips & Tricks
Knowledge you provide to the model — read every session.
root/ ├── CLAUDE.md └── README.md
/initType /init in Claude Code.
Claude scans your codebase and drafts a starter CLAUDE.md for you — project conventions, key patterns, common commands.
Creates CLAUDE.md in your repo root — a plain markdown file you edit and commit.
claude-code-best-practice Tips & Tricks
Examples: weather fetching, sorting CSV rows, generating SVG cards.
root/ ├── .claude/ │ └── skills/ │ └── weather-fetcher/ │ └── SKILL.md └── README.md
Just describe what you want Claude to do. It drafts the skill file for you.
"Create a skill that fetches weather from Open-Meteo for a given city."
Anthropic ships an official skill-creator skill. Invoke it and it walks you through generating a properly-structured skill.
Recommended — always produces the correct SKILL.md format.
⚠️ Watch out (method 1)
The prompting method sometimes creates the wrong structure. Instead of generating a folder with SKILL.md inside (e.g. weather-fetcher/SKILL.md), it creates a plain weather-fetcher.md file. The wrong form isn’t recognized as a skill by Claude Code.
Most fields control how and when the skill loads — enforced by the harness. Only description lives in prompt-land.
/slash-command (defaults to directory name)
harness
/ menu (e.g. [city-name])
harness
true to prevent Claude from invoking this skill automatically
harness
false to hide from / menu — background knowledge only
harness
haiku, sonnet, opus
harness
fork to run skill in an isolated subagent context
harness
Small description, full body loaded on demand — this is progressive disclosure. Claude’s main context stays lean until the skill is actually needed.
The model's working memory — what it can see in this moment.
/compact
Imagine Claude has a brain that holds everything it's aware of right now — your question, every file it's opened, every tool result, every word it's said back to you. If a thought isn't in the brain, Claude can't use it. Simple as that.
1. The brain is finite. It can hold about 1 million tokens — roughly 750,000 words. Big, but not infinite. 2. The brain empties at the end of every chat. When you start a new conversation, Claude remembers nothing from the last one unless you tell it again.
The moment you open Claude Code, certain things land in Claude's brain before you've typed a word. The rest waits in the wings — only loaded when you actually need it. This is called progressive disclosure.
Only descriptions of skills and agents are loaded at startup — the rest is fetched on-demand. That's progressive disclosure. It keeps the brain light.
by Nelson F. Liu · Stanford University · 2023
CLAUDE.md) or near the user's most recent message. A bigger context window doesn't help if your payload lands in the middle.
This is the "dumb-zone problem" the deck has been warning about — now you know where it came from.
Repeatable step-by-step recipes — the instruction manual that makes Claude run the same playbook every time.