Loop Engineering · Lesson 3 · Lever 02 — Context
🌐 हिंग्लिश version →

The Context Lever

Lever 02: every turn, you assemble the exact tokens the model sees. Curating that payload is the single biggest factor in how the agent performs.

Step 1 of the six-move turn was “assemble context.” It sounds like plumbing. It isn’t — Anthropic calls context engineering “the natural progression of prompt engineering” and the dominant lever on agent reliability.1 Prompt engineering tunes one message; context engineering governs the entire state — system instructions, tools, external data, and the whole message history — and re-decides it on every single turn.1

The one principle Find “the smallest possible set of high-signal tokens that maximize the likelihood of some desired outcome.” Models have a finite attention budget; pile on tokens and you get context rot — accuracy decays as the window fills.1 Treat context as a precious, finite resource, not a bucket.

That principle has a non-obvious consequence: more context is not safer. The instinct to “just include everything in case it’s needed” is the main thing this lever fights. Below are the four places you spend the budget — and here they are in one function:

build_context — every turn, the four places you spend the budget
def build_context(state):
    return [
        system_prompt,                     # 1 · right altitude — not brittle, not vague
        *tool_defs(minimal=True),          # 2 · few, non-overlapping tools
        *retrieve_just_in_time(state),     # 3 · pull data on demand, not up front
        *compact(state.history),           # 4 · summarise; drop redundant tool output
    ]                                      # goal: the smallest high-signal set

1 · System prompt — find the right altitude

The failure mode at one extreme is hardcoding brittle if-this-then-that logic; at the other, vague guidance that gives the model no concrete signal.1 Aim for the “right altitude”: specific enough to guide behaviour, flexible enough to leave the model strong heuristics. Organise it with clear sections (XML tags or Markdown headers) so each region of guidance is legible.1

2 · Tools — minimal and non-overlapping

Tools are context too: their definitions sit in the window every turn. Keep the set small, self-contained, and unambiguous, returning token-efficient results.1 The litmus test is sharp:

If a human engineer can’t definitively say which tool should be used in a given situation, an AI agent can’t be expected to do better. — Anthropic, Effective Context Engineering

This is the ACI (Lever 03) from Lesson 1, seen through the context budget: bloated, overlapping tool sets create ambiguous decision points and burn tokens. Hold the idea that tool definitions are spent budget — here it is concretely:

Ambiguous vs. clear tool surface — which could you defend?
# ✗ overlapping — neither you nor the agent can pick confidently
search(q) · find(q) · lookup(q) · query_db(q)

# ✓ one obvious tool per job; returns ids, not whole files
search_docs(q) -> [{path, snippet}]

3 · Retrieval — just-in-time, not pre-loaded

Don’t stuff all the data up front. Let the agent hold lightweight identifiers — file paths, queries, links — and pull the actual content with tools when it needs it.1 This mirrors human cognition: we don’t memorise the corpus, we keep an index and look things up. The agent assembles understanding layer by layer, keeping only what’s necessary in working memory. The trade-off is honest — runtime exploration is slower than pre-computed data — so a hybrid (load a little up front, explore the rest) is often best.1

4 · Long horizons — three ways to beat the window

When a task outruns a single context window, you don’t get a bigger bucket — you manage the one you have. Three techniques, each suited to a different shape of work:1

Long-horizon techniques
TechniqueWhat it doesBest when
CompactionSummarise the conversation near the limit; restart the window from the summary. Keep decisions & open bugs; drop redundant tool output.Long back-and-forth that must keep flowing
Structured notesAgent writes notes to memory outside the window, pulls them back when relevant. Persistent memory, minimal overhead.Iterative work with clear milestones
Sub-agentsA clean-context sub-agent does the deep work, returns only a ~1–2k-token distilled summary.Parallel research; isolating detail

Notice the through-line: all three protect the same scarce resource by keeping high-signal tokens in, and pushing raw detail out to where it can be recalled on demand.

Your win today

You can now audit any turn’s context against one principle — smallest high-signal set — and locate waste in four places: an off-altitude system prompt, overlapping tools, eager pre-loading, and history you should have compacted. That audit is the context lever.

Recall check

Retrieve from memory. (One question reaches back to Lesson 2 — that’s deliberate.)

Primary source — read this next

Anthropic — Effective Context Engineering for AI Agents. The definitive treatment of this lever: attention budget, right-altitude prompts, just-in-time retrieval, and the three long-horizon techniques. ~20 minutes, and it pairs directly with this lesson.

I’m your teacher — use me. Want to audit a real turn from one of your agents? Paste the assembled context and I’ll point to the lowest-signal tokens and which of the four spots is leaking budget. Curious how compaction vs. sub-agents would play out in your TradingAgents loop? Ask away.
Lesson 2: The Dial 📖 Glossary Next → Lesson 4: Stop conditions & budgets

Sources

  1. Anthropic — Effective Context Engineering for AI Agents. Smallest high-signal set; attention budget & context rot; right-altitude system prompts; minimal non-overlapping tools; just-in-time retrieval; compaction, structured note-taking, sub-agents.
  2. HumanLayer — 12-Factor Agents. Factor 3: own your context window.