Planning

You hand a planning agent the what — "organize the team offsite, budget's $8k, twelve people, sometime in March" — and it works out the how. Find venues, check dates against the budget, hold a room, send invites. Nobody wrote that sequence down in advance. The agent discovered it from the goal.

That's the promise, and it's a real capability. It's also the pattern most likely to be overkill, and I want to make the case for both halves of that sentence before we're done.

A plan is a guess about the future

Planning is the agent charting a course from where it is to where it wants to be: model the starting state, model the goal, and find a sequence of actions that connects them. Two properties make it planning and not just a script. First, the plan is produced in response to the request — not hardcoded, not known ahead of time. Second, and this is the one people underbuild, the plan is a starting point, not scripture. When the venue you were holding falls through, a good planning agent registers the new constraint, throws out the dead branch, and forms a new plan. A planner that can't re-plan is just a chain that hasn't failed yet.

There are two dominant shapes, and they trade off the same way every time:

ReAct interleaves reasoning and acting. Think a little, take one action, observe the result, think again. The plan emerges step by step and updates constantly. Flexible, reactive — and it consults the big model on every single step, which is slow and pricey.
Plan-and-execute writes the whole plan up front, runs the steps, and refines what's left as results come in. The expensive planning model gets called once at the start and again only when re-planning; the individual steps can run on cheaper models or plain code. Faster, cheaper, and easier to inspect, at the cost of a plan that can go stale between revisions.
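To make the interleaved shape concrete, here is a minimal sketch of a ReAct-style loop. Everything in it is an assumption for illustration: `think` is a hypothetical stand-in for the model call, `tools` is a plain dict of functions, and the tuple format for decisions is invented here, not taken from the ReAct paper.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical decision format: ("act", tool_name, arg) or ("finish", answer, "").
Decision = Tuple[str, str, str]
History = List[Tuple[str, str]]   # (action taken, observation) pairs


def react_loop(
    goal: str,
    think: Callable[[str, History], Decision],   # stand-in for the big model
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> str:
    history: History = []
    for _ in range(max_steps):
        kind, name, arg = think(goal, history)   # one model call per step: the cost of ReAct
        if kind == "finish":
            return name                          # for "finish", the second field is the answer
        observation = tools[name](arg)           # take exactly one action
        history.append((f"{name}({arg})", observation))
    return "gave up: step budget exhausted"


# Toy stand-in for the model: look up a venue, then finish with what was observed.
def scripted_think(goal: str, history: History) -> Decision:
    if not history:
        return ("act", "find_venue", "March")
    return ("finish", f"booked {history[-1][1]}", "")


answer = react_loop(
    "organize offsite",
    scripted_think,
    {"find_venue": lambda month: f"Lakehouse in {month}"},
)
print(answer)
```

Note where the model sits: inside the loop, consulted before every action. That placement is the whole trade-off described above.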

The ReAct paper is the canonical interleaved approach; Plan-and-Solve showed even a zero-shot "devise a plan, then carry it out" prompt beats plain chain-of-thought on multi-step problems. If you want the full taxonomy — decomposition, plan selection, reflection, memory — the 2024 planning survey is the map.

[Figure: A goal expands into a plan of ordered steps; an on-track check then either reaches the goal or loops back to re-plan. The back-edge is the point: a planner that cannot re-plan when reality diverges is just a fragile chain.]

Re-planning is the whole point

Notice the back-edge in that diagram — the loop from "on track?" back to "plan." That edge is where planning earns its existence. Without it you don't have a planner, you have an LLM that wrote a to-do list and is now grimly executing it even as reality diverges from the assumptions it made at step zero.

A plan-and-execute loop is short:

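def plan_and_execute(goal, max_steps=20):
    # make_plan, execute, replan, and synthesize are placeholders you supply.
    plan = make_plan(goal)               # LLM writes ordered steps
    done = []                            # (step, result) pairs so far
    # Cap the step count so a replanner that keeps adding work cannot loop forever.
    while plan and len(done) < max_steps:
        step = plan.pop(0)
        result = execute(step)           # tools, sub-agents, code
        done.append((step, result))
        plan = replan(goal, done, plan)  # refine the rest from what just happened
    return synthesize(goal, done)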

The replan call is doing the heavy lifting. After each step it looks at what actually happened — the venue was booked, the API returned an error, the budget came in tight — and rewrites the remaining steps to match. LangChain's plan-and-execute writeup builds exactly this as a graph: planner → executor → replanner refining from past_steps. The executor steps are cheap; the planner is consulted sparingly. You get most of ReAct's adaptivity without paying the big model on every tick.
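Here is a small runnable sketch of that re-planning step, using the venue example from earlier. The planner, executor, and replanner are toy stand-ins written for this illustration (a real system would call a model); none of the names below come from LangChain's code.

```python
from typing import Callable, List, Tuple

Step = str
Past = List[Tuple[Step, str]]   # (step, result) pairs, playing the role of past_steps


def run(
    goal: str,
    make_plan: Callable[[str], List[Step]],
    execute: Callable[[Step], str],
    replan: Callable[[str, Past, List[Step]], List[Step]],
    max_steps: int = 20,
) -> Past:
    plan = list(make_plan(goal))
    done: Past = []
    while plan and len(done) < max_steps:
        step = plan.pop(0)
        done.append((step, execute(step)))
        plan = replan(goal, done, plan)   # the back-edge: revise after every step
    return done


# Toy stand-ins: venue A falls through, so the replanner inserts a fallback.
def toy_plan(goal: str) -> List[Step]:
    return ["hold venue A", "send invites"]


def toy_execute(step: Step) -> str:
    return "FAILED: unavailable" if step == "hold venue A" else "ok"


def toy_replan(goal: str, done: Past, remaining: List[Step]) -> List[Step]:
    last_step, last_result = done[-1]
    if last_step == "hold venue A" and last_result.startswith("FAILED"):
        return ["hold venue B"] + remaining   # new constraint: drop A, try B first
    return remaining                           # on track: keep the rest unchanged


trace = run("team offsite", toy_plan, toy_execute, toy_replan)
print(trace)
```

The invites still go out, just after a step that was not in the original plan. Without the `replan` call, the loop would have sent invites for a venue nobody holds.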

And if your steps are independent, planning and parallelization stack nicely. LLMCompiler plans a task into a DAG of tool calls and fires the independent ones concurrently, reporting big latency and cost wins over step-by-step execution. The plan isn't just an ordered list — it's a dependency graph, and dependency graphs have parallel slack.
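As a rough illustration of that parallel slack, the sketch below runs a plan expressed as a dependency graph in waves, using only the Python standard library (`graphlib` needs Python 3.9+). This is a simplification for illustration, not LLMCompiler's scheduler; the task names and the `run_dag` helper are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def run_dag(deps: Dict[str, Set[str]], run_task: Callable[[str], str]) -> List[List[str]]:
    """Run tasks in waves; every task in a wave has all of its dependencies done."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()                          # raises CycleError if the plan has a cycle
    waves: List[List[str]] = []
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())    # tasks whose dependencies are finished
            list(pool.map(run_task, ready))       # independent tasks run concurrently
            sorter.done(*ready)                   # unlock whatever depended on them
            waves.append(ready)
    return waves


# Hypothetical offsite plan: each key maps to the set of tasks it depends on.
plan = {
    "find_venues": set(),
    "poll_dates": set(),
    "hold_room": {"find_venues", "poll_dates"},
    "send_invites": {"hold_room"},
}
waves = run_dag(plan, lambda name: f"{name} done")
print(waves)
```

Finding venues and polling dates share a wave because neither depends on the other; an ordered list would have hidden that.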

For the harder problems, you don't even commit to one plan. Tree of Thoughts explores several plan branches with lookahead and backtracking — when a branch hits a dead end, it abandons it and tries another, instead of marching a single doomed line to the end. That's planning with an undo button.
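The backtracking idea can be sketched without a model at all. The example below is a plain depth-first search over enumerated options with a budget check; Tree of Thoughts has a model propose and score the branches instead, so treat this as an analogy for the control flow only. The options and prices are invented.

```python
from typing import List, Optional, Tuple

Option = Tuple[str, int]   # (label, cost in dollars)


def search(
    choices: List[List[Option]],
    budget: int,
    picked: Optional[List[str]] = None,
) -> Optional[List[str]]:
    """Depth-first search over plan branches; returns the first plan within budget."""
    picked = picked or []
    if not choices:
        return picked                      # every decision made: a complete plan
    for label, cost in choices[0]:
        if cost > budget:
            continue                       # dead end: prune this branch
        plan = search(choices[1:], budget - cost, picked + [label])
        if plan is not None:
            return plan
        # Otherwise backtrack and try the next option at this level.
    return None


choices = [
    [("venue: lakehouse", 6000), ("venue: loft", 4000)],
    [("catering: full", 3000), ("catering: lunch only", 1500)],
    [("transport: coach", 1500)],
]
best = search(choices, budget=8000)
print(best)
```

The lakehouse branch looks fine for one step and then runs out of money for transport, so the search abandons it and succeeds with the loft.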

The expensive question: should you plan at all?

Here's the part the demos skip. The book's sharpest caution about planning isn't about how to do it — it's about when not to.

When the path is already well understood and repeatable, dynamic planning is the wrong tool. A fixed, deterministic sequence — a prompt chain — is more predictable, cheaper, and vastly easier to test. If you already know the five steps to onboard an employee, do not make an agent rediscover them from scratch every Tuesday, burning planning tokens and occasionally hallucinating a sixth step that doesn't exist. You wrote the procedure down for a reason. Use it.
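For contrast, here is roughly what the boring alternative looks like: a fixed chain as an ordered list of functions. The five onboarding steps and their outputs are hypothetical; in practice each might wrap a prompt or an API call.

```python
from typing import Callable, Dict, List

Context = Dict[str, str]


# Hypothetical onboarding steps, each taking the context and returning an updated copy.
def create_account(ctx: Context) -> Context:
    return {**ctx, "account": ctx["name"].lower() + "@example.com"}


def order_laptop(ctx: Context) -> Context:
    return {**ctx, "laptop": "ordered"}


def enroll_payroll(ctx: Context) -> Context:
    return {**ctx, "payroll": "enrolled"}


def assign_buddy(ctx: Context) -> Context:
    return {**ctx, "buddy": "assigned"}


def send_welcome(ctx: Context) -> Context:
    return {**ctx, "welcome": f"sent to {ctx['account']}"}


# The whole "plan" is this list: fixed order, known length, no planning tokens.
ONBOARDING: List[Callable[[Context], Context]] = [
    create_account, order_laptop, enroll_payroll, assign_buddy, send_welcome,
]


def run_chain(steps: List[Callable[[Context], Context]], ctx: Context) -> Context:
    for step in steps:
        ctx = step(ctx)
    return ctx


result = run_chain(ONBOARDING, {"name": "Ada"})
print(result["welcome"])
```

Testing this is an ordinary unit test, and there is no way for a sixth step to appear.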

Planning is for when the path is unknown or keeps changing, when you genuinely can't enumerate the steps in advance because they depend on what you find along the way. Research tasks. Open-ended automation. Anything where step three depends on a result you won't have until step two runs. The moment you can draw the flowchart ahead of time, you don't need a planner; you need the flowchart.

The other two cautions are quieter but real. First, LLM-generated plans can be plain wrong: they skip a step or invent an infeasible one. Second, they can be pitched at the wrong altitude: so granular they're brittle, or so coarse they're useless. The defense against both is the same back-edge: monitor, verify between steps, re-plan when reality disagrees. A plan you execute without checking is a hallucination with a checklist.
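Some of that checking can happen before anything runs. The sketch below is one possible pre-flight check, assuming a hypothetical plan format in which each step names a tool plus the artifacts it needs and produces. It catches invented tools and skipped prerequisites; it does not judge whether the plan is at the right altitude.

```python
from typing import Any, Dict, List, Set


def validate_plan(plan: List[Dict[str, Any]], known_tools: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the plan passed these checks."""
    problems: List[str] = []
    produced: Set[str] = set()
    for i, step in enumerate(plan):
        tool = step["tool"]
        if tool not in known_tools:
            problems.append(f"step {i}: unknown tool {tool!r}")             # invented step
        for need in step.get("needs", []):
            if need not in produced:
                problems.append(f"step {i}: needs {need!r} before it exists")  # skipped step
        produced.update(step.get("produces", []))
    return problems


# Hypothetical model output with one invented tool and one missing prerequisite.
plan = [
    {"tool": "find_venues", "produces": ["venue_list"]},
    {"tool": "teleport_team", "needs": ["venue"]},
]
problems = validate_plan(plan, {"find_venues", "hold_room", "send_invites"})
print(problems)
```

A non-empty result is a reason to send the plan back for re-planning before spending anything on execution.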

Give an agent a goal and watch it build a path nobody specified — it feels like the most "agentic" thing in this whole set, and for unknown terrain it is. Just keep asking the unglamorous question first: do I actually not know the steps? If you do know them, the boring chain wins, and it isn't close.
