Human-in-the-Loop

Autonomy isn't a switch. It's a dial, and the whole craft is knowing where to set it for a given action.

Most of the discourse treats it as binary — either the agent runs free or a human babysits every keystroke. Both extremes are useless. Full autonomy on an action that emails your customers is reckless. A human approving every database read is just a slow human with extra steps. The real design question is narrower and more interesting: which specific moments deserve a person, and how do you insert one without turning the agent back into a form.

The asymmetry that decides everything

Here's the lens I use. Sort every action the agent can take by two things: how reversible it is, and how costly it is when wrong.

A web search? Cheap, reversible, do it a thousand times unsupervised. Reading a file? Same. But sending a refund, deleting a record, posting to a customer, executing a trade, merging to main — these are expensive and hard to undo. That quadrant — high cost, low reversibility — is exactly where a human belongs, and nowhere else really needs one.

This reframes the whole thing. You're not deciding "how autonomous should this agent be." You're tagging individual actions, and the irreversible, expensive ones get a gate. Anthropic's "Building Effective Agents" guidance lands in the same place from a different angle: keep humans at the checkpoints that matter and stay out of the agent's way everywhere else. The art is being stingy with the gates.
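To make the tagging idea concrete, here is a minimal sketch of how it could look. The action names, the two-level cost scale, and the choice to gate untagged actions by default are my assumptions for illustration, not a standard scheme.

```python
from dataclasses import dataclass
from enum import Enum


class Cost(Enum):
    LOW = 1
    HIGH = 2


@dataclass(frozen=True)
class ActionTag:
    reversible: bool  # can the effect be undone cheaply?
    cost: Cost        # how bad is it when the action is wrong?


# Hypothetical registry: these names and tags are illustrative only.
ACTION_TAGS = {
    "web_search": ActionTag(reversible=True, cost=Cost.LOW),
    "read_file": ActionTag(reversible=True, cost=Cost.LOW),
    "issue_refund": ActionTag(reversible=False, cost=Cost.HIGH),
    "delete_record": ActionTag(reversible=False, cost=Cost.HIGH),
    "merge_to_main": ActionTag(reversible=False, cost=Cost.HIGH),
}


def needs_gate(action: str) -> bool:
    """Gate only the high-cost, low-reversibility quadrant.

    Untagged actions are gated too, on the assumption that pausing an
    unknown action is safer than running it.
    """
    tag = ACTION_TAGS.get(action)
    if tag is None:
        return True
    return (not tag.reversible) and tag.cost is Cost.HIGH
```

With this table, needs_gate("web_search") is False and needs_gate("issue_refund") is True. The useful property is that the decision lives in one reviewable place rather than being scattered through the agent's prompts.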

Three ways to put a human in

There isn't one human-in-the-loop pattern. There are at least three, and they interrupt the agent at different points.

Approve / reject (the gate). The agent proposes an action and pauses before taking it. A human says yes or no. Classic for irreversible operations — the refund waits for a click. This is the one people mean by default.

Edit (the steer). The human doesn't just approve; they modify the agent's proposed action or its draft output before it proceeds. The agent wrote the email; you tweak the second paragraph and then it sends. More effort than a yes/no, far more control.

Review-after (the audit). The agent acts autonomously but flags the action for later human review. No blocking. You accept that some actions might be wrong, in exchange for speed, and you catch the mistakes after the fact. Right for reversible-but-worth-watching actions, wrong for anything you can't take back.

Figure: A risk gate routing actions to a human. Reversible, low-cost actions run; irreversible ones pause for a human to approve, edit, or reject.
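The three patterns differ mainly in when the human is consulted and what they are allowed to change. A rough sketch follows, assuming hypothetical execute and ask_human callbacks supplied by the surrounding application; none of these names come from a real library.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class Mode(Enum):
    GATE = "approve_reject"  # pause; the human says yes or no
    STEER = "edit"           # pause; the human may modify the proposal
    AUDIT = "review_after"   # act now; queue the action for later review


@dataclass
class Decision:
    approved: bool
    changes: dict = field(default_factory=dict)  # only honored in STEER mode


def run_with_human(
    mode: Mode,
    proposal: dict,
    execute: Callable[[dict], dict],
    ask_human: Optional[Callable[[dict], Decision]],
    audit_log: list,
) -> Optional[dict]:
    """Apply one of the three patterns to a proposed action."""
    if mode is Mode.AUDIT:
        # No blocking: act first, then record the action for review.
        result = execute(proposal)
        audit_log.append(proposal)
        return result
    decision = ask_human(proposal)  # blocking point for GATE and STEER
    if not decision.approved:
        return None                 # rejected: nothing is executed
    if mode is Mode.STEER:
        proposal = {**proposal, **decision.changes}
    return execute(proposal)
```

In GATE mode any edits the reviewer supplies are ignored on purpose, which keeps the yes/no pattern honest about what it offers.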

The hard part is the pause

Conceptually a gate is easy. Mechanically, pausing an agent mid-run and resuming it possibly hours later is the engineering problem, because the agent might be ten steps deep with a stack of state, and the human won't get to it until after lunch. You can't just block a thread for three hours.

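This is where durable execution earns its keep. LangGraph's interrupt does exactly this: the agent hits the interrupt, its full state is checkpointed (to durable storage, provided the graph is compiled with a persistent checkpointer), the process is free to exit, and when the human finally responds, execution resumes at the interrupted node with all the context intact. One detail matters in practice: on resume the node re-runs from its first line and interrupt returns the human's answer, so anything placed before the interrupt call should be safe to repeat.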

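from langgraph.types import interrupt, Command

def approval_gate(state):
    # interrupt() pauses the graph and surfaces this payload to the caller.
    # It needs a checkpointer; the run could resume hours later.
    # On resume the node re-runs from the top and interrupt() returns the
    # human's decision, so keep side effects after this call.
    decision = interrupt({
        "action": "issue_refund",
        "amount": state["amount"],
        "customer": state["customer_id"],
    })
    if decision["type"] == "approve":
        # issue_refund is the application's own helper, defined elsewhere;
        # it is assumed to return a state update dict.
        return issue_refund(state)
    if decision["type"] == "edit":
        # Apply the reviewer's changes before acting.
        return issue_refund({**state, **decision["changes"]})
    # Anything else counts as a rejection: route to the "replan" node.
    return Command(goto="replan")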

The key property: the agent is not sitting in memory burning resources while it waits. It's checkpointed and dormant. Resuming means invoking the graph again with Command(resume=decision) under the same thread ID, and the agent picks up like nothing happened. Without durable state, "human-in-the-loop" quietly means "a process pinned open until someone clicks," which falls over the moment your reviewer goes home.
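If the checkpoint-and-resume idea feels abstract, here is a toy version using only the standard library. To be clear, this is not how LangGraph implements it; the file-per-ticket store, the function names, and the status values are all invented for illustration.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Hypothetical on-disk store; a real system would use a database.
STORE = Path(tempfile.mkdtemp())


def pause_for_approval(state: dict, request: dict) -> str:
    """Persist the run's state and the approval request, then return.

    Nothing needs to stay in memory after this: the process can exit.
    """
    ticket = uuid.uuid4().hex
    payload = {"state": state, "request": request}
    (STORE / f"{ticket}.json").write_text(json.dumps(payload))
    return ticket


def resume(ticket: str, decision: dict) -> dict:
    """Reload the saved state and continue with the human's decision."""
    path = STORE / f"{ticket}.json"
    saved = json.loads(path.read_text())
    path.unlink()  # one-shot: a ticket can only be resumed once
    state = saved["state"]
    if decision["type"] == "approve":
        return {**state, "status": "refunded"}
    if decision["type"] == "edit":
        return {**state, **decision["changes"], "status": "refunded"}
    return {**state, "status": "rejected"}
```

The point of the sketch is the shape: pausing writes everything needed to continue, and resuming needs only the ticket and the decision, so the two calls can happen in different processes hours apart.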

The question this raises, and that most tutorials skip: what happens if the human never responds? A pending approval that hangs forever is its own failure mode — a refund stuck in limbo, a customer waiting on a reply that's blocked on a reviewer who's on vacation. So a gate needs a timeout policy as much as it needs the gate itself, and the right default depends entirely on direction of risk. For an irreversible action, time out to reject — when in doubt, don't do the dangerous thing. For a low-stakes one where blocking is the bigger harm, you might time out to proceed. Either way the choice has to be deliberate, because "wait indefinitely" is not a policy, it's an outage waiting for a slow afternoon.
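One way to force that deliberate choice is to make the timeout outcome an explicit field on every pending approval. This is a sketch under my own assumptions: the PendingApproval shape and the "approve"/"reject" strings are hypothetical, and defaulting to reject is just the conservative option described above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class PendingApproval:
    action: str
    requested_at: datetime
    timeout: timedelta
    on_timeout: str = "reject"      # explicit per-action policy
    decision: Optional[str] = None  # set once a human responds


def resolve(pending: PendingApproval, now: datetime) -> Optional[str]:
    """Return the effective decision, or None if still waiting.

    A human decision always wins. Otherwise, once the timeout has
    elapsed, the action's declared on_timeout policy applies.
    """
    if pending.decision is not None:
        return pending.decision
    if now - pending.requested_at < pending.timeout:
        return None
    return pending.on_timeout
```

A refund would keep the default and time out to reject, while something like a low-stakes digest email could be declared with on_timeout="approve". Either way nobody gets "wait forever" by accident.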

The failure mode is asking too much

Now the counterintuitive part, the one teams learn the hard way. The biggest risk in a human-in-the-loop system isn't the agent doing something dumb. It's interrupt fatigue.

Gate too many actions and you train your reviewers to rubber-stamp. A human who's clicked "approve" 200 times today is not reviewing the 201st; they're pattern-matching to "looks normal" and clicking. At that point your safety mechanism is theater. The human is nominally in the loop and functionally asleep, and the one genuinely dangerous action sails through on muscle memory. This is the automation-complacency problem, well documented in human-factors research and flagged in the NIST AI Risk Management Framework's discussion of human-AI interaction, and it bites HITL systems specifically.

Which means fewer gates make the system safer, not less safe. Every gate you remove from a low-stakes action is attention you've preserved for a high-stakes one. A reviewer who approves five carefully chosen actions a day is sharp. One who approves five hundred is a liability with a mouse.
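You can watch for this from the review log itself. The sketch below flags reviewers whose volume and approval rate both look like rubber-stamping; the daily budget of 20 and the 98% rate are placeholder thresholds I made up, not researched values.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Review(NamedTuple):
    reviewer: str
    approved: bool


def overloaded_reviewers(
    reviews: Iterable[Review],
    daily_budget: int = 20,           # hypothetical max gated actions per day
    rubber_stamp_rate: float = 0.98,  # hypothetical approval-rate alarm level
) -> List[str]:
    """Return reviewers whose one-day log suggests rubber-stamping.

    A reviewer is flagged only when both signals fire: more reviews
    than the budget and an approval rate at or above the alarm level.
    """
    reviews = list(reviews)  # allow generators; we iterate twice
    totals = Counter(r.reviewer for r in reviews)
    approvals = Counter(r.reviewer for r in reviews if r.approved)
    flagged = []
    for reviewer, total in totals.items():
        rate = approvals[reviewer] / total
        if total > daily_budget and rate >= rubber_stamp_rate:
            flagged.append(reviewer)
    return sorted(flagged)
```

A flag here says less about the reviewer than about the gate list: it is a hint that some of those actions should not have been gated in the first place.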

Where this is heading

The trajectory is clear and a little uncomfortable: agents get more autonomous, so the human role shifts from operator to exception handler. You stop driving and start handling the cases the agent flags as outside its confidence — which means the gates themselves should get smarter, triggered by the agent's own uncertainty rather than a static list. Let the agent escalate when it's unsure, run free when it isn't, and your scarce human attention flows to exactly the decisions that need it.
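A first step toward uncertainty-triggered gates might look like the following. It leans on a big assumption: that the agent can report a confidence score that is reasonably calibrated. The per-action floors and the default are placeholders, not recommendations.

```python
# Hypothetical per-action confidence floors; the numbers are placeholders.
CONFIDENCE_FLOOR = {
    "web_search": 0.0,     # never escalate a search
    "send_email": 0.7,
    "issue_refund": 0.95,  # escalate unless the agent is very sure
}
DEFAULT_FLOOR = 0.9        # unknown actions get a conservative floor


def should_escalate(action: str, confidence: float) -> bool:
    """Escalate to a human when confidence falls below the action's floor.

    `confidence` is assumed to be a score in [0, 1] reported by the agent.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return confidence < CONFIDENCE_FLOOR.get(action, DEFAULT_FLOOR)
```

The floors still encode the search-versus-refund asymmetry, but the trigger is now the agent's own uncertainty rather than a fixed list of always-gated actions.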

That's the version worth building toward. Not a human watching everything, and not an agent trusted with everything — a system that knows the difference between a search and a refund, and only spends a person on the second one.
