Agents: Advanced · Nov 4, 2025 · 5 min · agents, llm

Goal Setting and Monitoring

"Be helpful." Try writing a test for that.

You can't, and that's the problem in one phrase. A goal an agent can't check itself against is a vibe, and an agent running on a vibe will wander off, declare success, and hand you something plausible that isn't what you wanted. The gap between the goal you said and the goal it can verify is where agents quietly fail — not with an error, with a confident wrong answer.

So this post is about two unglamorous skills: writing goals an agent can actually measure, and watching whether it's still on the one you gave it.

A goal is a thing you can fail

Take a vague objective and the question to ask is brutal: how would the agent know it succeeded? "Summarize this document" — know it how? "Resolve the support ticket" — resolved by whose standard? If the agent can't answer, neither can it self-correct, because there's nothing to correct against.

The old project-management SMART checklist is corny and it's also right: specific, measurable, achievable, relevant, time-bound. Strip the corporate gloss and the load-bearing word is measurable. "Be helpful" becomes "answer the user's question using only the provided document, cite the paragraph, and say 'not in the document' if it isn't there." That second version is testable. You can write an assertion against it. The agent can write an assertion against it, which is the part that matters — a measurable goal is one the agent can grade its own output by before it ever returns.
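Here is roughly what that assertion could look like. This is a minimal sketch, assuming the agent returns its answer together with a cited paragraph index and a supporting quote; the `Answer` shape, the `meets_goal` name, and the verbatim-quote rule are all illustrative choices, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional

NOT_FOUND = "not in the document"


@dataclass
class Answer:
    text: str                        # what the agent says to the user
    paragraph: Optional[int] = None  # index of the cited paragraph, if any
    quote: str = ""                  # supporting span copied from that paragraph


def meets_goal(answer: Answer, paragraphs: list[str]) -> bool:
    """Grade one answer against the measurable version of the goal."""
    if answer.text.strip().lower() == NOT_FOUND:
        # A refusal is only valid if it does not also claim a source.
        return answer.paragraph is None
    if answer.paragraph is None or not 0 <= answer.paragraph < len(paragraphs):
        return False  # no citation, or a citation to a paragraph that does not exist
    # Crude grounding proxy (an assumption): the quote must appear verbatim.
    return bool(answer.quote) and answer.quote in paragraphs[answer.paragraph]


doc = ["Refunds are issued within 14 days.", "Shipping is free over $50."]
assert meets_goal(Answer("Within 14 days.", 0, "within 14 days"), doc)
assert not meets_goal(Answer("Within 30 days.", 0, "within 30 days"), doc)
assert not meets_goal(Answer("Probably two weeks."), doc)
assert meets_goal(Answer("Not in the document"), doc)
```

The check is deliberately dumb. It will not catch a refusal when the answer really was in the document, but it is something the agent can run on its own output before returning, and that is the point.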

This is why the ReAct pattern holds up. By forcing the agent to interleave reasoning and acting — thought, action, observation, thought — it keeps the goal in view across steps instead of letting it evaporate after the first one. The reasoning trace is the agent restating, each loop, what it's still trying to do.
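A stripped-down sketch of that loop, with a scripted function standing in for the model. The `react` and `scripted_think` names, the `finish` action, and the transcript format are assumptions made for illustration; real ReAct prompts vary.

```python
from typing import Callable

Tool = Callable[[str], str]
# think(prompt) -> (thought, action, argument); a stand-in for the model call.
Think = Callable[[str], tuple[str, str, str]]


def react(goal: str, think: Think, tools: dict[str, Tool], max_steps: int = 5):
    """Thought -> action -> observation, with the goal re-sent on every loop."""
    transcript: list[str] = []
    for _ in range(max_steps):
        prompt = "\n".join([f"Goal: {goal}", *transcript])
        thought, action, arg = think(prompt)
        transcript.append(f"Thought: {thought}")
        if action == "finish":
            return arg, transcript
        transcript.append(f"Action: {action}[{arg}]")
        transcript.append(f"Observation: {tools[action](arg)}")
    return None, transcript  # step budget spent without an answer


def scripted_think(prompt: str) -> tuple[str, str, str]:
    # Fake model: look the fact up once, then answer from the observation.
    if "Observation: " not in prompt:
        return "I need the policy text.", "lookup", "refund window"
    return "The observation answers it.", "finish", prompt.rsplit("Observation: ", 1)[1]


answer, transcript = react(
    "State the refund window",
    scripted_think,
    {"lookup": lambda query: "14 days"},
)
```

The detail worth noticing is that `Goal:` sits at the top of every prompt, not only the first one, so the objective is in front of the model on every step.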

Subgoals, and the trap inside them

Real goals decompose. "Book me a trip to Lisbon" becomes flights, then a hotel, then ground transport. Decomposition is good — but it plants a specific failure: the agent nails every subgoal and misses the actual point. It books the flight (✓), books the hotel (✓), books the car (✓), and the hotel is forty minutes from the meeting the whole trip was for.

Each subgoal passed. The real goal failed. This is goal drift through decomposition, and it's nasty precisely because every local check is green. The defense is to keep the top-level objective present at every step, not just the current leaf — and to validate against the original intent at the end, not only against the subtasks you happened to invent. A subgoal checklist is a map of where you've been, not proof you arrived.
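The Lisbon trip makes a compact example. In this sketch the `Trip` fields and the 20-minute commute threshold are made up for illustration; the point is only that the intent check is a separate function from the subgoal checklist.

```python
from dataclasses import dataclass

MAX_COMMUTE_MIN = 20  # assumed: what "close enough to the meeting" means here


@dataclass
class Trip:
    flight_booked: bool
    hotel_booked: bool
    car_booked: bool
    hotel_to_meeting_min: int  # travel time from the hotel to the meeting


def subgoals_done(trip: Trip) -> bool:
    """The checklist: every leaf task reports success."""
    return trip.flight_booked and trip.hotel_booked and trip.car_booked


def meets_intent(trip: Trip) -> bool:
    """The real goal: the trip actually works for the meeting it exists for."""
    return subgoals_done(trip) and trip.hotel_to_meeting_min <= MAX_COMMUTE_MIN


trip = Trip(True, True, True, hotel_to_meeting_min=40)
assert subgoals_done(trip)      # every local check is green
assert not meets_intent(trip)   # and the trip still fails
```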

[Figure: A goal, plan, execute, verify loop. Every loop re-checks the signal, and the final step validates against the original intent.]

That final verify step, the "meets original intent?" diamond at the end of the loop, is the one people drop. Without it the loop happily terminates the moment the subgoal list is exhausted, whether or not the trip is any good.
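One way to keep that check from being dropped is to make it the only exit from the loop. A rough sketch, where `run_goal` and its callback signatures are hypothetical and the "booking" is faked by echoing each step back as its result:

```python
from typing import Callable, Optional


def run_goal(
    goal: str,
    plan: Callable[[str, list[list[str]]], list[str]],
    execute: Callable[[str], str],
    step_ok: Callable[[str], bool],
    meets_intent: Callable[[str, list[str]], bool],
    max_rounds: int = 3,
) -> Optional[list[str]]:
    """Goal -> plan -> execute -> verify; success is only declared by meets_intent."""
    attempts: list[list[str]] = []  # earlier failed attempts, fed back to the planner
    for _ in range(max_rounds):
        results = [execute(step) for step in plan(goal, attempts)]
        if all(step_ok(r) for r in results) and meets_intent(goal, results):
            return results
        attempts.append(results)
    return None  # out of rounds: report failure instead of declaring success


HOTELS = ["airport", "downtown"]  # the planner's (bad) preference order


def plan(goal: str, attempts: list[list[str]]) -> list[str]:
    tried = {result for attempt in attempts for result in attempt}
    hotel = next(h for h in HOTELS if f"hotel:{h}" not in tried)
    return ["flight:LIS", f"hotel:{hotel}"]


result = run_goal(
    "Attend the downtown meeting in Lisbon",
    plan,
    execute=lambda step: step,          # pretend booking
    step_ok=lambda r: True,             # every subgoal "passes"
    meets_intent=lambda goal, rs: "hotel:downtown" in rs,
)
assert result == ["flight:LIS", "hotel:downtown"]
```

The first round books the airport hotel and every step passes. Only the intent check sends the loop around again.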

Monitoring is not logging

Here's where teams fool themselves. They add logging, see lines scrolling past, and feel observed. But logs tell you what happened; monitoring tells you whether it's going wrong. Different jobs.

Concretely, for an agent you want signals at three altitudes:

You instrument this the boring way: structured traces. A tool like LangSmith captures the full step tree of an agent run, and the OpenTelemetry GenAI semantic conventions give you a vendor-neutral schema for spans — model, tokens, latency, tool calls — so your agent monitoring lives in the same dashboards as the rest of your stack instead of a bespoke island.
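To make "structured trace" concrete, here is a standard-library-only sketch of recording one model call as a span. In a real setup you would use the OpenTelemetry SDK rather than this `span` helper, which is a stand-in. The attribute keys follow the GenAI semantic conventions as I understand them; the spec is still evolving, so check the current version before relying on exact names. The model name and token counts are made up.

```python
import time
from contextlib import contextmanager

SPANS: list[dict] = []  # stand-in for an exporter; real spans go to a collector


@contextmanager
def span(name: str, attributes: dict):
    """Time a block and record it as one structured span."""
    record = {"name": name, "attributes": dict(attributes)}
    start = time.perf_counter()
    try:
        yield record["attributes"]  # the caller can add attributes mid-span
    finally:
        record["duration_s"] = time.perf_counter() - start
        SPANS.append(record)


with span("chat", {
    "gen_ai.operation.name": "chat",
    "gen_ai.request.model": "example-model",  # placeholder model name
}) as attrs:
    # ... the model call would happen here ...
    attrs["gen_ai.usage.input_tokens"] = 812   # illustrative numbers
    attrs["gen_ai.usage.output_tokens"] = 64
```

Because each span is a flat record with well-known keys, the same query that charts database latency can chart model latency or tokens per run.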

# Cheap per-run guards that catch the common pathologies.
def check_run(trace):
    if trace.steps > STEP_BUDGET:
        alert("runaway", trace.id)            # not converging
    if repeats(trace.actions, n=3):
        alert("stuck loop", trace.id)         # same action thrice
    if not meets_spec(trace.final_output):
        alert("goal miss", trace.id)          # passed subgoals, failed intent
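
The guard above leaves the trace shape and its helpers undefined. One way to fill them in so it runs end to end is sketched below; the `Trace` fields, the budget of 20 steps, and the citation-based `meets_spec` are assumptions for illustration, and `alert` is passed in so you can route it wherever you like.

```python
import re
from dataclasses import dataclass, field
from typing import Callable

STEP_BUDGET = 20  # assumed ceiling; tune it per task


@dataclass
class Trace:
    id: str
    actions: list[str] = field(default_factory=list)
    final_output: str = ""

    @property
    def steps(self) -> int:
        return len(self.actions)


def repeats(actions: list[str], n: int = 3) -> bool:
    """True if any action occurs n or more times in a row."""
    run = 0
    for prev, cur in zip([None] + actions, actions):
        run = run + 1 if cur == prev else 1
        if run >= n:
            return True
    return False


def meets_spec(output: str) -> bool:
    # Placeholder spec: cite a paragraph like "[p3]" or refuse explicitly.
    refused = output.strip().lower() == "not in the document"
    return refused or re.search(r"\[p\d+\]", output) is not None


def check_run(trace: Trace, alert: Callable[[str, str], None]) -> None:
    if trace.steps > STEP_BUDGET:
        alert("runaway", trace.id)
    if repeats(trace.actions, n=3):
        alert("stuck loop", trace.id)
    if not meets_spec(trace.final_output):
        alert("goal miss", trace.id)


fired: list[str] = []
trace = Trace("run-1", ["search", "search", "search", "answer"], "It ships free.")
check_run(trace, alert=lambda kind, run_id: fired.append(kind))
assert fired == ["stuck loop", "goal miss"]
```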

The signal that matters most is the one that moves

A single run's metrics are nearly useless in isolation. 92% success means nothing until you know it was 97% last week. The whole value of monitoring is the derivative — the trend, the drift, the slow slide that no individual run reveals. Your prompts didn't change, but the model provider pushed an update, or your users started asking different questions, or a tool's API shifted under you. The agent degrades silently, and the only thing that catches it is watching the rate over time against a goal you defined sharply enough to score.
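Watching the derivative can start very small. A sketch, assuming each run has already been scored pass or fail against the goal. The `drifted` name and the three-point threshold are arbitrary choices, and a real check should allow for noise in small windows.

```python
def success_rate(outcomes: list[bool]) -> float:
    """Fraction of runs that met the goal; assumes a non-empty window."""
    return sum(outcomes) / len(outcomes)


def drifted(baseline: list[bool], current: list[bool], max_drop: float = 0.03) -> bool:
    """True if the current window fell more than max_drop below the baseline."""
    return success_rate(baseline) - success_rate(current) > max_drop


last_week = [True] * 97 + [False] * 3   # 97% of runs met the goal
this_week = [True] * 92 + [False] * 8   # 92% this week

assert drifted(last_week, this_week)        # a five-point slide trips the alarm
assert not drifted(last_week, last_week)    # a steady rate does not
```

Neither window says much alone. It is the comparison that turns 92% from a number into an alert.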

Which loops back to the start. You cannot monitor drift away from a goal you never made measurable. "Be helpful" has no derivative. "Answers from the document, cites the paragraph, 92% of the time" does — and the day it drops to 85%, you'll know, instead of finding out from an angry user three weeks later.

So before you build the dashboard, ask the harder question: can the agent fail this goal in a way you'd both agree on? If not, you don't have a goal yet. You have a hope.
