Keep Your Agents From Quietly Burning Through Your Budget

Autonomous agents can run up real costs through retry loops and runaway work. Set budgets, retry limits, and stop conditions before the spend, not after.

You handed an agent a job and let it run. The work came back done, more or less. Then you checked the bill. The agent had been busy in ways you never asked for — repeating the same expensive step, calling tools you did not expect, and pushing forward long after a person would have stopped to ask a question.

Nothing crashed. There was no error to point at. The agent simply kept working, and every minute of that work cost money. By the time anyone noticed, the spend had already happened.

This is the quiet version of an agent going wrong. It does not look like a failure. It looks like progress, right up until the moment you see the number.

Why agents overspend without anyone deciding to

A person doing recurring work has a natural sense of "enough." They notice when a task is taking too long, when they are repeating themselves, or when a small job has somehow become an all-day job. They pause and ask whether it is still worth it.

An agent has none of that instinct unless you give it one. Left alone, an agent treats "keep going until the task is finished" as a literal instruction. If finishing is hard — or impossible — it will keep trying. Each retry, each extra tool call, each additional pass costs something. The cost is invisible while it happens and obvious only afterward.

The core problem is that most agent setups have no built-in answer to a simple question: how much is this allowed to cost? Without that answer, spending is not a decision anyone made. It is a side effect of letting work run unattended.

Cost control is a trust feature, not an accounting chore

It is tempting to file this under finance, as something to reconcile at the end of the month. That framing is too small. Unpredictable cost is one of the main reasons people are afraid to let agents run on their own.

If you cannot predict what a job will cost, you cannot leave it alone. You hover. You babysit overnight runs. You keep agents on short, supervised tasks because the long, valuable ones feel too risky. The fear of a surprise bill quietly caps how much real work you are willing to hand over.

So cost control is really a trust feature. A job with a known ceiling is a job you can walk away from. The goal is not to spend less for its own sake. It is to make spending a thing you decide on purpose, so that letting agents work alone stops feeling like a gamble.

The common ways agents run up a bill

Most cost surprises are not exotic. They come from a short list of repeatable patterns. Naming them is the first step, because each one has a specific control that stops it.

| Failure mode | What it looks like | What causes it | Control that stops it |
| --- | --- | --- | --- |
| Retry loop | The same expensive step runs over and over | A step keeps failing and the agent keeps re-attempting it with no limit | Retry limit |
| Runaway loop | Work continues with no natural end | The task has no clear "done," so the agent never stops | Stop condition or verifier gate |
| Over-eager autonomy | The agent keeps going where a person would have paused to ask | No defined point at which uncertainty should hand back to a human | Approval before spend |
| Expensive tool calls | A few costly actions quietly dominate the bill | The agent uses high-cost tools or models more often than expected | Budget cap |
| Silent escalation of spend | A small job grows into a large one without notice | No running total and no threshold that triggers a check-in | Budget cap plus approval before spend |

Read down the list and a pattern appears. The expensive cases are almost always work that should have stopped earlier — to ask, to check, or to give up — but had no mechanism to do so.
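Silent escalation, the last failure mode listed, is the easiest one to picture in code. What follows is a minimal sketch rather than a real billing API: the SpendMeter class, its field names, the step labels, and the dollar figures are all invented for illustration. It keeps a running total and reports the moment that total reaches a check-in threshold.

```python
from dataclasses import dataclass, field


@dataclass
class SpendMeter:
    """Running total of what one job has cost so far (hypothetical helper)."""

    check_in_at: float                      # assumed threshold, in dollars
    total: float = 0.0
    entries: list = field(default_factory=list)

    def record(self, label: str, cost: float) -> bool:
        """Add one charge; return True once the total reaches the threshold."""
        self.total += cost
        self.entries.append((label, cost))
        return self.total >= self.check_in_at


# Made-up charges for a small job that quietly grows.
charges = [("search", 0.40), ("summarize", 1.20), ("re-run search", 3.60)]

meter = SpendMeter(check_in_at=5.00)
for step, cost in charges:
    if meter.record(step, cost):
        print(f"check in: ${meter.total:.2f} spent after '{step}'")
        break
```

The meter does not stop anything by itself. Its only job is to make the number visible while the work is still running, so that something else can act on it.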

Four controls that turn spend into a decision

You do not need a complex system to fix this. You need four explicit limits, set before the work starts.

  • Budget cap. A hard ceiling on what a job may cost. When the job reaches it, work pauses instead of pushing past it.
  • Retry limit. A maximum number of attempts for any step. After that, the job stops re-running the expensive thing and surfaces the problem instead.
  • Stop condition. A clear definition of "done," so the agent has a finish line rather than working until something external interrupts it.
  • Approval before spend. A point where a person sees the cost decision and says yes or no — before the money is spent, not after.
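Three of these limits fit in one short loop. The sketch below is an illustration under stated assumptions, not how any particular agent framework works: run_bounded, Outcome, and every parameter name are hypothetical, and each attempt is assumed to have a known, fixed cost. Approval before spend is left out here because it involves a person rather than a counter.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    status: str                   # "done", "paused_at_budget", or "retries_exhausted"
    spent: float
    attempts: int
    result: Optional[str] = None


def run_bounded(
    attempt: Callable[[], str],
    is_done: Callable[[str], bool],
    cost_per_attempt: float,
    budget_cap: float,
    max_attempts: int,
) -> Outcome:
    """Run one step under a budget cap, a retry limit, and a stop condition."""
    spent = 0.0
    attempts = 0
    while attempts < max_attempts:
        # Budget cap: pause before an attempt that would push past the ceiling.
        if spent + cost_per_attempt > budget_cap:
            return Outcome("paused_at_budget", spent, attempts)
        result = attempt()
        attempts += 1
        spent += cost_per_attempt
        # Stop condition: an explicit finish line, checked after every attempt.
        if is_done(result):
            return Outcome("done", spent, attempts, result)
    # Retry limit: stop re-running the step and surface the problem instead.
    return Outcome("retries_exhausted", spent, attempts)


# A step that fails twice and then succeeds (made-up replies).
replies = iter(["error", "error", "ok"])
outcome = run_bounded(
    attempt=lambda: next(replies),
    is_done=lambda reply: reply == "ok",
    cost_per_attempt=0.50,
    budget_cap=2.00,
    max_attempts=5,
)
print(outcome)
```

Every exit from the loop has a name. Whatever called the job can tell "finished" apart from "hit the ceiling" and "gave up," which is what lets a pause become a question instead of a silent stop.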

The last one is the most important and the most often missing. Most tooling tells you what an agent spent. Far fewer tools stop to ask before the spend happens. By the time a notification says "this run cost more than usual," the cost is already gone. The decision should come first.
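Putting the decision first can be as simple as a check that runs before the expensive step instead of after it. The following is a rough sketch under assumptions: spend_with_approval, ask_human, and the threshold are hypothetical names, and the cost of the step is assumed to be something you can estimate in advance.

```python
from typing import Callable


def spend_with_approval(
    action: Callable[[], str],
    estimated_cost: float,
    approval_threshold: float,
    ask_human: Callable[[str], bool],
) -> str:
    """Run `action` only after a person approves any spend above the threshold."""
    if estimated_cost > approval_threshold:
        question = f"Next step is estimated at ${estimated_cost:.2f}. Continue?"
        # The question is asked before the money is spent, not after.
        if not ask_human(question):
            return "stopped: not approved"
    return action()


def expensive_step() -> str:
    return "report generated"


def reviewer_says_no(question: str) -> bool:
    # Stand-in for a real person; always declines in this example.
    print(question)
    return False


# Below the threshold: runs without interrupting anyone.
print(spend_with_approval(expensive_step, 0.80, 5.00, reviewer_says_no))
# Above the threshold: the reviewer is asked first, and declines.
print(spend_with_approval(expensive_step, 12.00, 5.00, reviewer_says_no))
```

The threshold is the dial. Set it at zero and a person approves everything; set it high and only the unusual spend reaches them.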

The honest tradeoff

Tight limits have a real cost of their own. A budget cap set too low will stop legitimate work that simply needed one more pass. A strict retry limit can give up on a step that would have succeeded on the next attempt. An aggressive approval gate can interrupt you for spending you would have happily allowed.

We are not claiming you can remove all risk without removing autonomy. You cannot. A genuinely autonomous job will sometimes need to spend more than you guessed.

The point is not to eliminate every overage. The point is to make the limit explicit and the overage a decision. A job that pauses at its budget and asks whether to continue is in a far better state than a job that quietly blew past a number nobody set. You can always approve more. What you cannot do is un-spend money on work you never agreed to.

Where Task Machine fits

This is the model Task Machine is built around. Recurring work done by humans and agents needs limits that travel with the work, not limits you remember to check afterward.

In Task Machine, a job can carry a budget and a retry limit as part of how it runs. When a job approaches its ceiling, or needs to spend in a way that crosses a threshold, that decision becomes an item in your inbox — a question routed to a person before the money moves, not a receipt delivered after. You approve, adjust the limit, or stop the job.

When the work is finished, the run history shows what actually happened: what was attempted, where it retried, and what it cost. That record is what lets you tune the next budget instead of guessing again. The inbox keeps the in-flight decisions in front of a human; the history keeps the past honest.
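One way to turn a record like that into the next budget is shown below. This is an assumption about how you might tune a limit, not a description of what Task Machine does: suggest_budget, the 80th-percentile choice, the 25% headroom, and the sample costs are all invented for illustration.

```python
import math


def suggest_budget(past_costs, percent=80, headroom=1.25):
    """Suggest a cap from past run costs: a high-but-typical cost plus headroom."""
    if not past_costs:
        raise ValueError("need at least one past run to suggest a budget")
    ordered = sorted(past_costs)
    # Nearest-rank percentile: the smallest cost that covers `percent` of runs.
    rank = math.ceil(percent * len(ordered) / 100)
    typical_high = ordered[max(rank, 1) - 1]
    return round(typical_high * headroom, 2)


# Made-up costs from five earlier runs of the same job; one run was an outlier.
history = [1.10, 0.90, 1.40, 1.00, 3.80]
print(suggest_budget(history))
```

The outlier is left out of the cap on purpose. A run that expensive would pause and ask, and you could approve it then if the extra work was worth it.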

The shift is small but it changes how it feels to let agents work. Spending stops being something that happens to you overnight and becomes something you decide — sometimes in advance, sometimes in the moment, but always on purpose.

Let agents work overnight without the surprise

The reason people keep agents on a short leash is rarely the quality of the work. It is the fear of what an unattended run might cost. Remove that fear and longer, more valuable jobs become possible.

Budgets, retry limits, stop conditions, and approvals before spend are how you remove it. They turn an open-ended run into a bounded one, and they turn a surprise bill into a decision you got to make.

If you want to let your agents do real work overnight without checking the bill in the morning with one eye closed, join the waitlist for the private beta.