
Failed Checks Belong in Your Inbox, Not Your Notifications

When an agent's work fails a check, it needs an owner, the evidence, and a next action. That is inbox work, not another alert to scroll past.

You set an agent loose on a recurring task overnight. In the morning you have eleven notifications. Two are progress updates you do not need. One says a draft is ready. One says something failed. The rest are noise from the same run. You scroll, you skim, and you move on — and the one that mattered is now three swipes up, already forgotten.

This is the quiet failure of running work through agents. The work itself often goes fine. What breaks is the moment something does not pass — a check that did not clear, an output that needs a human eye, a step the agent could not finish. That moment arrives as a notification, gets the same half-second of attention as everything else, and then disappears into the stream.

A notification tells you something happened. It does not tell you who has to do something about it, what exactly went wrong, or what the next move is. So the important failures get the same treatment as the trivial updates: a glance, then nothing.

The fix is not better notifications. It is to stop treating a failed check as a notification at all.

A notification is an event. A failed check is a piece of work.

When work passes through a person or an agent, there are really two kinds of things the system can produce.

The first is an event: a fact about what happened. The run started. The draft is ready. The export finished. Events are fine as notifications. You read them when you want to, and nothing breaks if you do not.

The second is a decision that someone now has to make. A check failed. An output needs approval before it goes out. The agent got stuck and needs context only you have. These are not facts to be aware of — they are work to be done. And work needs three things a notification never carries.

  • An owner. Someone is responsible for resolving it, by name, not "whoever happens to see it."
  • The evidence. What was being checked, what the output was, and why it did not pass — attached, not buried in a log somewhere.
  • A next action. A clear set of choices (approve, retry, answer, escalate, or stop), along with what happens after you pick one.
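To make those three requirements concrete, here is a minimal sketch of an inbox item as a data structure. It is an illustration only: the names `InboxItem`, `Evidence`, and `Action`, and the sample values, are assumptions made up for this example, not a real product schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The choices named in the list above."""
    APPROVE = "approve"
    RETRY = "retry"
    ANSWER = "answer"
    ESCALATE = "escalate"
    STOP = "stop"


@dataclass(frozen=True)
class Evidence:
    check_name: str  # what was being checked
    output: str      # what the agent produced
    reason: str      # why it did not pass


@dataclass(frozen=True)
class InboxItem:
    owner: str                   # a named person, not "whoever sees it"
    evidence: Evidence           # attached to the item, not left in a log
    actions: tuple[Action, ...]  # the choices offered to the owner

    def __post_init__(self) -> None:
        # Without an owner or a next action, this is only a notification.
        if not self.owner.strip():
            raise ValueError("an inbox item needs a named owner")
        if not self.actions:
            raise ValueError("an inbox item needs at least one next action")


# Hypothetical example values.
item = InboxItem(
    owner="dana",
    evidence=Evidence(
        check_name="link check",
        output="weekly-digest draft",
        reason="two URLs did not resolve",
    ),
    actions=(Action.RETRY, Action.ESCALATE, Action.STOP),
)
```

The constructor refuses to build an item with no owner or no actions, which is one way to keep a bare alert from posing as work.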

A notification has none of these. It is a tap on the shoulder with no follow-through. The moment a failed check becomes one notification among many, you have downgraded a piece of work into a piece of trivia.

What a failed check actually needs

Put the two side by side and the gap is obvious.

| | Notification | Inbox item |
|---|---|---|
| Has an owner | No: broadcast to everyone or no one | Yes: assigned to a person |
| Carries the evidence | No: usually a one-line summary | Yes: the output and the failing check attached |
| Tells you the next action | No: you have to go figure it out | Yes: approve, retry, answer, escalate, stop |
| Can be resolved in place | No: you leave to act elsewhere | Yes: you decide and the work moves |
| Leaves a durable record | No: it scrolls away | Yes: the decision is kept with the work |

A notification is something you acknowledge. An inbox item is something you finish. That difference is the whole point. When a check fails, the agent has reached a boundary it cannot cross on its own. That is precisely the moment a human should be pulled in — and pulled in with everything needed to act, not just told that a wall exists.

Not every failure is the same kind of work

"Failed check" is a blunt label. In practice, agent work stalls for a handful of distinct reasons, and each one needs a different next action. Lumping them all into one alert is part of why the stream feels like noise — everything looks identical, so nothing feels urgent.

A small taxonomy helps. These are the common ways recurring agent work comes back to a human, what each one means, and what it should turn into.

| Failure type | What it means | What it should become |
|---|---|---|
| Check failed | A verifier (a rule, a test, a quality gate) did not pass | An item asking: retry, fix, or accept with a reason |
| Needs approval | The output is finished but should not move forward without a sign-off | An item with the output attached and one explicit approve or reject |
| Agent blocked | The agent could not finish because a tool, a credential, or access was missing | An item naming the exact blocker, routed to whoever can clear it |
| Missing context | The agent hit a business judgment call it cannot make | An item with the specific question, not a transcript to read through |

The detail that matters here is the difference between a vague "something went wrong" and a specific "the publish step failed the link check on these two URLs — retry or accept." The first makes you go investigate. The second lets you decide in seconds. One is a notification wearing a costume. The other is real work, ready to be done.
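One way to read the taxonomy is as a lookup from failure type to the choices an item should offer. The sketch below assumes that reading. `FailureType`, `NEXT_ACTIONS`, and `to_prompt` are hypothetical names, and the choices listed for "agent blocked" and "missing context" are guesses, since the table does not spell them out.

```python
from __future__ import annotations

from enum import Enum


class FailureType(Enum):
    CHECK_FAILED = "check_failed"
    NEEDS_APPROVAL = "needs_approval"
    AGENT_BLOCKED = "agent_blocked"
    MISSING_CONTEXT = "missing_context"


# Choices offered per failure type. The first two rows follow the table;
# the last two are assumptions made for this example.
NEXT_ACTIONS: dict[FailureType, tuple[str, ...]] = {
    FailureType.CHECK_FAILED: ("retry", "fix", "accept with a reason"),
    FailureType.NEEDS_APPROVAL: ("approve", "reject"),
    FailureType.AGENT_BLOCKED: ("retry", "escalate", "stop"),
    FailureType.MISSING_CONTEXT: ("answer",),
}


def to_prompt(kind: FailureType, detail: str) -> str:
    """Turn a failure into a specific, decidable prompt."""
    if not detail.strip():
        # A failure with no detail is the vague "something went wrong" case.
        raise ValueError("a failure needs a specific detail to be decidable")
    choices = " / ".join(NEXT_ACTIONS[kind])
    return f"{detail.strip()} Choose: {choices}."


print(to_prompt(
    FailureType.CHECK_FAILED,
    "The publish step failed the link check on two URLs.",
))
# The publish step failed the link check on two URLs. Choose: retry / fix / accept with a reason.
```

Requiring a non-empty detail forces the specific version of the message. The vague version cannot be built at all.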

The honest limit: not every failure deserves a queue

This can be taken too far. Turning every minor hiccup into a tracked item with an owner and a history is its own kind of noise — just slower and heavier.

A lightweight notification is genuinely enough when:

  • the event is informational and needs no decision — a run started, a backup completed
  • the failure is transient and the system already retried and recovered on its own
  • the work is one-off and low stakes, where a missed item costs nothing
  • you are the only person involved and you will see the result anyway

The test is simple: does someone have to decide something, and does it matter if that decision is dropped? If yes, it is inbox work. If no, a notification is fine, and forcing it into a queue just makes the queue less trustworthy. An inbox earns its place by staying scarce. The fastest way to ruin one is to route everything into it — at which point it becomes the notification stream you were trying to escape.
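That test is small enough to write down as a function. The following is a rough sketch, not a prescribed rule engine. `Signal` and `route` are names invented here, and treating an auto-recovered failure as notification-only is taken from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    needs_decision: bool          # does someone have to decide something?
    costly_if_dropped: bool       # does it matter if that decision is dropped?
    auto_recovered: bool = False  # the system already retried and recovered


def route(signal: Signal) -> str:
    """Return "inbox" or "notification" for a signal from an agent run."""
    if signal.auto_recovered:
        # Transient failure that fixed itself: nothing left to decide.
        return "notification"
    if signal.needs_decision and signal.costly_if_dropped:
        return "inbox"
    return "notification"


# A completed backup is informational.
assert route(Signal(needs_decision=False, costly_if_dropped=False)) == "notification"
# A failed publish check needs a human call and matters if ignored.
assert route(Signal(needs_decision=True, costly_if_dropped=True)) == "inbox"
```

Both conditions must hold before anything reaches the inbox, which is what keeps it scarce.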

So the goal is not to capture more. It is to capture the right things, with enough attached that they can actually be resolved.

Where Task Machine fits

This is the line Task Machine draws. Its inbox is not a feed of everything an agent did — it is where the decisions land. Approvals, questions, failed checks, and exceptions arrive as items you can act on, each with an owner, the evidence from the run attached, and a clear next action.

When a verifier fails on a workflow run, it does not vanish into a log or fire off an alert that scrolls away. It becomes an item: here is what was checked, here is what the agent produced, here is why it did not pass, and here are your choices. You decide in place. The decision stays with the work, so the next time you or a teammate looks back, the record is intact — what failed, who handled it, and what they chose.
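As a generic illustration of "decide in place, keep the record with the work", here is a small sketch. It is not Task Machine's actual API or data model. `Item`, `Decision`, and `resolve` are hypothetical names chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Decision:
    decided_by: str
    choice: str
    note: str
    decided_at: datetime


@dataclass
class Item:
    what_failed: str
    owner: str
    choices: tuple[str, ...]
    history: list[Decision] = field(default_factory=list)

    @property
    def resolved(self) -> bool:
        return bool(self.history)

    def resolve(self, decided_by: str, choice: str, note: str = "") -> Decision:
        """Record a decision on the item itself."""
        if choice not in self.choices:
            raise ValueError(f"{choice!r} is not one of {self.choices}")
        decision = Decision(decided_by, choice, note, datetime.now(timezone.utc))
        # Appended, never overwritten: what failed, who handled it, and
        # what they chose all stay attached to the item.
        self.history.append(decision)
        return decision


# Hypothetical example values.
item = Item(
    what_failed="link check on the publish step",
    owner="dana",
    choices=("retry", "accept"),
)
item.resolve("dana", "accept", note="both links are archived on purpose")
```

Because the decision lives on the item, anyone looking back later reads the failure and its resolution in one place.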

That is the difference between watching agents and operating them. Watching means staring at a stream and hoping you catch the one that counts. Operating means the work comes to you only when it needs you, and arrives ready to be finished.

For a solo builder running agents on recurring tasks, that is the difference between a morning spent reconstructing what happened overnight and a morning spent making a handful of clear calls. The agents did the work. The inbox surfaces the few moments where your judgment was actually required.

Stop scrolling. Start resolving.

If your agent work today lives in a notification stream you have learned to ignore, the problem is not your attention. It is that failed checks were never given the shape of work in the first place — no owner, no evidence, no next action.

Task Machine is building that inbox for recurring work done by humans and agents. If a stream of alerts you have stopped reading is the friction you feel, join the waitlist for the private beta.