Every piece of software ever built assumes a human is driving. A person reads the dashboard, clicks the button, interprets the error message, decides what to do next. The entire design vocabulary of software — forms, modals, confirmation dialogs, toast notifications — exists because a human needs to understand what is happening and tell the system what to do.
That assumption is breaking. Not in the future. Now.
AI agents are beginning to operate software. They call APIs, interpret results, make decisions, and take actions. Sometimes a human reviews their work. Sometimes the human is asleep. The trajectory is clear: more software will be operated by agents, and the ratio of autonomous operation to human oversight will increase over time.
This changes how software should be built. Not at the surface level — not just "add an API" — but at the architectural level. The design constraints for agent-operated software are fundamentally different from human-operated software, and most builders have not yet internalized what those differences mean.
The most visible change is the interface layer. Human-operated software centers on a graphical interface. The dashboard is the product. Users log in, look at charts, click buttons, and configure settings through forms.
Agent-operated software centers on the API. The API is the product. Everything the agent needs to do must be expressible through programmatic calls with structured inputs and outputs. The dashboard still exists, but it becomes a monitoring tool for humans overseeing the agent — not the primary interface for operating the system.
This sounds obvious, but the implications run deep.
Building for agents means the API is the first-class product. Every capability available in the GUI must be available through the API. Responses must be structured, consistent, and self-describing. Errors must be specific enough for the agent to decide what to do next without human interpretation.
Human-facing error messages optimize for readability: "Something went wrong. Please try again later." This is useless to an agent. The agent cannot "try again later" unless it knows what went wrong, whether the error is transient or permanent, and what specific action might resolve it.
Agent-facing errors need three properties.
Machine-readable classification
Not a string the agent has to parse, but a typed error code that maps to a specific failure mode. rate_limit_exceeded is actionable. bad_request is not. insufficient_permissions:scope=write:resource=campaigns tells the agent exactly what is missing.
Retry guidance
Is this error transient? If so, when should the agent retry? A Retry-After header or a retry_after_seconds field in the response body turns a failure into a pause. Without this, the agent either retries immediately (making things worse) or gives up (losing work).
Resolution paths
When an error is not transient, the response should indicate what action would resolve it. "This API key lacks the campaigns:write scope" tells the agent — or the human reviewing the agent's logs — exactly what to fix. "Unauthorized" does not.
Most software today returns errors designed for a human to read and interpret. Agent-operated software needs errors designed for a machine to parse and act on.
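As a sketch, the three properties above can be carried in a single structured error payload. The field names here (code, transient, retry_after_seconds, resolution) are illustrative assumptions, not a standard:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative agent-facing error payload with the three properties:
# machine-readable classification, retry guidance, and a resolution path.
@dataclass
class AgentError:
    code: str                                  # typed failure mode, not free text
    message: str                               # human-readable, for the audit log
    transient: bool                            # is retrying meaningful at all?
    retry_after_seconds: Optional[int] = None  # when to retry, if transient
    resolution: Optional[str] = None           # what would fix a permanent failure

def next_step(error: AgentError) -> str:
    """Decide the agent's next action from the error alone."""
    if error.transient:
        return f"pause:{error.retry_after_seconds or 1}"
    if error.resolution:
        return f"escalate:{error.resolution}"
    return "abort"

rate_limited = AgentError(
    code="rate_limit_exceeded", message="Too many requests",
    transient=True, retry_after_seconds=30,
)
missing_scope = AgentError(
    code="insufficient_permissions:scope=write:resource=campaigns",
    message="This API key lacks the campaigns:write scope",
    transient=False, resolution="grant the campaigns:write scope to this key",
)
print(next_step(rate_limited))   # pause:30
print(next_step(missing_scope))  # escalate:grant the campaigns:write scope to this key
```

The point is that the agent never parses prose: classification drives branching, retry guidance turns failures into pauses, and resolution paths become escalation messages.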
Here is the design problem that most agent-oriented systems get wrong: they treat trust as binary. Either the agent operates fully autonomously, or a human approves every action. Both extremes fail.
Full autonomy fails because agents make mistakes. They hallucinate. They misinterpret context. They take actions that are technically correct but wrong in ways that require judgment to detect. An agent that can send emails autonomously will eventually send a bad email. The question is not whether this happens but how much damage it causes when it does.
Full human approval fails because it defeats the purpose. If a human reviews every action, the agent is just a suggestion engine with extra steps. The human's time is not saved. It is merely redirected from doing the work to reviewing the agent's work, which is often slower because reviewing requires context-switching.
The right design is a trust gradient — a system where the level of human oversight varies by action type, confidence level, and track record.
Low-risk actions with high confidence auto-execute. Publishing a social post that follows a template the human has approved before. Sending a transactional email triggered by a well-defined event. Updating a CRM field based on unambiguous data.
High-risk actions require approval. Sending outreach to a new segment. Publishing content that makes a claim not previously approved. Taking any action that involves money or legal commitments.
The gradient shifts over time. As the agent demonstrates reliability in a category, the threshold for auto-approval lowers. Actions that initially required human review graduate to auto-execution. The system earns trust through demonstrated competence, not through a configuration toggle.
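A trust gradient can be sketched as a per-category policy whose approval threshold relaxes as the agent builds a track record. The risk scores, threshold formula, and the rule that money always requires approval are illustrative assumptions:

```python
# Illustrative risk scores per action category (1.0 = never auto-execute).
RISK = {
    "publish_templated_post": 0.2,
    "send_outreach_new_segment": 0.95,
    "spend_money": 1.0,
}

class TrustGradient:
    def __init__(self):
        self.record = {}  # category -> (successes, attempts)

    def success_rate(self, category):
        ok, total = self.record.get(category, (0, 0))
        return ok / total if total else 0.0

    def decide(self, category, confidence):
        # Demonstrated competence lowers the bar; risk 1.0 never auto-executes.
        risk = RISK.get(category, 1.0)
        threshold = risk - 0.3 * self.success_rate(category)
        if risk < 1.0 and confidence >= threshold:
            return "auto_execute"
        return "require_approval"

    def observe(self, category, success):
        ok, total = self.record.get(category, (0, 0))
        self.record[category] = (ok + int(success), total + 1)
```

With this sketch, outreach to a new segment starts behind human approval; after a run of approved-and-successful outcomes in that category, the same confidence score clears the lowered threshold and graduates to auto-execution.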
In human-operated software, the primary interface is the dashboard — the place where the human understands the system's state and takes action. In agent-operated software, the primary interface is the approval flow — the place where the human reviews the agent's proposed actions and decides which ones to authorize.
This is a different kind of interface design problem. The approval flow needs to make the agent's reasoning legible. Why did the agent propose this action? What inputs led to this decision? What would happen if the action executes? What are the alternatives?
A good approval flow surfaces context, not just content. "Approve this email" is insufficient. "Approve this email, which was triggered by [this event], targets [this segment] because [this reasoning], and will reach [this many people]" gives the human enough information to make a judgment call without having to reconstruct the agent's logic from scratch.
Batch review matters too. If the agent produces twenty pieces of content, the human should not have to evaluate each one in isolation. Group them by type, highlight deviations from established patterns, and flag the ones that need attention while auto-approving the ones that match previous approvals. The goal is to minimize the human's cognitive load per approval decision while maintaining their ability to catch errors.
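A minimal sketch of both ideas: an approval item that carries context alongside content, and a batch reviewer that auto-approves items matching previously approved patterns. The field names and the exact-template matching rule are illustrative assumptions; a real system would match patterns more loosely:

```python
from dataclasses import dataclass

@dataclass
class ApprovalItem:
    action: str      # e.g. "send_email"
    trigger: str     # the event that led to the proposal
    audience: str    # who is affected, and how many
    reasoning: str   # the agent's stated rationale
    template: str    # the pattern the content follows, if any

def batch_review(items, approved_templates):
    """Split a batch into auto-approved items and ones needing human attention."""
    auto, needs_review = [], []
    for item in items:
        if item.template in approved_templates:
            auto.append(item)          # matches a previously approved pattern
        else:
            needs_review.append(item)  # deviation: surface it to the human
    return auto, needs_review
```

The human's queue then contains only the deviations, each with enough context (trigger, audience, reasoning) to judge without reconstructing the agent's logic.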
The quality of the approval flow determines whether an agent-operated system actually saves time or merely creates a new kind of busywork.
Human-operated software is interactive. The user is present while the system runs. They see what is happening in real time and intervene when something goes wrong.
Agent-operated software is asynchronous. The agent acts while the human is absent. Hours or days may pass between the agent's action and the human's review. The human needs to understand what happened after the fact, not during execution.
This makes observability the primary design concern. Every action the agent takes should produce a structured log that answers four questions: what happened, why it happened, what the outcome was, and what should happen next.
"Sent email" is a log entry. "Sent email to segment high-usage-trial (142 recipients) because feature webhook-retries was released 2 days ago. Open rate: 34%. 3 replies received, 1 flagged for human follow-up" is observability. The difference is the difference between a system you can monitor and a system you can only hope is working.
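A structured log entry answering those four questions can be as simple as one JSON object per action. The field names here are illustrative, not a schema the source prescribes:

```python
import datetime
import json

def log_action(what, why, outcome, next_step):
    """Emit one structured entry answering: what, why, outcome, what next."""
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "what": what,
        "why": why,
        "outcome": outcome,
        "next": next_step,
    }
    print(json.dumps(entry))  # in practice: ship to your log pipeline
    return entry

log_action(
    what={"action": "send_email", "segment": "high-usage-trial", "recipients": 142},
    why="feature webhook-retries released 2 days ago",
    outcome={"open_rate": 0.34, "replies": 3},
    next_step="1 reply flagged for human follow-up",
)
```

Because every entry has the same shape, the human can filter, aggregate, and trace decision chains after the fact instead of reading prose logs line by line.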
Observability also means the human can audit the agent's decision chain. When a mistake happens — and mistakes will happen — the human needs to trace the path: which input triggered the action, what reasoning the agent applied, where the logic went wrong. Without this trail, every mistake is a black box. You know something went wrong but not why, which means you cannot prevent it from happening again.
Agents do not use one tool. They compose many tools into workflows. An agent might read data from your CRM, generate content using an LLM, submit it through your distribution platform's API, and log the results in your analytics system. Your software is one node in a larger graph of tools the agent orchestrates.
This has design implications. Your API needs to be composable — meaning its inputs and outputs should be structured in ways that flow naturally into other systems. Return data in formats that other tools can consume without transformation. Accept inputs that other tools can produce without adaptation.
Webhooks matter more than they used to. In a human-operated world, the user polls the dashboard for updates. In an agent-operated world, the system needs to push events to the agent so it can react. Webhook reliability — retries, ordering guarantees, delivery confirmation — goes from nice-to-have to critical infrastructure.
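The reliability requirement can be sketched as at-least-once delivery with exponential backoff. The transport function is a stand-in for an HTTP POST, and real systems would also sign payloads and persist undelivered events rather than raise:

```python
import time

def deliver(post, url, event, max_attempts=5, base_delay=1.0):
    """Attempt delivery until the receiver confirms (post returns True on 2xx)."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        if post(url, event):
            return attempt         # delivery confirmed on this attempt
        time.sleep(delay)          # back off before retrying
        delay *= 2                 # exponential backoff: 1s, 2s, 4s, ...
    raise RuntimeError(f"undelivered after {max_attempts} attempts: {event['id']}")
```

At-least-once delivery means the receiving agent may see the same event twice, which is exactly why the next property, idempotency, stops being optional.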
Idempotency becomes essential. Agents retry. Networks fail. Events arrive twice. Every write operation in your API should be safely repeatable. If an agent sends the same request twice because it did not receive a response the first time, the system should produce the same result, not duplicate the action.
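One common way to make writes safely repeatable is an idempotency key supplied by the caller: a repeated request returns the stored result instead of re-executing. This is a sketch; the in-memory dict stands in for durable storage, and the endpoint name is hypothetical:

```python
_results = {}  # idempotency_key -> stored result (durable storage in practice)

def create_campaign(idempotency_key, payload):
    """Create a campaign; replaying the same key returns the original result."""
    if idempotency_key in _results:
        return _results[idempotency_key]   # replay: same result, no duplicate write
    result = {"id": f"cmp_{len(_results) + 1}", **payload}  # the actual write
    _results[idempotency_key] = result
    return result

first = create_campaign("key-123", {"name": "launch"})
again = create_campaign("key-123", {"name": "launch"})
assert first == again  # the retried request did not create a second campaign
```

The agent that never saw the first response simply resends with the same key, and the system converges on one action regardless of how many times the request arrives.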
If you are building software today, the question is not whether agents will operate your product. The question is when, and whether your product will be ready.
The transition will not happen all at once. It will start at the edges — simple, repetitive tasks where the cost of a mistake is low. Scheduling social posts. Updating CRM records. Sending templated emails. These are the tasks agents will take over first because they are well-defined, low-risk, and high-volume.
As agents prove reliable at the edges, they will move toward the center. More complex decisions. Higher-stakes actions. Tasks that today require human judgment will gradually shift to agent execution with human oversight.
The builders who prepare for this transition — by designing APIs as first-class products, building trust gradients into their permission systems, creating approval flows that respect human attention, and investing in observability — will have products that agents can operate effectively. The builders who do not will have products that agents struggle with, work around, or replace.
We built nacre.ai with this model from the start. The system operates autonomously in the background, escalates to humans through approval flows when confidence is low, and provides full observability into every decision and outcome. Not because we predicted a trend, but because distribution is exactly the kind of high-volume, pattern-driven, measurable work that should not require constant human attention.
The best software disappears into the background. It runs, it learns, it surfaces only what needs human judgment. That is what agent-operated software looks like. The tools and patterns to build it exist today. The question is whether you start now or wait until your users' agents force the issue.