When to Use an AI Agent Instead of Another Script

Most automation problems don't need an AI agent. They need a script. I've been writing automation for 20 years, and the thing that's annoyed me most about the past two years of AI buzz is the assumption that everything suddenly needs a "smart" solution when a cron job and some grep would solve it faster.

That said, there are real use cases where agents add value. The question isn't "is this shiny?" It's "does this actually reduce maintenance burden or just add another thing to debug at 2am?"

Let me break down when an agent makes sense and when you're just adding complexity for no reason.

—

What an AI Agent Actually Is (And What It Isn't)

Before we decide anything, let's get precise. An AI agent in this context means a system that uses a large language model to:

Decide actions based on context (not hardcoded logic)
Maintain state across multiple steps
Handle ambiguous inputs without crashing
Potentially call external tools or APIs autonomously

A script is a deterministic sequence of instructions. Input A always gives Output B. You can trace every line.

An agent is probabilistic. It might give you Output B, or it might give you something close enough that you don't notice until three weeks later when your monitoring catches drift.

Both have their place. The problem is that people reach for agents when they want to feel modern, not when they need the capability.

—

When Scripts Still Win (Most of the Time)

If your problem has clear inputs, known outputs, and defined steps, write a script. Here's why:

You can debug it. When a script fails, you read the error, fix the line, run it again. When an agent fails, you might not know why it decided to delete the wrong directory or respond in Portuguese.

It doesn't hallucinate. A script won't make up a configuration value. An agent might "fill in" something that looks reasonable but is completely wrong.

It's faster to write. A Python script that parses logs and emails a summary takes me 30 minutes. An agent that does the same with reliable output takes half a day of prompting and testing.

Maintenance is predictable. Your script breaks when its inputs change. An agent breaks when the model updates, when the API changes, or when it hits an edge case that your prompts and tests never covered.

Scripts win for: log rotation, backup verification, user provisioning, health checks, metric collection, certificate renewal, anything with known patterns.
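To make the "parse logs and summarize" case concrete, here is a minimal sketch of the deterministic half. The log line shape, the level names, and the function names are assumptions for illustration, not taken from any real system. The emailing step is left out; it would normally be a few more lines with the standard library's smtplib.

```python
import re
from collections import Counter

# Assumed line shape: "2024-05-01 12:00:00 LEVEL message"
LINE_RE = re.compile(r"^(\S+ \S+) (DEBUG|INFO|WARNING|ERROR|CRITICAL) (.*)$")


def summarize(lines):
    """Count lines per level and collect ERROR/CRITICAL messages."""
    counts = Counter()
    errors = []
    unparsed = 0
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            # Lines that do not fit the pattern are counted, never guessed at.
            unparsed += 1
            continue
        _timestamp, level, message = match.groups()
        counts[level] += 1
        if level in ("ERROR", "CRITICAL"):
            errors.append(message)
    return {"counts": dict(counts), "errors": errors, "unparsed": unparsed}


def format_summary(summary):
    """Render the summary dict as plain text suitable for an email body."""
    parts = [f"{level}: {n}" for level, n in sorted(summary["counts"].items())]
    body = ["Log summary", ", ".join(parts), f"Unparsed lines: {summary['unparsed']}"]
    body += [f"  - {msg}" for msg in summary["errors"]]
    return "\n".join(body)
```

The same input always produces the same summary, and a line that does not match is reported as unparsed instead of being interpreted.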

—

When an Agent Actually Makes Sense

Now for the cases where I've actually reached for an agent and been glad I did:

Unstructured data at scale. We had 10 years of accumulated config files, some in different formats, some with partial data, some with comments that mattered. Writing a parser for each variant would have taken weeks. An agent with the right prompt processed all of them in an afternoon and flagged the 15% that needed human review.

Natural language inputs. We built an internal tool where support folks type requests like "unlock my account and check if there's anything weird in the logs from the past hour." Doing that with regex and case statements is a nightmare. An agent can parse intent and call the right APIs.

On-call summarization. This one surprised me. Our on-call notes are messy: different people, different formats, some details in chat, some in tickets. An agent that pulls from multiple sources and generates a morning briefing has been genuinely useful. It misses things, but it gets about 80% of the way there and saves 20 minutes per shift.

Multi-system orchestration with ambiguity. When you're dealing with systems that don't have clean APIs, don't log consistently, and sometimes require human judgment, an agent can hold context across steps in a way a script can't.

The pattern in all of these: ambiguous inputs, multiple data sources, or cases where the "right" answer requires judgment rather than calculation.
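The "flagged the 15% that needed human review" step in the config-file example is the part that is easiest to get wrong, so here is one way it could look. This is a sketch under assumptions, not the setup described above: the schema in REQUIRED_FIELDS is hypothetical, and extract stands in for whatever function calls the agent. The idea is that deterministic code, not the model, decides which outputs are trusted.

```python
# Hypothetical schema: field name -> exact type the extracted value must have.
REQUIRED_FIELDS = {"hostname": str, "port": int}


def validate_record(record):
    """Return a list of problems; an empty list means the record passes."""
    if not isinstance(record, dict):
        return ["output is not a mapping"]
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif type(record[field]) is not expected_type:
            problems.append(f"wrong type for {field}")
    return problems


def process_all(files, extract):
    """Run extract() over (name, text) pairs and split the results.

    extract is the (assumed) agent call, injected so this function can be
    tested without a model. Anything that fails or looks wrong is routed
    to a human instead of being accepted.
    """
    accepted, needs_review = {}, {}
    for name, text in files:
        try:
            record = extract(text)
        except Exception as exc:
            needs_review[name] = [f"extract failed: {exc}"]
            continue
        problems = validate_record(record)
        if problems:
            needs_review[name] = problems
        else:
            accepted[name] = record
    return accepted, needs_review
```

The review queue carries the reason each file was flagged, which keeps the human pass short.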

—

A Real Example: My Content Site Workflow

I run a content site on the side. For a while, I tried using an agent to handle the workflow: scrape feeds, rewrite summaries, schedule posts, respond to comments.

It was a disaster.

The agent would occasionally rewrite headlines in ways that changed meaning. It would miss moderation cues. It would schedule posts at weird times because it interpreted "morning" as 3am in my timezone.

I switched to scripts for the heavy lifting—scraping, formatting, scheduling. The scripts are boring and reliable. I kept an agent only for one specific task: generating multiple headline variants so I could pick the best one.

That worked. One narrow task, human in the loop, clear success criteria.

The lesson: agents are good at ideation and first drafts. They're bad at execution where you need guarantees.
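The "morning means 3am" failure above is the kind of thing a few lines of deterministic code can catch before a post is scheduled. A minimal sketch, assuming a fixed UTC-5 offset for the site owner and a hypothetical definition of "morning" as 06:00 to 11:00 local time. A real setup would more likely use zoneinfo.ZoneInfo with a named zone so daylight saving is handled.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: fixed UTC-5 offset. Swap in zoneinfo.ZoneInfo for DST handling.
LOCAL_TZ = timezone(timedelta(hours=-5))

# Hypothetical definition of "morning" in local time: [06:00, 11:00).
MORNING = (time(6, 0), time(11, 0))


def in_morning_window(proposed):
    """Return True if a timezone-aware datetime falls in the local morning."""
    if proposed.tzinfo is None:
        # A naive datetime is exactly how "morning" turns into 3am.
        raise ValueError("proposed time must be timezone-aware")
    local = proposed.astimezone(LOCAL_TZ)
    start, end = MORNING
    return start <= local.time() < end


# An agent that proposes 08:00 UTC is really proposing 03:00 local time.
proposed = datetime(2024, 5, 1, 8, 0, tzinfo=timezone.utc)
if not in_morning_window(proposed):
    print("rejected:", proposed.astimezone(LOCAL_TZ).isoformat())
```

The agent can still suggest a time; the script decides whether the suggestion is allowed.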

—

Failure Modes Nobody Talks About

Here's what breaks in production with agents:

Model drift. You test with GPT-4o in March, it works great. In June, an update changes behavior slightly. Your agent suddenly starts failing edge cases it handled before. You might not notice until users complain.

Prompt injection. If your agent processes any external input—user requests, scraped content, API responses—someone will eventually try to exploit it. I've seen agents leak context or execute instructions embedded in supposed "user names."
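One common mitigation is to treat whatever the model proposes as untrusted input and check it against an allowlist before anything runs. The sketch below assumes the agent returns a dict with "action" and "args" keys; the action names and parameters are hypothetical. This does not prevent injection. It only limits what an injected instruction can trigger.

```python
# Hypothetical allowlist: action name -> argument names it may carry.
ALLOWED_ACTIONS = {
    "unlock_account": {"username"},
    "fetch_logs": {"username", "minutes"},
}


def validate_action(proposed):
    """Check a model-proposed action before any code executes it.

    Returns (name, args) when the action is allowed. Raises PermissionError
    for unknown actions and ValueError for malformed or unexpected arguments.
    """
    if not isinstance(proposed, dict):
        raise ValueError("proposed action must be a dict")
    name = proposed.get("action")
    if name not in ALLOWED_ACTIONS:
        raise PermissionError(f"action not allowed: {name!r}")
    args = proposed.get("args", {})
    if not isinstance(args, dict):
        raise ValueError("args must be a dict")
    unexpected = set(args) - ALLOWED_ACTIONS[name]
    if unexpected:
        raise ValueError(f"unexpected arguments: {sorted(unexpected)}")
    return name, args
```

An instruction hidden in a "user name" can still confuse the model, but it cannot make the system run an action that is not on the list.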

Cost surprise. Agents aren't free. Each run costs money, and if you scale up without watching, you get a bill that makes you wince. A script on cron costs close to nothing per run.

Debugging is hard. When your script fails, you get a stack trace. When your agent fails, you get "the model didn't produce the expected output" and no clear path to fix it.

Hidden state. Agents that keep memory or conversation history between runs can accumulate context that affects future runs in ways you don't track. I've seen agents that seemed to work fine for weeks, then started behaving differently because of accumulated assumptions.
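For the model drift and hidden state problems above, one way to notice a behavior change before users do is to rerun a small set of fixed "golden" inputs on a schedule and after every model or prompt change. A minimal sketch, assuming the agent is a callable that takes text and returns a dict; the cases and action names are hypothetical.

```python
def is_action(expected):
    """Build a check that passes when the output is a dict naming `expected`."""
    def check(output):
        return isinstance(output, dict) and output.get("action") == expected
    return check


# Hypothetical golden cases: fixed inputs paired with a check on the output.
GOLDEN_CASES = [
    {"name": "unlock", "input": "unlock my account", "check": is_action("unlock_account")},
    {"name": "logs", "input": "anything weird in the logs?", "check": is_action("fetch_logs")},
]


def run_golden_cases(agent, cases):
    """Return the cases whose output no longer satisfies its check."""
    failures = []
    for case in cases:
        try:
            output = agent(case["input"])
        except Exception as exc:
            # A crash counts as a failure and is recorded as the output.
            output = f"raised {type(exc).__name__}: {exc}"
        if not case["check"](output):
            failures.append({"name": case["name"], "output": output})
    return failures
```

Checks are written against properties of the output, not exact strings, since exact wording is the part most likely to change between model versions.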

—

What I Would Do First

If you're looking at a problem and wondering if you need an agent, start here:

Write a script first. You'll learn the problem space. You'll identify the hard parts. You'll probably solve 80% of it with straightforward code.

Only add an agent for the 20% that's genuinely ambiguous. The messy inputs, the natural language, the "it depends" decisions.

Keep a human in the loop. Let the agent generate drafts, summaries, suggestions. Let it do the boring research. But have a person press the button on anything that matters.

Monitor cost from day one. Set alerts. Track runs. Know what you're spending before you scale.

Plan for maintenance. Agents need prompt reviews, model updates, and retesting. Budget time for this. It's not "set it and forget it."
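Two of the steps above, the human in the loop and cost monitoring, are small enough to sketch. Everything here is an assumption for illustration: the class and function names are hypothetical, the per-run cost is whatever estimate you pass in, and the alert threshold of 80% is arbitrary.

```python
class RunBudget:
    """Track estimated spend and report when a daily limit is near or hit."""

    def __init__(self, daily_limit_usd, alert_fraction=0.8):
        self.daily_limit_usd = daily_limit_usd
        self.alert_fraction = alert_fraction
        self.spent_usd = 0.0

    def status(self):
        """Return 'ok', 'alert', or 'blocked' for the current spend."""
        if self.spent_usd >= self.daily_limit_usd:
            return "blocked"
        if self.spent_usd >= self.alert_fraction * self.daily_limit_usd:
            return "alert"
        return "ok"

    def record(self, cost_usd):
        """Add one run's estimated cost and return the new status."""
        self.spent_usd += cost_usd
        return self.status()


def execute_with_approval(draft, approve, execute):
    """Run execute(draft) only if the human-facing approve(draft) says yes.

    approve and execute are injected callables; the agent only ever
    produces the draft and never calls execute itself.
    """
    if not approve(draft):
        return None
    return execute(draft)
```

A caller would check budget.status() before each agent run and skip the run when it returns "blocked", then pass the agent's draft through execute_with_approval so that a person presses the button.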

If, after all that, a script still solves your problem, use the script. The goal is getting work done, not using the trendiest tool.