
Agentic Software Development: How My Team Actually Ships Code in 2026

Written by Marko Balažic · Updated on May 6, 2026

Agentic software development is the cleanest description of how my team works in 2026. Agents do the work. Engineers supervise. The unit of progress is no longer 'lines of code I typed today' but 'what I successfully delegated and verified.'

I've been writing software for 15 years. The shift in the last 18 months is the biggest tooling change I've seen in my career, including the move from Subversion to Git, the rise of containers, and the cloud transition. None of those changed the daily rhythm of writing code as deeply as agents have.

This is what agentic software development actually looks like inside a working studio — the patterns, the failure modes, and what I tell new engineers at Shape on day one.

What 'agentic' actually means in software work

The word gets thrown around a lot. Stripped to its core: an agent is a system that reads a goal, decides on a sequence of actions, executes those actions using tools (file edits, shell commands, web searches, API calls), and adapts based on what it sees.

The thing that makes a development workflow 'agentic' isn't whether you use Copilot. Copilot autocompletes. That's helpful but not agentic. Agentic is when you say 'fix the failing tests in the auth module' and walk away while the system reads the failures, edits the code, runs the tests, sees what still fails, edits again, and reports back when it's green.

The difference is who's holding the loop. With autocomplete, you're holding the loop and the model is helping you type. With agents, the agent is holding the loop and you're verifying the result.

What changes day-to-day

Five things change when your team adopts agentic development properly. None of them are obvious from the marketing.

1. The unit of work shrinks and the unit of supervision grows

I used to spend my day writing code. I now spend my day writing prompts, reviewing PRs from agents, and writing evals. The total amount of code shipped per week is up roughly 4–6x. The amount of code I personally typed is down 80%.

This is a different job. Engineers who treat agents like a smarter autocomplete leave most of the leverage on the table. Engineers who treat agents like junior direct reports they're managing get the full benefit.

2. Specs become load-bearing

Vague tickets used to be fine because the engineer would interpret. Vague prompts to an agent produce confidently wrong code. So the spec writing happens up front, in detail, and that's where most of the thinking now lives.

Practical version: a good 200-word spec to an agent is worth more than 2 hours of pair programming. Most engineers haven't internalized this yet.

3. Tests get written first, finally

Agents are extraordinary at writing code that compiles. They are mediocre at writing code that's correct. The way to bridge the gap is automated verification — unit tests, integration tests, eval suites — written before or alongside the work, not after.

TDD has been preached for 20 years and adopted by maybe 15% of teams. Agentic development forces it because it's the only way to trust the output without reading every line.
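A concrete sketch of what 'tests first' looks like when handing work to an agent. `parse_retry_after` is a made-up function, and the stub implementation is only there so the example runs; the point is that the edge-case tests pin down what 'correct' means before the agent writes a line.

```python
# Sketch: the tests a human writes BEFORE delegating the task.
# parse_retry_after is hypothetical; the body below is a stub so this runs.

def parse_retry_after(value: str) -> int:
    """Parse an HTTP Retry-After header value into seconds (stub)."""
    value = value.strip()
    if not value.isdigit():
        raise ValueError(f"unsupported Retry-After value: {value!r}")
    return int(value)

def test_happy_path():
    assert parse_retry_after("120") == 120

def test_edge_cases():
    # Written first, as the contract the agent must satisfy.
    assert parse_retry_after(" 0 ") == 0
    for bad in ("", "-5", "12.5"):
        try:
            parse_retry_after(bad)
        except ValueError:
            pass
        else:
            raise AssertionError(f"{bad!r} should be rejected")

test_happy_path()
test_edge_cases()
print("all edge-case tests pass")
```

Once these exist, 'make the tests pass' is a delegable instruction, and green output is something you can trust without reading every line.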

4. Code review changes shape

Most code review comments used to be style and small mistakes. Agents nail style and small mistakes. The valuable review now is at the architectural level: 'are we doing the right thing?' and 'what does this break that the agent didn't see?'

This is harder, slower per PR, but it's the right kind of hard. Senior engineers become more leveraged, not less.

5. The on-ramp for new engineers gets weirder

An engineer with 2 years of experience and good agent skills now matches the output of a 5-year engineer without them. But that 2-year engineer might not understand why the code works, only that it does. Long-term, that's a problem.

We compensate by requiring engineers to read every line they ship and explain why it's there — not in PRs, in conversation. Agents accelerate output. They don't replace understanding.

The agentic stack we use at Shape

This changes every quarter. Today (May 2026):

  • Primary agent: Claude Code, run in the terminal. Most senior engineers also keep Cursor open for inline edits.
  • Eval framework: Custom, lightweight, basically a Python harness over a JSON test bank. We tried Braintrust and the like — too heavy for our scale.
  • Orchestration: n8n for workflows that involve agents, APIs, and humans. Cron + Lambda for pure background jobs.
  • Memory and context: per-project CLAUDE.md and AGENTS.md files in the repo. Skills directory for repeat workflows. Versioned in git like any other code.
  • Code review: Pull request first, agent review for nits, human review for architecture, merge gates on tests + lint + a simple eval pass.
  • Production monitoring: agents in the loop on incident triage but never on automated remediation in prod. Humans approve every push.
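For context, an eval harness like the one described above can be tiny. This is a minimal sketch in the same spirit: a JSON test bank, a `model_fn` hook, pass/fail counting. The bank format and names are illustrative assumptions, not our actual code.

```python
import json

# Minimal sketch of a "Python harness over a JSON test bank".
# The bank schema and model_fn hook are assumptions for illustration.

TEST_BANK = json.loads("""
[
  {"id": "greet-1", "input": "hello", "expected": "HELLO"},
  {"id": "greet-2", "input": "agents", "expected": "AGENTS"}
]
""")

def run_evals(model_fn, bank):
    """Run model_fn over every case; return (passed_count, failed_ids)."""
    failed = [c["id"] for c in bank if model_fn(c["input"]) != c["expected"]]
    return len(bank) - len(failed), failed

# Trivial stand-in for the system under test.
passed, failed = run_evals(str.upper, TEST_BANK)
print(f"{passed}/{len(TEST_BANK)} passed, failures: {failed}")
```

The whole point is that this stays small enough to read in one sitting; heavier platforms earn their keep at a scale we don't have.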

The patterns that work

Five concrete patterns we lean on every day.

Spec → plan → execute

Don't ask an agent to do the thing. Ask it to write a plan first. Review the plan. Then ask it to execute. This catches 80% of the misunderstandings before any code is written.
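The gate above can be sketched in a few lines. The agent is modeled as a plain callable and plan review as an approval hook; every name here is illustrative, not a real agent API.

```python
# Sketch of the spec -> plan -> execute gate. All names are illustrative.

def spec_plan_execute(agent, spec, approve):
    plan = agent(f"Write a step-by-step plan for:\n{spec}\nDo not write code yet.")
    if not approve(plan):
        return None  # misunderstanding caught before any code was written
    return agent(f"Execute this approved plan:\n{plan}")

# Toy agent: returns a canned plan, then a canned result on execution.
def toy_agent(prompt):
    if prompt.startswith("Execute"):
        return "DONE"
    return "PLAN: rename the field, update call sites, run the tests"

result = spec_plan_execute(toy_agent, "Rename user.name to user.full_name",
                           approve=lambda plan: "rename" in plan)
print(result)
```

The review step is where the misunderstandings get caught: reject the plan and nothing downstream happens.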

Tight loops with verification

The shorter the cycle between 'agent edits code' and 'verifier runs tests,' the better the result. Aim for sub-minute cycles. Anything over 5 minutes degrades fast because the agent loses context.
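That loop, with a cycle-time budget, looks roughly like this. `edit_step` and `verify` are stand-ins for 'agent edits code' and 'run the test suite'; nothing here is a real agent API.

```python
import time

# Sketch of a tight edit/verify loop with a per-cycle time budget.

def converge(edit_step, verify, max_cycles=10, cycle_budget_s=60.0):
    """Alternate edits and verification; stop when the suite goes green."""
    for cycle in range(1, max_cycles + 1):
        start = time.monotonic()
        edit_step()
        green = verify()
        if time.monotonic() - start > cycle_budget_s:
            print(f"cycle {cycle} blew the budget: loop is too slow")
        if green:
            return cycle
    return None  # never went green within max_cycles

# Toy stand-in: each 'edit' fixes one bug; verify passes at zero bugs.
bugs = [3]
def edit_once():
    bugs[0] -= 1

cycles = converge(edit_once, verify=lambda: bugs[0] == 0)
print(f"green after {cycles} cycles")
```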

Subagents for parallelizable work

Long tasks split into independent subtasks run as parallel subagents. The orchestrator agent coordinates. We use this for things like 'generate hero images for these 5 articles' or 'add type definitions to all 12 files in this module.'
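The fan-out is simple to sketch with standard-library concurrency. `run_subagent` is a placeholder for launching a real agent session per subtask.

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch of fanning independent subtasks out to parallel subagents.
# run_subagent is a stand-in for a real agent session.

def run_subagent(subtask: str) -> str:
    return f"done: {subtask}"  # placeholder for actual agent work

def orchestrate(subtasks):
    """Run independent subtasks in parallel; collect results in input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_subagent, subtasks))

results = orchestrate([f"add type definitions to module_{i}.py"
                       for i in range(1, 4)])
print(results)
```

The constraint that makes this safe is independence: subtasks that touch the same files need one agent holding the whole task, not a fan-out.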

Skills, not snowflakes

Repeated workflows get codified as 'skills' — named, versioned, reusable prompts with instructions, examples, and the right tool access. We have skills for blog publishing, image generation, eval runs, deploy checks. Each skill is one file, kept in git.
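A skill file can be as simple as plain text with a small header. The format below is a made-up illustration of the 'one file, kept in git' idea, with a matching loader; it is not our actual skill schema.

```python
# Illustrative one-file skill format (an assumption, not a real schema):
# a few header lines, a separator, then freeform instructions.

SKILL_FILE = """\
name: blog-publish
tools: read_file, write_file, shell
---
Instructions: convert the draft in drafts/ to HTML, run the link checker,
then open a PR. Include one example post in the prompt context.
"""

def load_skill(text):
    header, _, body = text.partition("---")
    meta = dict(line.split(": ", 1) for line in header.strip().splitlines())
    meta["instructions"] = body.strip()
    return meta

skill = load_skill(SKILL_FILE)
print(skill["name"], "|", skill["tools"])
```

Because it's one plain-text file, it diffs, reviews, and versions like any other code.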

Plain-text everything

Specs, plans, evals, prompts, postmortems — all in plain Markdown in the repo. Agents read it, humans read it, search works on it, version control works on it. The further you drift from plain text, the harder it is for agents to work with your artifacts.

What still breaks

I'm bullish on agentic development but it isn't magic. Three things still go wrong regularly.

Hallucinated APIs

Agents invent function signatures and library calls that don't exist. Mitigation: ground them in your actual code via search, link the library docs, and run the code before declaring done.
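One cheap pre-run check along these lines: statically scan the agent's output for imports that don't resolve in your environment. This is an illustrative guard, not a substitute for actually executing the code.

```python
import ast
import importlib.util

# Sketch: flag imports in agent-generated Python that don't resolve locally.

def missing_imports(source: str):
    """Return top-level imported modules that can't be found."""
    mods = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            mods.update(a.name.split(".")[0] for a in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            mods.add(node.module.split(".")[0])
    return sorted(m for m in mods if importlib.util.find_spec(m) is None)

agent_code = "import json\nimport totally_made_up_http_lib\n"
print(missing_imports(agent_code))  # the hallucinated module gets flagged
```

It won't catch invented function signatures on real libraries, which is why running the code before declaring done is still the real gate.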

Confident wrongness on edge cases

The 90% case ships beautifully. The 10% edge case has a subtle bug the agent didn't think to test. Mitigation: write the edge cases first, as failing tests, and let the agent make them pass.

Context bleed across sessions

Long sessions accumulate context that pollutes later decisions. Mitigation: start a fresh session for each major task. Treat session memory as a temporary scratchpad, not a database.

What this means for hiring

I'm increasingly hiring for two skills that didn't matter as much three years ago:

  • Spec writing. Can you describe a complex change in clear, unambiguous prose? Lots of strong engineers can't, and they struggle with agentic workflows.
  • Verification design. Can you build the test or eval that catches the failure mode? This is the meta-skill that separates engineers who get 5x leverage from agents from those who get 1.5x.

Languages and frameworks matter less. The agent can write in any of them. What it can't do well is decide what 'correct' means in your domain.

The honest summary

Agentic software development is the new default. Within 18 months, every functioning engineering team will work this way or be outshipped by teams that do.

If you're a founder shipping AI software in 2026, hire engineers who already work this way — or hire a studio that does. Building product the old way isn't safer; it just looks safer because you can see the cost of every line of code you typed.

If you want to see what a team running fully agentic looks like in practice, book a call. Happy to walk through the workflow live.

Written by Marko Balažic, founder of Shape — an AI venture studio whose team runs agent-first across product, design, and engineering. Reach out to talk shop.
