22 May 2026

How to hire AI-augmented software engineers.

What you need to know about evaluating engineers in the age of AI-assisted software development.

Why This Guide Exists

If you're hiring software engineers in 2026 and still running the same traditional interviews you ran in 2024 to assess the same traditional engineering skillsets, you're filtering out the people you need most.

The best engineers now work differently. They direct AI agents, review output, and ship at a pace that wasn't possible two years ago. Ryan Dahl, creator of Node.js, said it plainly in January 2026: the era of humans writing code by hand is over. Steve Yegge wrote close to a million lines of code last year using AI tools - rivalling his entire 40-year career output. Gene Kim reports being 10x faster.

These aren't fringe voices. They're describing how the best engineers now work.

Right now, hiring engineers who work this way is a competitive advantage. Within a year, it will be standard practice. The companies adapting their evaluation methods now are getting first pick of this talent. The ones still running whiteboard interviews are screening it out.

We've made over 300 critical engineering hires for pre-seed to Series C startups with outlier ambition. This guide distils what we've learned from watching this shift happen - and helping our clients adapt their hiring to capture it.

How to Use This Guide

If you have 5 minutes: Read "The Talent Landscape", "The Thesis", and "The Quick Version".

If you're preparing to interview: Read "How to Interview" and "A Practical Assessment". These give you exact questions and a working session format.

If you're deciding whether to hire someone: Read "What You're Actually Evaluating" and "Common Mistakes".

If you've just made a hire: Read "After the Hire" and "Competing for This Talent".

If you want to hire better engineers than everyone else in 2026: Read it all.

The Talent Landscape

Here's what we're seeing in the market. These categories are defined by how someone works, not how good they are at traditional engineering.

Category Zero: Not Realistic to Hire (Yet)

The defining characteristic: Engineers orchestrating many parallel AI agents simultaneously. Steve Yegge describes running 20-30 agents at once, treating tools like Claude Code as building blocks in larger automated workflows.

This level exists. But in Australia in early 2026, you're not going to find these people available for hire into a normal team environment. They're either building their own things or working in contexts that look nothing like traditional employment.

Note: If you're reading this in late 2026, this category may have become what Category One is today. Things move that fast.

Category One: Rare

The defining characteristic: They don't write code themselves anymore. Not physically, anyway. They direct AI agents, review output, and intervene when judgment is required - but the days of manually typing syntax are behind them.

They've developed the skill to break problems down, prompt effectively, manage context across sessions, and catch AI mistakes before they compound. Their velocity is multiples of what traditional engineering allows.

These engineers are genuinely scarce. You might interview twenty excellent traditional engineers before finding one. In Australia in early 2026, this category represents the best you'll realistically hire.

Category Two: Uncommon

The defining characteristic: They're meaningfully using AI, but they're still writing code themselves. AI is a tool they use - a significant one - but they haven't made the full transition to directing rather than writing.

They're developing judgment about AI limitations. They've changed how they work. They're getting real productivity gains, not just autocomplete convenience. But there's still a gap between how they work and how Category One works.

With the right environment and problems, they can reach Category One within months. These are the engineers worth investing in if you can't find or afford Category One.

Category Three: Most of the Market

The defining characteristic: They're "using AI" but their work looks essentially the same as it did two years ago. They might use Copilot for autocomplete or occasionally ask Claude to build or fix something, but there's no meaningful workflow change and no significant productivity gain.

Simple prompts, no verification discipline, no evolution in how they approach problems.

Without guidance from someone in the first two categories, they're unlikely to develop. And even with guidance, it's hard to trust they'll keep pace as tools continue to change.

For an ambitious startup, this third category is increasingly unhireable. Not because they're bad engineers - many are excellent at traditional development. But the gap between what they can produce and what a Category One engineer can produce is already significant and widening.

The Thesis

The way of working described in this guide will itself be outdated within one to two years. Possibly sooner. The tools are evolving that fast. Today's Category One will be tomorrow's Category Two.

Think of it as a train leaving the station. The train is the shift in how software gets built - and it's accelerating. You want engineers who are driving the train or who have already boarded. Not those still on the platform, watching it leave.

Category One and Two engineers will stay close to where things are going. When the tools change, they'll change with them. That's what you're hiring for.

You're not hiring for a fixed skillset. You're hiring for someone already on the train.

What You're Actually Evaluating

To be an excellent AI-augmented engineer, you first need to be an excellent traditional engineer. That's the prerequisite - the foundation. Without deep experience, you can't judge when AI is wrong, why it's wrong, or how to fix it.

But being an excellent traditional engineer does not make you a good AI-augmented engineer. Many exceptional engineers allow ego and attachment to how they've always worked to prevent them from adapting. They stay excellent at something that matters less every month.

You're looking for both: strong fundamentals AND genuine adaptation to new ways of working. One without the other isn't enough.

Here's why that matters:

The AI proposes a solution. Most of it is correct - but not all. A junior engineer might miss what's broken. A senior engineer working without AI would not create these problems manually, but would spend days building the working parts themselves. An AI-augmented senior engineer gets both: speed on the bulk of the work and judgment on the parts that matter.

Why Judgment Matters

Without experienced judgment, things go wrong quickly.

Steve Yegge shared a memorable example: he asked Claude Code to fix a failing test suite, and it did - by silently deleting 80% of the tests.

Gene Kim documented teams that, after just two weeks of AI-assisted development without proper oversight, produced code so opaque and brittle it required a full two-day stand-down to make it operable again.

The same velocity that lets you ship fast lets you create a haunted codebase fast. You need engineers who understand this.
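One way to picture a guardrail against the deleted-tests failure is a check that compares the size of the test suite before and after an AI-made change. The sketch below is illustrative only: the `TestCounts` shape, the `suiteShrankSuspiciously` function, and the 5% default threshold are all assumptions, not something taken from Yegge's or Kim's accounts.

```typescript
// Hypothetical shape: test counts captured before and after a change.
interface TestCounts {
  total: number;
  passing: number;
}

// Returns true when the suite lost more tests than the allowed ratio.
// A "fixed" suite that is green only because tests vanished trips this check.
// The 5% default is an arbitrary illustrative threshold.
function suiteShrankSuspiciously(
  before: TestCounts,
  after: TestCounts,
  maxDropRatio: number = 0.05,
): boolean {
  if (before.total === 0) {
    return false; // nothing to lose, so nothing to flag
  }
  const dropped = before.total - after.total;
  return dropped / before.total > maxDropRatio;
}

// Example: 100 tests with 60 passing becomes 20 tests, all passing.
const flagged = suiteShrankSuspiciously(
  { total: 100, passing: 60 },
  { total: 20, passing: 20 },
);
console.log(flagged ? "Review required: tests were removed" : "Suite size OK");
```

A check like this does not replace review. It only makes one specific failure mode loud instead of silent.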

What to Screen Out

Junior engineers using AI to mask inexperience. The time savings become senior engineer review burden.
Senior engineers who dismiss AI or use it only for autocomplete. They're leaving capacity unused.
Anyone who can't articulate specific limitations and failure modes. They haven't used the tools seriously.
Engineers without product sense. When building is fast and cheap, judgment about whether it should be built becomes even more important.

On Tech Stack

Language and framework familiarity matters less than it used to. AI-augmented engineers ramp up on unfamiliar stacks faster than traditional engineers could.

Domain knowledge and architectural thinking still matter. Specific language experience is no longer a hard filter.

Whilst all of this is true, be wary of engineers suggesting that all engineering is now "easy". A front-end engineer is not suddenly an excellent back-end engineer, even with excellent use of AI.

How to Interview for This

Traditional interviews - whiteboard coding, algorithm puzzles, take-home projects - test traditional engineering skills. Those skills still matter as a foundation, but they're a shrinking part of what makes someone effective. Testing only for them misses what actually differentiates candidates now.

You need different questions.

On Workflow and Tools

Ask: Describe your AI-driven development workflow. Walk me through how you'd approach a non-trivial feature from start to finish. What tools do you use, and why?

Good answers: Specific and opinionated. They name actual tools - Claude Code, Cursor, Copilot Workspace, specific MCP servers they've configured - and explain why those over alternatives. They describe planning with AI before coding, running multiple sessions in parallel, and managing context across sessions. They acknowledge that tooling changes fast and explain how they stay current.

Red flags: Vague references to "using AI sometimes" or "Copilot helps with autocomplete." These suggest minimal usage. If they can't describe a concrete workflow, they don't have one.

On Judgment and Limitations

Ask: Where have you found AI tools to be weakest? How have you worked around those weaknesses? Tell me about a time AI-generated code caused a problem you had to fix.

Good answers: They name specific failure modes - hallucinated APIs, incorrect assumptions about state, security patterns that looked right but weren't, tests that pass but don't test what they should. They describe guardrails they've built: testing strategies, review processes, categories of changes they don't let AI make unsupervised.

Red flags: Dismissing AI entirely ("I don't trust it for anything serious") or trusting it uncritically ("it's always worked for me"). The right answer lives in the middle.

On Planning and Prompting

Ask: How do you approach planning a significant piece of work with AI? Walk me through a recent example - how did you break it down, what did your prompts look like, how did the plan evolve?

Good answers: They describe meaningful planning time before code is written. They break work into stages, have the AI assess the codebase and constraints, ask clarifying questions, iterate on the approach. Their prompts are specific and contextual - they're having a conversation, not issuing commands.

Red flags: Simple prompts for complex work. If someone's approach to building a feature is to type "build me X feature" and wait for output, they're not working effectively.

On Verification and Quality

Ask: What's your approach to testing AI-generated code? How do you decide what needs human review versus what you trust automated checks to catch?

Good answers: They understand AI is good at writing tests, so coverage should be high. They distinguish between tests that verify the AI did what it was asked and tests that verify what was asked was correct. They describe using type systems, static analysis, and linting to catch errors continuously.

Red flags: Trusting that "if it runs, it works" or not having a testing strategy. If someone isn't verifying AI output systematically, they're accumulating problems they haven't found yet.

On Trajectory and Learning

Ask: What's changed about how you work in the last six months? What do you expect to change in the next six?

Good answers: They describe specific evolution - new tools adopted, old habits dropped, experiments run. They have opinions about where things are heading, even if those opinions prove wrong. They're actively learning, not coasting.

The best have logged serious hours. It takes close to a year of daily use before you can reliably predict what an LLM will do and where it will fail.

Red flags: Their workflow is identical to six months ago, or they seem unaware that things are moving quickly.

A Practical Assessment

Questions reveal how someone thinks. A working session reveals how they actually work.

Design an assessment that tests the full cycle: planning, building, and reviewing.

Phase 1: Planning (30-45 Minutes)

Give them a well-defined problem and have them plan the approach with AI. Watch how they break down the work, how they prompt, how they iterate.

Example task: "We have a Node.js API with these endpoints [provide spec]. We need to add rate limiting per user with configurable limits stored in Redis. Plan how you'd implement this."

What you're looking for: Thoughtful decomposition. Someone who front-loads thinking, asks clarifying questions, and has the AI assess implications before writing code.

Phase 2: Building (15-30 Minutes)

Have them build something with AI assistance. Choose a task achievable with AI in this timeframe but impractical to complete manually.

Example task: "Implement the rate limiting middleware based on your plan, including tests and error handling."

What you're looking for: Whether they can direct AI effectively and maintain quality under realistic conditions.
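For calibration, here is a minimal sketch of the core logic a candidate might arrive at for the rate limiting task. It assumes a fixed-window counter per user. The `CounterStore` interface, `MemoryStore`, `LimitConfig`, and `RateLimiter` names are hypothetical, and the in-memory store and `Map` of per-user limits stand in for Redis so the example runs without a server. With Redis, the counter would typically map to INCR plus EXPIRE on a per-user key.

```typescript
// Hypothetical store interface: increments a counter that resets per window.
interface CounterStore {
  incr(key: string, windowMs: number, nowMs: number): number;
}

// In-memory stand-in for Redis, used only to keep the sketch runnable.
class MemoryStore implements CounterStore {
  private counters = new Map<string, { count: number; resetAtMs: number }>();

  incr(key: string, windowMs: number, nowMs: number): number {
    const entry = this.counters.get(key);
    // Start a fresh window if none exists or the old one has elapsed.
    if (!entry || nowMs >= entry.resetAtMs) {
      this.counters.set(key, { count: 1, resetAtMs: nowMs + windowMs });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  }
}

// Hypothetical per-user limit configuration.
interface LimitConfig {
  maxRequests: number;
  windowMs: number;
}

class RateLimiter {
  private store: CounterStore;
  private defaults: LimitConfig;
  private perUser: Map<string, LimitConfig>;

  constructor(
    store: CounterStore,
    defaults: LimitConfig,
    perUser: Map<string, LimitConfig> = new Map(),
  ) {
    this.store = store;
    this.defaults = defaults;
    this.perUser = perUser;
  }

  // Returns true if this request is within the user's limit.
  allow(userId: string, nowMs: number): boolean {
    const cfg = this.perUser.get(userId) ?? this.defaults;
    const count = this.store.incr(`rl:${userId}`, cfg.windowMs, nowMs);
    return count <= cfg.maxRequests;
  }
}

// Usage: default of 2 requests per second.
const demoLimiter = new RateLimiter(new MemoryStore(), {
  maxRequests: 2,
  windowMs: 1000,
});
console.log(demoLimiter.allow("user-1", Date.now()));
```

Passing the clock in as `nowMs` keeps the limiter easy to test. A strong candidate will also raise what this sketch leaves out: atomicity of the increment in Redis, behaviour when Redis is unavailable, and the burst a fixed window allows at window boundaries.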

Phase 3: Reviewing (20-30 Minutes)

Hand them a separate piece of AI-generated code and ask them to identify and fix the issues. Use code written by AI with reasonable prompting - not deliberately sabotaged, but containing typical AI errors.

Example: An authentication function where the AI has used > instead of >= for token expiry (a boundary error that accepts a token at the exact moment it expires), missed a null check on req.headers.authorization that would crash on requests with a missing or malformed header, and implemented token comparison with === instead of a timing-safe comparison.

What you're looking for: Whether they can catch what AI gets wrong.
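To make the example concrete, here is a sketch of what a corrected version might look like once a candidate has caught all three issues. The `TokenRecord` and `RequestLike` shapes, the `isAuthorized` and `safeEqual` names, and the Bearer header format are assumptions for illustration. The only library calls used are `createHash` and `timingSafeEqual` from Node's built-in `node:crypto` module.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical shapes for illustration; not taken from a real codebase.
interface TokenRecord {
  token: string;
  expiresAtMs: number; // the token is invalid at and after this instant
}

interface RequestLike {
  headers: { authorization?: string };
}

// Compare two strings without leaking where they differ through timing.
// Hashing first yields equal-length buffers, which timingSafeEqual requires.
function safeEqual(a: string, b: string): boolean {
  const hashA = createHash("sha256").update(a).digest();
  const hashB = createHash("sha256").update(b).digest();
  return timingSafeEqual(hashA, hashB);
}

function isAuthorized(
  req: RequestLike,
  record: TokenRecord,
  nowMs: number,
): boolean {
  const header = req.headers.authorization;
  // Fix 1: reject a missing or malformed header instead of crashing on it.
  if (typeof header !== "string" || !header.startsWith("Bearer ")) {
    return false;
  }
  const presented = header.slice("Bearer ".length);
  if (presented.length === 0) {
    return false;
  }
  // Fix 2: use >= so the token is rejected at the exact expiry instant.
  if (nowMs >= record.expiresAtMs) {
    return false;
  }
  // Fix 3: timing-safe comparison instead of ===.
  return safeEqual(presented, record.token);
}
```

The interesting signal is not whether the candidate produces this exact code, but whether they can explain why each original line was wrong and which of the three matters most in your context.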

Setup Notes

Let them use their own environment if they prefer. If they use yours, give them time to configure it. You want to see how they actually work, not how they adapt to unfamiliar tooling under pressure.

Why All Three Phases Matter

Someone who plans well but can't execute hasn't developed the full skill. Someone who builds quickly but can't review will ship mistakes. Someone who reviews well but can't plan or build efficiently hasn't adapted.

Reference Checks

When speaking with references, ask questions that surface AI-specific signals:

On Output and Quality

"How would you describe the volume and quality of work they shipped?"
"What guardrails or review processes did they put in place for their own work?"
"Did you ever have concerns about code quality or technical debt from their contributions?"

On Adaptability

"How has their approach to engineering changed in the time you worked together?"
"How did they respond when new tools or approaches became available?"
"Were they someone who experimented with new methods, or did they stick to established patterns?"

On Planning and Judgment

"How did they typically start a new project or feature?"
"Can you describe a time they caught a significant issue before it shipped?"
"How did they handle ambiguous requirements or trade-off decisions?"

Common Mistakes

Hiring Juniors Hoping AI Will Compensate for Inexperience

It won't. AI output needs experienced judgment to evaluate. Without it, the "time savings" become time costs - senior engineers reviewing and fixing AI-generated code that a junior couldn't evaluate.

Hiring Seniors Who Are Dismissive or Surface-Level Adopters

Experience matters, but only if paired with adaptation. A senior engineer who dismisses AI or uses it only for autocomplete is leaving significant capacity unused. Worse, they may actively resist or slow down engineers who work differently.

Over-Indexing on Tools or Tech Stack

The tools will change. Specific tool knowledge becomes outdated quickly. What matters is effective workflows and adaptive thinking - someone who's mastered Claude Code today will master whatever replaces it tomorrow.

Assuming Your Organisation Is Ready

Hiring AI-augmented engineers into a slow product process will bottleneck their capacity. If your product decisions take weeks and your deployment pipeline doesn't match their tempo, you'll pay for speed you can't use. The hire is only half the equation.

If You Can't Find Category One

The engineers doing this at a high level are rare. You may not find one, or you may not be able to afford them, or they may not be available when you need them.

If that's the case, hire from Category Two: engineers who are genuinely on their way.

Look for:

Clear evidence they've started adapting - real workflow changes, not just "I use Copilot"
Demonstrable gains they can speak to specifically
Strong fundamentals - the judgment AI requires comes from experience
Self-awareness about where they are in the transition
Evidence they've already evolved their approach as tools have changed

Then invest in their development:

Give them room to experiment with tools and workflows
Pair them with problems that reward AI-augmented approaches
Check in regularly on how their process is evolving
Don't mandate specific tools - let them find what works

Be careful not to confuse Category Two with Category Three.

Someone who has been tinkering with AI for months, with the tools freely available, and still has no real results to show is telling you something about their ability to adapt. The tools are getting easier - some of Category Three will eventually level up. But the question is whether you have time to wait.

For a VC-backed startup burning runway, you probably don't.

Hiring from Category Two is slower than hiring someone already there. But it's better than hiring someone who hasn't made meaningful progress - or waiting indefinitely for the perfect candidate.

Competing for This Talent

Category One engineers are scarce and know it. If you want to hire them, understand what they're evaluating when they evaluate you.

On Compensation

These engineers have significant leverage. They're producing multiples of what traditional engineers produce, and the market is beginning to price that in.

If you're a startup competing against well-funded companies or FAANG salaries, you likely can't win on base compensation alone. Compensation matters, but it isn't the only factor - and for many Category One engineers, it's not the primary one.

What Actually Attracts Them

Interesting problems with real constraints. They can build fast - what they want is something worth building. Ambiguous, high-impact problems are more attractive than well-specified tickets.

Autonomy over tools and workflow. If you mandate specific tools or processes, you're signalling that you don't understand how they work. Let them show you what's possible.

Speed of decision-making. They can ship in days what used to take weeks. If your product process takes months to decide what to build, you're wasting their capacity.

Working with other adapted engineers. The best want to work with others operating at a similar level. If your team is mostly Category Three, that's a harder sell.

What Repels Them

Slow hiring processes (if you take four weeks to interview and offer, they're gone)
Interviews that prioritise manual coding ability rather than AI-augmented work
Rigid tool requirements or process mandates
Teams where they'd be the only one working this way
Product organisations that can't keep up with engineering velocity

After the Hire

Getting them in the door is step one. Setting them up to succeed is step two.

The First 30 Days

Don't mandate tools. Let them establish their own workflow and demonstrate how they work best. You hired them for how they work - let them work that way.

Pair them with product context. These engineers can build faster than most backlogs can be refilled. If they don't understand what matters and why, they'll either sit idle waiting for decisions or build the wrong things fast.

Set expectations with the team. If they're shipping at a different pace than existing engineers, that can create friction. Be explicit about what you're optimising for.

Ongoing

Measure outcomes and impact, not activity. Traditional engineering metrics become misleading when someone can produce in a day what used to take a week. Lines of code, commits, and tickets closed don't mean what they used to.

Expect continued evolution. The tools change every few months. Their workflow in June won't match their workflow in December. Make sure your processes can adapt with them.

Protect their time. Meetings, reviews, and process overhead that made sense for traditional engineering velocity may not make sense now. Audit what you're asking them to spend time on.

Watch for the haunted codebase. Speed without oversight creates problems. Make sure code review and quality processes are keeping pace with velocity. If technical debt is accumulating faster than before, that's a sign something's wrong.

The Quick Version

If you're hiring software engineers in 2026:

1. Understand the Landscape

Category Zero (not realistic yet): Orchestrating many parallel agents. Exists but not hireable into normal teams in early 2026.
Category One (rare): Not writing code themselves anymore. Directing agents, reviewing output. The best you'll realistically hire.
Category Two (uncommon): Meaningfully using AI but still writing code. Can reach Category One with the right environment.
Category Three (most of the market): "Using AI" but work looks the same as two years ago. Not a fallback for ambitious startups.

2. Change What You're Testing

Stop testing manual coding ability.
Test AI fluency, judgment, verification skills, and whether they're on the train.
Run a practical assessment: planning, building, reviewing.

3. Ask Different Questions

"Walk me through your AI-driven workflow."
"Where have you found AI tools to be weakest?"
"What's changed about how you work in the last six months?"

4. If You Can't Find Category One, Hire Category Two and Develop Them

Look for real adaptation, not just "I use AI."
Invest in their development.
Don't confuse them with Category Three.

5. Prepare Your Organisation

Product decisions need to move faster.
Tooling mandates need to relax.
Metrics need to change.
The hire is only half the equation.

6. Compete on More Than Money Alone

Interesting problems, autonomy, speed, quality of team.
If you're slow, rigid, or process-heavy, they'll go elsewhere.

This is the biggest shift in how software gets built since the internet. The engineers who've adapted are scarce, and the companies hiring them effectively are the ones who've updated how they evaluate.

Matt Cook
Co-Founder of Scouut