Published 2026-05-04

7 min read

How AI fits into engineering workflows

Part 1 of 3. AI works best inside a well-defined engineering process, where the team still owns the direction of the work and the model supports what is already underway. The same review standard that applies to human-written code applies to anything a model produces.

TL;DR

  • Use AI for repetitive implementation, scaffolding, refactoring support, and early exploration where outputs map cleanly to known patterns.
  • Treat rapid generation as a reason to keep reviews strict, not as a reason to skip them; every change still needs a named technical owner.
  • Match how freely you use AI to the phase of work, with looser use in exploration and tighter rules in stabilization and production.

Corsair Media Group

Where AI helps inside engineering work

AI works best inside a well-defined engineering process. The team still owns the direction of the work, and the model supports what is already underway.

The places where it tends to earn its keep are repetitive implementation, scaffolding, refactoring support, and early-stage exploration. These are the areas where speed matters more than originality, and where the output can be checked directly against patterns the team already knows.

Rapid generation is what makes AI useful, and it is the same thing that makes it dangerous the moment the output gets treated as authoritative. In production, the question that matters is whether every change passed through a named technical owner who understands the system well enough to take responsibility for it. Whether AI was involved in writing the change sits well below that question.

Where the work actually lands

Teams use AI on the work that used to accumulate on the backlog and get cleared after hours. That backlog typically includes boilerplate, test scaffolding, and first drafts of internal documentation. None of those items require originality, and all of them benefit from a quick first pass that a reviewer can accept or rewrite.

The same tools help with knowledge gathering. Instead of a long first read through official documentation, an engineer can ask a model a concrete question and refine the answer with follow-up questions. On a dense topic, that approach can be faster than reading the documentation alone.

Models can also serve as a sounding board while you debug, weigh design trade-offs, or learn a codebase you did not write. The output still needs to be verified against the system you actually run, but the discussion itself can reduce hours of orientation work to minutes.

The failure mode worth memorizing is this one. Models are most likely to lie to you when you ask them a question they almost know the answer to. The response arrives well-formed and confident, and it is wrong in a way that only someone who has actually run the code can catch. Fabricated function signatures, hallucinated config keys, and method calls that never existed are the everyday version of this. The model is a draftsman. The engineer is the source of truth.
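One cheap defense against fabricated signatures is to check a suggested call against the library actually installed before anyone reads further. The following is a minimal sketch, assuming the suggestion targets an importable Python module; the helper name `check_suggested_call` is hypothetical and not part of any tool mentioned here.

```python
import importlib
import inspect


def check_suggested_call(module_name, attr_path, expected_params):
    """Return a list of problems with a model-suggested call; empty means it checks out."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return [f"module {module_name!r} is not installed"]

    # Walk dotted attribute paths such as "JSONEncoder.encode".
    for part in attr_path.split("."):
        if not hasattr(obj, part):
            return [f"{module_name}.{attr_path} does not exist"]
        obj = getattr(obj, part)

    try:
        params = inspect.signature(obj).parameters
    except (TypeError, ValueError):
        # Some built-ins expose no signature; those need a manual check.
        return [f"cannot introspect {module_name}.{attr_path}; verify by hand"]

    # A callable that takes **kwargs accepts any keyword, so names cannot be ruled out here.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return [f"{module_name}.{attr_path} accepts **kwargs; verify parameters by hand"]

    return [f"unknown parameter {name!r}" for name in expected_params if name not in params]


# A real function with real parameters passes.
print(check_suggested_call("shutil", "copyfile", ["src", "dst", "follow_symlinks"]))
# A plausible-looking but nonexistent parameter is flagged.
print(check_suggested_call("shutil", "copyfile", ["src", "dst", "overwrite"]))
# A plausible-looking but nonexistent function is flagged.
print(check_suggested_call("shutil", "copy_file", []))
```

A check like this only proves that a name exists. It says nothing about whether the call does what the model claimed, which is still the engineer's job to confirm by running it.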

AI can reduce mechanical work. The system understanding still has to come from the people running the system. Engineers define correctness, decide structure, and validate behavior under conditions that no model has ever seen, because those conditions are particular to your production environment.

What the surveys say about repetitive coding

Most of the published figures in this space measure time saved on routine work, and most of them stop short of saying what routine actually means. Take every chart below with the understanding that nobody polled your repository, your auditors, your legacy database constraints, or your security review backlog. Your team's definition of routine is almost certainly different from the survey's, and the gap can be wide enough to swallow most of the claimed gain.

Large 2026 developer surveys, as summarized by firms such as McKinsey, report that generative AI cuts the time spent on work labeled "routine coding" by roughly 46 percent on average. What counts as routine varies enough between sources that the headline number is best read as a rough signal, not a guarantee.

What the McKinsey-flavored survey says about routine coding

Public McKinsey synthesis around 2026 puts the average near 46 percent on "routine coding." We quote the headline rather than running our own survey, since the original definitions are not ours to redefine.

Average time shaved on repetitive coding work across reporting surveys: ~46%

Reports tied to GitHub and Microsoft typically cite twenty-five to fifty-five percent faster throughput for senior engineers when the task fits the pattern. How much of that range you actually see depends far more on which patterns the engineer happens to be working on than on the seniority label, and a codebase that already removed its boilerplate with generators will have absorbed part of that gain long before any chat assistance arrived. The real question is whether the time you save covers the license fees, the API usage costs, and the extra review time that generated patches almost always require. With a low band of about twenty-five percent and a high band of about fifty-five, it often does, but only your own numbers can confirm it.

The range cited around GitHub and Microsoft data for grunt-work tasks

The bars are rounded for readability. The underlying numbers come from published discussions that themselves report ranges rather than fixed points.

  • Low band: ~25% faster on mechanical churn
  • Mid band: ~40% faster on mechanical churn
  • High band: ~55% faster on mechanical churn
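The break-even question raised above is simple arithmetic once a team plugs in its own figures. The sketch below is illustrative only: every input value is a hypothetical placeholder, not survey data, and it assumes "X percent faster" means throughput rises by X percent, so the time saved is smaller than X percent of the original hours.

```python
def net_monthly_saving(routine_hours, speedup, hourly_cost, tool_cost, extra_review_hours):
    """Net saving per engineer per month, in the same currency as hourly_cost.

    routine_hours:       hours per month spent on routine work (assumed input)
    speedup:             throughput gain as a fraction, e.g. 0.25 for "25% faster"
    hourly_cost:         loaded cost of one engineer hour (assumed input)
    tool_cost:           license plus API spend per month (assumed input)
    extra_review_hours:  added review time for generated patches (assumed input)
    """
    # 25% more throughput means the same work takes 1 / 1.25 of the time,
    # which saves 20% of the original hours, not 25%.
    hours_saved = routine_hours * (1 - 1 / (1 + speedup))
    return (hours_saved - extra_review_hours) * hourly_cost - tool_cost


# Hypothetical team: 40 routine hours, 100 per hour, 60 in tooling, 3 extra review hours.
for band, speedup in [("low", 0.25), ("mid", 0.40), ("high", 0.55)]:
    saving = net_monthly_saving(40, speedup, 100, 60, 3)
    print(f"{band} band: {saving:.2f}")
```

With these placeholder numbers the low band stays positive, but raising `extra_review_hours` to 8 is enough to turn it negative, which is why the review cost deserves as much attention as the headline speedup.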

Exploration, stabilization, and production

AI use is rarely a single setting. Many teams use a three-stage pattern. Model use is heaviest during early discovery. Review and change control tighten as revenue and operations start to depend on the system.

  1. Exploration. Heavy model use produces a large amount of code quickly while review is light. Teams use this phase to learn fast and abandon dead ends early. The first pass should not be treated as production-safe until a reviewer has approved it. A small scoped piece can reach an internal demo in a day, though that demo only represents code that runs on a developer machine, not code that has been reviewed and operated.
  2. Stabilization. Refactor what the model produced. Add or strengthen tests. Read diffs the way you would for a human author you do not fully trust yet. Tighten naming, integration points, and assumptions about input data while those are still cheap to change.
  3. Production. Change management is stricter than in exploration. Prompts are smaller on purpose. Work is sliced so that inference and automated patches cannot move faster than the people who will own the operational consequences when a regression slips through.

Each phase has a different definition of done. In exploration, done means the code runs on a developer machine. In stabilization, it means the code has been reviewed, is stable, and is safe to extend. In production, it means the code is safe to operate under your real constraints. AI does most of its useful work in the first phase, contributes a bit to the second, and almost never reaches the third without a person standing over it. Anyone selling you a workflow that skips that last condition is selling you something other than reliability.
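The per-phase definitions of done can be written down as data so that nobody has to argue about them at merge time. This is a minimal sketch under stated assumptions: the check names are labels lifted from the prose above, and the `Phase` enum and `missing_checks` helper are hypothetical, not part of any existing tool.

```python
from enum import Enum


class Phase(Enum):
    EXPLORATION = "exploration"
    STABILIZATION = "stabilization"
    PRODUCTION = "production"


# Each phase inherits the checks of the phase before it and adds its own.
# The check names are illustrative; replace them with your team's real gates.
DEFINITION_OF_DONE = {
    Phase.EXPLORATION: {"runs_on_dev_machine"},
    Phase.STABILIZATION: {"runs_on_dev_machine", "reviewed", "stable", "safe_to_extend"},
    Phase.PRODUCTION: {
        "runs_on_dev_machine",
        "reviewed",
        "stable",
        "safe_to_extend",
        "safe_to_operate",
    },
}


def missing_checks(phase, passed):
    """Return the checks still outstanding before work in this phase counts as done."""
    return DEFINITION_OF_DONE[phase] - set(passed)


# A demo that only runs locally is done for exploration and nothing else.
print(missing_checks(Phase.EXPLORATION, {"runs_on_dev_machine"}))
print(sorted(missing_checks(Phase.PRODUCTION, {"runs_on_dev_machine"})))
```

Keeping the table explicit makes the gap between "it runs" and "it is safe to operate" visible as a concrete list of outstanding checks.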

Parallel work with one accountable owner

A useful default is parallel execution with one accountable owner. One engineer keeps the main piece of work while an agent handles a separate, well-defined side task under that same engineer's review. Throughput goes up. Accountability stays with one person. Every delegated line is reviewed by a human before it merges.
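The one-owner rule is easy to state as a merge gate. The sketch below assumes a team keeps a set of human engineer identities and records who approved each delegated change; `DelegatedChange`, `can_merge`, and the example names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegatedChange:
    agent: str              # tool or agent that produced the diff
    owner: str              # the one engineer accountable for it
    approved_by: frozenset  # identities that approved the diff


def can_merge(change, engineers):
    """Allow a merge only when the named owner is a human engineer who approved the diff."""
    owner_is_engineer = change.owner in engineers
    owner_approved = change.owner in change.approved_by
    return owner_is_engineer and owner_approved


team = {"dana", "lee"}

# The owner read and approved the agent's side task, so it can merge.
print(can_merge(DelegatedChange("agent-1", "dana", frozenset({"dana"})), team))
# An agent approving its own output does not count as review.
print(can_merge(DelegatedChange("agent-1", "dana", frozenset({"agent-1"})), team))
```

The gate deliberately ignores how the diff was produced. It only asks whether a named person took responsibility for it.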

Larger scaffolds and multi-file edits still work, provided the session is time-boxed, the review expectations are set before prompting begins, and large diffs never merge without a careful read. The skill that matters most is keeping the system organized so that there is less repetitive code to begin with. Raw typing speed has not been the constraint on a healthy team for a long time. The harder discipline is deciding what is worth automating at all.

Narrow delegation works better than full automation

The most reliable pattern we have run into is narrow delegation: small, focused changes inside a system that already has clear architectural rules, with a human still on the merge. Full automation, where a model decides large parts of the system on its own, rarely holds up over more than a few weeks of real operation.

The next article in this series looks at what goes wrong when those rules are missing, and at the failure modes that grow as output volume grows.

If any of this looks familiar in your own work, we are glad to talk through what the right starting point would look like for your team. You can reach us through our contact page.

If you want AI scoped to repetitive work with engineers still owning every merge, then talk with Corsair about your next build.