Like many software developers, I’ve recently been puzzling out my attitude towards this “AI” stuff, by which I mean Large Language Models and all the associated tooling. 6 months ago (mid-2025) I was still a sceptic about every aspect of “AI”. And where creative endeavour is concerned, I’m still just as sceptical as I was before; see the end of this essay for more on that. I’m not even going to use the term “AI” outside quotes in this essay, because what we have at the moment is not “Artificial Intelligence”: it’s better described as “extremely sophisticated pattern matching”.
But as far as coding goes? I was flat-out wrong. In that particular domain, the new agentic tools (Claude etc.) are – when properly guided – tremendously effective. I no longer have any doubt that they will have a fundamental, permanent impact on the software development industry.
Some terminology for what follows:
- Large Language Models (LLMs): the underlying approach. Imagine something that has been trained on a vast corpus of text. The model is unchanging, but it has a “context window” that you can ask it questions in, and it will answer. That window may be very large – millions of words – but it will eventually run out. There is no consciousness, and no permanent learning (there’s a small sketch of this below).
- Agentic Code Generation (ACG): the use of LLMs, with additional scaffolding, to create computer code.
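To make the “no permanent learning” point concrete: here’s a minimal sketch, in Python, of how a chat-style interaction sits on top of a frozen model. The call_model function is a hypothetical stand-in rather than any particular vendor’s API; the point is that the model’s weights never change, so the only “memory” is the conversation history you re-send on every call – and that history has a hard size limit.

```python
# Minimal sketch of the "context window" idea. call_model() is a hypothetical
# stand-in for a real LLM API: the model it represents is fixed, so the only
# "memory" is the conversation history we re-send on every call -- and that
# history has a hard size limit.

MAX_CONTEXT_WORDS = 1_000_000  # illustrative; real limits are counted in tokens

def call_model(context: list[str]) -> str:
    # Hypothetical: a real implementation would send the entire context to a
    # pre-trained, unchanging model and return its next reply.
    return f"(reply generated from {len(context)} lines of context)"

def ask(context: list[str], question: str) -> str:
    context.append(f"User: {question}")
    if sum(len(line.split()) for line in context) > MAX_CONTEXT_WORDS:
        raise RuntimeError("context window exhausted -- and the model has learned nothing")
    reply = call_model(context)        # the full history goes in every single time
    context.append(f"Model: {reply}")
    return reply

conversation: list[str] = []
print(ask(conversation, "What does this function do?"))
print(ask(conversation, "Now refactor it."))   # "memory" is just the re-sent history
```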
What does ACG change about software development?
The gist: if you already know what you want the program to do, ACG can often write it for you, or at least do a very creditable and time-saving first draft. This is very significant: while it doesn’t eliminate software development as a role, ACG shifts where output is constrained – the constraints move from code generation to other parts of the process, such as idea generation and code review. Where ACG has this effect, developers who can engage effectively with this fundamentally different structure of constraints will become more economically valuable than ever, since their overall output will be a multiple of what it was before (2x? 3x? 10x? it depends on the field) – and developers who can’t so engage will have an increasingly difficult time of it.
As a guide to what work will get swallowed up: if it could be done by a smart intern with access to Google/StackOverflow/etc., it’s a good potential fit for ACG. In the future this heuristic will scale upwards to newbie programmers: the sea is rising.
What will ACG-enhanced software development look like?
It’s tempting to treat the introduction of ACG into software development as just another iteration of a very familiar story: ascension through successive levels of abstraction. For instance, everyone’s used to languages doing more for you than they used to, e.g. shifting from assembler to C to C++. And no-one writes their own messaging system from scratch anymore (well, hardly anyone) – we have open-source libraries and applications for that.
But it’s a different story this time – an older one – the replacement of unruly, messy humans by biddable, predictable automata. I’ve seen several failed attempts at making this happen in software development over the years, but it’s finally happening now: if not completely, at least to a significant, unprecedented extent. ACG will take over a substantial part of the overall activity of code generation in many areas, while human attention will become increasingly concentrated on other concerns: direction, coordination and verification.
This reallocation of human attention has a wide range of beneficial effects: it doesn’t just make it quicker to write exactly the same computer programs that we would have written anyway. Detailed root cause analysis of weird glitches, hangs and crashes is suddenly way easier to kick off. There’s a far lower cost of experimentation (and a lower level of counter-productive emotional investment in the resulting prototypes). There’s also a lower cost to scratching minor itches – all those small fixes and improvements no-one normally gets around to, because you’d have to spend a couple of hours understanding the surrounding code in order to figure out exactly what to tweak, and who has the time?
And, less positively, there is also the spectre of vibe-coding hell: people getting ACG to spit out code that makes it into production without due scrutiny for security holes, or without sufficient effort to prune and refine changes, so the codebase fills up with slop. Agentic code generation has the potential to enable those with a shaky grasp of the principles behind sustainable software development to create a huge mess far quicker than they ever could before.
It’s too early to say exactly how this will all pan out. There will be great successes and wince-inducing disasters. But one thing is clear, to me at least: any software developer who doesn’t develop an understanding of how to get value out of ACG is likely to have a tough time over the next few years.
What factors will influence the effectiveness of ACG?
While I’m seeing a multiplier when using ACG for my own work, this clearly is not true for everyone. Here are some factors that will affect how effective ACG can be for you:
- tooling quality: how good are the prompts and the agentic structure? Ideally your ACG tooling can generate tests and verify that functionality is correctly implemented (see the sketch after this list).
- language: strongly typed languages (Rust, functional languages such as the MLs and Haskell) are likely to do particularly well out of ACG, which can leverage the structure offered by type systems.
- process quality: just because you have a robot code generator doesn’t mean all the other aspects of software development magically vanish. There is still a need for diligent reviews, version management, design sensibility, effective requirements gathering, and all the rest.
- domain width and stability: if you spend all your time on a relatively well-defined code base that you have a deep understanding of, then you may find ACG has less to offer you. If your work ranges across a very wide range of rapidly-mutating technologies and code bases, the reverse will be true.
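To illustrate the tooling-quality factor above, here’s a rough sketch of the generate-and-verify loop that good ACG tooling is built around. The generate_patch and apply_patch helpers are hypothetical stand-ins rather than any real tool’s API; what matters is the shape of the loop: a proposed change is only accepted once checks the project already trusts (here, its pytest suite) pass, and failures are fed back into the next attempt.

```python
# Rough sketch of the generate-and-verify loop that good ACG tooling is built
# around. generate_patch() and apply_patch() are hypothetical stand-ins, not
# any particular tool's API; run_checks() leans on verification the project
# already trusts (here, its pytest suite).

import subprocess

def generate_patch(task: str, feedback: str) -> str:
    # Hypothetical: ask the model for a change addressing `task`, including
    # any feedback from the previous failed attempt.
    raise NotImplementedError("stand-in for your ACG tooling")

def apply_patch(patch: str) -> None:
    # Hypothetical: write the proposed change into the working tree.
    raise NotImplementedError("stand-in for your ACG tooling")

def run_checks() -> tuple[bool, str]:
    # Real verification: run the existing test suite and capture its output.
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def attempt(task: str, max_iterations: int = 5) -> bool:
    feedback = ""
    for _ in range(max_iterations):
        patch = generate_patch(task, feedback)
        apply_patch(patch)
        ok, feedback = run_checks()
        if ok:
            return True    # checks pass -- still worth a human review before merging
    return False           # give up and escalate to a human, rather than merge slop
```

Even a passing run only verifies against the checks you already have – reviews, design sense and requirements still sit with the humans, which is the point of the process-quality factor above.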
If you have all these factors working in favour of ACG, then large multipliers are achievable. But if they’re all working against you – if you’re using poor-quality tooling to blast in a load of code that has inconsistent idioms (say, a random selection of C++ styles from the last 30 years), without reviewing it properly, in a modelling library that’s your main product and which you really should have a deeper understanding of than you do – then ACG will almost certainly seriously impair your overall output.
Prediction 1: in the immediate term, ACG won’t be for everyone – not even for many good development outfits (e.g. good tools and process (+ACG), but also lots of acquired knowledge over a well-bounded, stable domain, and a bad match on language (-ACG)).
Prediction 2: in the longer term, ACG will drive a shift towards languages that work well with it. Examples of languages that may come under pressure: (1) C++, where the idioms have shifted a lot over the last 30 years – how well can ACG cope with that? And it’ll make Rust easier to write as a replacement… (2) Dynamic programming languages such as Python: here ACG assistance can potentially tip the balance in favour of more strongly-typed contenders.
Prediction 3: across all domains and time horizons, the boundary of ACG effectiveness will expand as the quality of tooling improves.
It’s not just software development …
The rise of LLMs is going to disrupt – even replace – plenty of other jobs. For example, until recently there was a useful living to be made by linguists doing legal patent searches in foreign languages. Nowadays, Google Translate (or similar) will work well enough, and the humans are no longer in the running.
A more nuanced example: as of December 2025, there’s some heated debate going on about the degree to which LLMs might be able to facilitate/replace the work of human indexers of books. The American Society of Indexers say LLMs are useless. Here I can see agentic assistance being a net win, but human intervention will still be necessary to reliably achieve a result that is useful to other humans. My intuition comes primarily from the related field of topic modelling, which has seen a lot of research (predating LLMs), and where the research community’s fashion for assessing the quality of output has oscillated between programs (consistent, but prone to veering off in weird directions) and humans (who do at least make sense to other humans, but often don’t agree with each other on what the topic categorisation should be).
… but the creatives are not getting replaced any time soon
The scope of current technology to substitute for humans is still limited: Large Language Models are not about to somehow replace creativity, whether literary, musical, visual, or of any other kind. “AI” boosters overclaim in this area, which is why I was so sceptical about all their other claims (including those about ACG) until recently. Writing useful code is not the same as creating art! You can verify whether code is doing the right thing by running tests!
Why is code different from Art? There are “right answers” for many coding problems – if not a single answer, then a relatively small number of well-defined solutions, with an understood range of tradeoffs between them. So using ACG to write programs is targeting a far simpler space of potential solutions than the wider, more ambiguous realms of Art. While there is some creativity associated with the act of code generation, it is subservient to the only really important creativity-oriented question about the code, which is: why are you writing this code in the first place? What is its purpose? And LLMs can’t answer that.
Will algorithmic systems ever achieve some element of artistic creativity? Maybe – but if so, it will need different technology from the current language models, which are not robust to further major re-training after their initial creation. We currently get around this limitation by providing a large “context window” of interaction with the model. But when you exhaust the capacity of the context window, you’re done, and the model itself doesn’t learn a thing. Show me a model that can robustly incorporate feedback about how its actions are affecting the world, and then I’ll be interested: but a context window is no replacement for that ability, no matter how wide it is.
The illusion that ChatGPT and its relatives have a mind can be very compelling: people are easily misled by the context window into thinking they are talking to something with a mind of its own. But you can stick a pair of googly eyes on a rock, and throw your voice, and get the same effect. We humans are hard-wired to impute intentionality; this was probably very useful for dodging predators in the distant past, but it serves us poorly in this case.
TL;DR:
- ACG has seriously high potential value. If you’re a developer, I strongly advise spending some time engaging with it in 2026, if you haven’t started doing so already. ACG isn’t a magic bullet and may not fit your domain. But you should figure that out for yourself, rather than either accepting ACG uncritically, or dismissing it without thought.
- There are plenty of other fields where Large Language Model technology will supplant humans – typically where there is a large rote element to the work, with little scope for imagination.
- If someone tells you that current “AI technology” can replace human creativity and insight, you should smile politely, and count your silverware once they have left the room. For a more frank, if impolite, take on some of the current hucksters in this sphere, I recommend reading I am an AI Hater.
- It’s possible that we might – off the back of different technology – develop true AI: algorithmic creations that can learn based on feedback from the real world, and use their learning to do useful things. I am less sceptical about that possibility than I was five years ago. This is not necessarily a good thing: read Lena for a well-written take on why.