Coders spent more time prompting and reviewing AI generations than they saved on coding. On the surface, METR’s results seem to contradict other benchmarks and experiments that demonstrate increases in coding efficiency when AI tools are used. But those often also measure productivity in terms of total lines of code or the number of discrete tasks/code commits/pull requests completed, all of which can be poor proxies for actual coding efficiency. These factors lead the researchers to conclude that current AI coding tools may be particularly ill-suited to “settings with very high quality standards, or with many implicit requirements (e.g., relating to documentation, testing coverage, or linting/formatting) that take humans substantial time to learn.” While those factors may not apply in “many realistic, economically relevant settings” involving simpler code bases, they could limit the impact of AI tools in this study and similar real-world situations.
The main issue I have with AI coding hasn’t been the code. It’s a bit ham-fisted and overly naive; it is as if it’s speed blind.
The main issue is that some of the code is out of date, using functions that are deprecated, etc., and it seems to be mixing paradigms and styles across languages in a very frustrating way.
True and not true at the same time. Using agents indeed often doesn’t work, mostly when I’m trying to do the wrong thing. Because then, the AI agent does not say “the way you do it is overly complicated, it does not make any sense”, but instead it says: “excellent idea, here are X steps I need to do to make it happen”. It wasted my time many times, but it also guided me quickly through some problems that would take hours to research. Some of my projects wouldn’t have been finished without AI.
Some of my projects wouldn’t have been finished without AI.
This says way more about you than it says about AI tools
Their sample size was 16 people…
I got flamed pretty hard for pointing out that this sample size really needs to be in the title, but it needs to be said. Thank you. Sixteen people is basically a forum thread, and not a very popular one.
It’s still useful information and a good read, but a lot of people don’t click through to the article, they just remember the title and move on.
They can’t read your mind. A professional painter is going to make the exact image they want in far less time and with more accuracy than repeatedly prompting a black box to make small changes.
But if you’re an amateur and don’t really know what you want, or you’re not very picky and don’t care much about quality, then meh, good enough. High level software developers know what they want. They are like painters. And at that point, the LLM isn’t really solving problems for you. At best, it’s putting the paint to the canvas. That is, saving you typing time.
But time spent typing is definitely not the limiting factor for productivity in software.
They can’t read your mind. A professional painter is going to make the exact image they want in far less time and with more accuracy than repeatedly prompting a black box to make small changes.
and this is the exact reason why I hate IDEs that relentlessly “do things” for me.
I don’t need my editor maintaining my includes or updating my lock files. I don’t need them to auto complete words or fix syntax for me.
I know exactly what I’m doing. If I don’t, then, AND ONLY THEN, will I look up what I need and fix it myself.
if there’s a problem with formatting, a linter will pick it up. if there’s a problem with syntax, the runtime/compilation will pick it up. if there’s a problem with content, UAT will pick it up.
we don’t need to be MORE productive, we need to be more skilled, and using tools like these only softens the mind and dulls the spirit.