Much has been written about coding with AI. And here is some more! Although I promise you that every word in this article is from my brain to your screen with no AI in-between.
Over the past few months, I’ve transitioned from using ChatGPT to help with researching and planning to also using Cursor to do agentic coding. It’s been useful. It’s been frustrating. On the whole, I can work faster, and I wouldn’t want to be without it. And yet, sometimes, I still code entirely by hand.
I’ve had many thoughts about what works, what doesn’t and some of the second-order effects of working with AI. So I wrote them down to share.
Let’s get started…
Sometimes, the effort is the point
This is not a novel thought on AI, but it’s worth re-stating…
If you need a fast and easy way to travel 26.2 miles, then take a train or car.
But, if the reason to travel 26.2 miles is to improve your fitness and enjoy (!) yourself, then you’re going to have to get on your feet. In a marathon, the effort is the point. It changes you.
And so it goes for AI. If you want to learn a new programming language, or learn about the problem you’re trying to solve, then having an agent do all the work for you is not helpful.
It’s too easy, too distant. Sure, it could help with the details in the same way as fancy watches and carbon-sprung shoes help a runner. But, when the effort is the point, you need to do the running yourself.
Sometimes the journey doesn’t matter. Sometimes the journey matters more than the destination. AI for the former, not-AI for the latter.
Which rather leads to…
Friction is a signal
When do you re-design or re-architect a codebase? When the friction of using it or extending it becomes too great.
We try to architect things the right amount for the expected use. Doing more is self-indulgent. Doing less is careless. And friction is a signal that tells us when we need to re-architect.
But AI can smooth over that friction. You want to bolt on a new function? Claude can do the PR for you! And if you LGTM that PR straight into the main branch, you may not notice any friction.
Eventually, though, you or your colleagues will notice the pain via an accumulation of unexamined code. But, by then, the mass of individual friction signals will be hard to tease apart into useful information about how to get back to a good state.
Watching an agent while it works, and critically reviewing the code it produces (including asking: is this the right-sized change for the function it delivers?), helps us to stay in touch with the friction.
Use the tools actively
AI is like an oracle in the classical sense. What do I mean by that? Well, let’s look at an interaction with an oracle:
In 560 BC, Croesus, king of Lydia, consulted the oracle of Delphi before attacking Persia. He was advised: “If you cross the river, a great empire will be destroyed”. Believing the response favourable, Croesus attacked, but it was his own empire that was ultimately destroyed by the Persians.
Huh? Uncritically accepting the answer that seems to confirm your prior beliefs leads to a sticky end? Hmm… We’ve seen that in Macbeth, and we’ve absolutely seen that in AI-related failures too.
But AI is not an oracle. It’s a tool. You can choose when and how to wield it. You can craft an environment around your usage to make it work better for you. You don’t have to take its first answer.
During the research phase of a project, AI’s great to get you into the right jargon and onto the right topic, but you can and should go read the original sources after that. If hallucination fools you twice and you didn’t read the sources – that’s on you.
When you ask AI how to do X, and it says you should use method A, you can ask it for alternatives. It will give you method B.
You can ask it if A or B are similar to some other thing that you know more about. You can take a step back and explain the goal that led you to ask about X.
The more you think at this stage, the better chance you have of coming to a good answer.
Don’t just passively take the first answer or, like an oracle, it will trip you up.
AI doesn’t mind if you interrupt it
I’m terribly British about not wanting to interrupt. If I’ve asked Claude to do something non-trivial, I’ll often watch its progress. And sometimes, I’ll have to overcome my politeness and hit the stop button. I may see it going down a blind alley where the best thing to do is just stop, re-orient, and get it to go again (or write the code myself).
Watching the agent while it works makes you better prepared for reviewing the eventual solution. It gives you the chance to chip in with details and correct false assumptions. In my experience, AI tends to write overly defensive code – to the extent that the main logic can get lost. So, interrupting to say: “We’ve already sanitised this elsewhere, you don’t need to check for that” can be useful.
Not only does it not mind you interrupting, it also doesn’t mind if you scrap all the progress and start the conversation again, armed with the knowledge you gained the first time. Sometimes the context of your work in progress can be dragging things the wrong way. And, although it feels rude, you can scrap all that and try again.
AI is not an intern, despite that being a common metaphor as these tools emerged. It doesn’t learn, you do. It takes the context and rules that you shape around it, and that’s how you work together with the machine.
You can ask it to update its own rules
The coding tools all have some sort of rules or metadata files that set the way you want them to work. So, if the code you get back isn’t the way you want it, you can get them to update the rules in your project (or user environment) to stop that happening again.
For example, when they write too many comments (often including numbered steps in the comments), I add rules to stop them from doing this.
I can mandate that it should prefer composition over inheritance in OO code, and I’ll get better first attempts from the AI.
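As an illustrative sketch – the filename and conventions vary by tool (this follows Cursor’s rules-file style), and the specific rules are just examples of the preferences above:

```markdown
- Do not write comments that merely restate the code, and never number steps in comments.
- Prefer composition over inheritance in object-oriented code.
- Keep each change as small as the function it delivers requires.
```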
It’s a powerful way to slice through the initial annoyances when you try these tools.
You can even make use of the plagiarism they were trained on and tell them to write code or text in the style of a book you would like them to adhere to.
Bad code no longer stinks so badly
In the old days, your first clue about the quality of code was how it looked. Is it properly indented? Does it use reasonable-looking function names? Are there tests? Unfortunately, AI nails those superficial aspects, even when it may have done dreadful things at the more abstract layers.
This makes it much more difficult to spot bad code. It requires constant vigilance, which can be pretty exhausting.
Two good prongs of attack here are:
- Setting the expectation that developers ought to be reviewing code thoroughly before raising a PR.
- Using AI to review code before your normal human reviews.
Although using AI to review AI seems ridiculous, you need to remember: it’s not human. It’s quite happy to criticise its own code.
Sometimes, you are the problem
If you have ever assigned tasks to people and found that they didn’t do what you had in mind, it’s important to ask yourself, “Did I even explain what I had in mind?”. People are not mind-readers, and neither are AI agents.
So, if the AI doesn’t do what you wanted, ask yourself (and you can also ask the AI) “What can I do differently next time?”.
- Do you need to update the rules for the agent? (See above.)
- Do you need to take smaller steps? If you ask too much, the agent can get confused.
Don’t just grab a screen capture and joke about how dumb AI is on social media, ask yourself first – am I doing this wrong?
Half-assed code can be a good prompt
One of the arguments against AI in general is that natural language isn’t a great way to precisely describe what a system should do. We have programming languages for that. It’s a so-so argument, usually offered by people who haven’t really tried these tools.
If you are a developer using AI, don’t forget that you’re a developer. Sometimes sketching out a not-remotely functional implementation in code can be the best prompt.
For example, one time I had a working solution from the AI, but it had done things by duct-taping together arrays and searching them many times to get anything done.
I could see that a recursive tree-based approach would be better.
But after a couple of attempts to explain my idea in a chat, I just gave up and started to write some code for my approach.
My code wasn’t even half-assed, it was probably quarter-assed. Definitely not functional. Just the structure of what I had in mind. And suddenly, the AI could implement it properly.
This was still much faster than writing the whole thing by hand; it just took some thought and perseverance beyond accepting the first working version.
If crap code is nearly-free, when should we use it?
AI makes it super-cheap to write code. Maybe not always good code, but cheap. And NOT ALL CODE HAS TO BE GOOD. Cheap code is a new phenomenon, and we should look for opportunities to make the most of it.
For example:
You can automate every part of your dev and test setup. Low-risk scripts that are owned by developers are easy to knock out with AI. What might have taken a few hours of remembering the intricacies of bash is now just moments away. So go script all that stuff immediately. Almost any sequence of steps you take can be scripted; even if you think you’ll only do it once, you can keep the script in case you ever need to do it again.
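As a sketch of the kind of low-risk script I mean – the steps here are hypothetical placeholders, not commands from any real project:

```python
"""A throwaway setup script of the sort AI can knock out in moments.
The steps below are hypothetical placeholders -- substitute your own."""
import subprocess
import sys

STEPS = [
    # e.g. ["docker", "compose", "up", "-d"] or ["npm", "run", "seed"]
    [sys.executable, "-c", "print('fetching dependencies...')"],
    [sys.executable, "-c", "print('seeding test data...')"],
]

def run_steps(steps):
    """Run each step, stopping at the first failure; return steps completed."""
    for cmd in steps:
        subprocess.run(cmd, check=True)  # check=True raises on non-zero exit
    return len(steps)

if __name__ == "__main__":
    run_steps(STEPS)
```

The point isn’t the code – it’s that a wrapper like this is now cheap enough to write for a sequence you might only run once.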
If you’re working with a hard-to-predict system (e.g. Salesforce Lightning, or humans) you can throw up prototypes to A/B (and C/D/E/…) test scenarios in a way that would never have been economic before. AI’s not building the whole system here, just prototypes that you can throw away and build more carefully using the winning approach.
A technique I often use when debugging is to make a simpler version of the broken thing, where the simpler version works. Then I modify either the clean version or the broken version, making one more like the other until I find the point where things move from working to broken. AI can massively speed this up by generating many points on that scale from simple-and-working to broken. We can try them all and find where the failure lies much more quickly.
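A toy sketch of that scale of variants – the pipeline, its steps, and the planted bug are all invented for illustration:

```python
"""Bisecting between working and broken: build variants of increasing
complexity and find where the behaviour first goes wrong."""

STEPS = [
    str.strip,                     # 1: works
    str.lower,                     # 2: works
    lambda s: s.replace(",", ""),  # 3: works
    lambda s: s[1:],               # 4: the planted bug -- drops a character
]

def build_variant(n):
    """Return a pipeline using only the first n steps."""
    def pipeline(text):
        for step in STEPS[:n]:
            text = step(text)
        return text
    return pipeline

def first_broken(is_ok):
    """Scan from the simple, working end toward the full, broken version."""
    for n in range(1, len(STEPS) + 1):
        if not is_ok(build_variant(n)):
            return n  # the step where working turns into broken
    return None

# Our invariant: the cleaned text should still start with the word "apple".
is_ok = lambda p: p("  Apple, pie  ").startswith(("Apple", "apple"))
print(first_broken(is_ok))  # -> 4: step 4 introduced the failure
```

The same idea underlies `git bisect`, except here the scale is generated variants rather than commit history.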
You can build diagnostic tools. While I was struggling with connectivity in an Azure virtual network, AI built a tool to check the network stack step-by-step, isolating the fault. Which meant that each time I made a potential fix, I could easily re-run the tool. Once the configuration issue was resolved, I could ditch this free code and rely on integration tests.
Everyone becomes a reviewer of code
Reviewing code changes is hard. Incredibly hard. It takes experience, and knowledge of the system to be able to see the consequences of a change. It takes knowledge of the business and tech environments to consider alternative approaches. It’s something that’s normally reserved for the more experienced engineers.
But AI makes everyone a reviewer. The most junior developer working with agentic AI is now expected to review the output of the AI. Maybe this will turn out to be a good thing, shifting the emphasis from getting the curly braces in the right place to being able to engage in systems thinking. But it takes explicit recognition of that change to avoid setting up new developers for failure.
Beware the false promise of 100s of tests and 100% coverage
You can very easily do development-driven testing where you (or the AI) write the implementation first. Then you feel like there ought to be some tests. You prompt the AI “Write some tests for X”. Tests are created, they all pass and you have 100% code coverage.
But like other code, this test code is useless if it’s unexamined. Problems I’ve found in AI-generated tests include:
- Missing entire classes of behaviour
- Using so much mocking that you could remove chunks of implementation and all tests still pass
- Attempting to reuse the same state for all tests and reset it every time so that all tests passed one-by-one, but failed as a test suite
- Underfitting by asserting too loosely (e.g. checking that the API returned a response, but not what’s in the response)
- Overfitting by asserting sequences of method calls that should be private to the implementation
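The underfitting failure is easiest to see in code. A minimal sketch, with an invented handler standing in for a real API:

```python
"""Loose vs. meaningful assertions; get_user is a made-up stand-in
for a real API call, not any particular framework."""

def get_user(user_id):
    return {"status": 200, "body": {"id": user_id, "name": "Ada"}}

resp = get_user(42)

# Underfitted (typical AI-generated style): passes even if the body is garbage.
assert resp["status"] == 200

# Better: pin down the content the caller actually relies on.
assert resp["body"] == {"id": 42, "name": "Ada"}
```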
If you have 100 AI-generated tests that make no sense, you can ask it to consolidate them so that they don’t overlap so much. And then examine the results.
AIs work really well with a good test suite. But they also offer a very tempting path to a large but problematic test suite.
Summing up
What does this all mean? Largely, that the same old rules of coding still apply with AI – only faster. We have some new capabilities, and some new tools to learn. AI coding is neither a silver bullet nor a plague on our profession. It’s another layer, another tool that we could hurt ourselves with or benefit from.
Take it from someone who splits wood while wearing flip-flops, learning to use your tools correctly really matters.