Essential Code Craft – The Roadmap

Some of you may have noticed that I’ve been running out-of-hours training workshops for self-funding learners recently, under the banner of Essential Code Craft.

In a way, this is a return to the early days of Codemanship when I ran regular weekend workshops – priced for individual pockets – that were mostly attended by developers investing in their own skills and career development.

Many of those people are now CTOs and heads of engineering, and I’ve been fortunate – and grateful – that quite a few have brought me in to provide the same kind of training for their teams.

But with senior engineering leaders now very distracted by the code-generating firehose – and while I wait for them to realise that nothing’s actually changed as far as software engineering fundamentals are concerned – I’m pivoting back to self-funders.

So far – just as it was way back when – the first two workshops filled up quickly. While the boss might not be thinking about investing in their developers at the moment, it seems a lot of developers are looking to invest in themselves.

And this is exactly the moment to do it. While a gazillion developers hunt for magic incantations to make a probabilistic next-token predictor act like something other than a probabilistic next-token predictor, the people who’ve done their homework already know: better results with AI coding tools have very little to do with the tools, and almost everything to do with the processes around them.

And it’s a double-win. The practices that produce the best outcomes with AI are the exact same practices that produce the best outcomes without AI.

The key to being effective with AI is being effective without it.

And here’s the hedge – but only for the informed gamblers: developer hiring is rising again, but the demographic of these new hires is changing. Employers are favouring senior developers with significant pre-LLM experience.

I, and a few others, predicted this would happen. Demand would be highest for people who can do the things AI coding tools can’t – like, well, understand code. I mean really understand it. Not “LGTM” understanding. Deep comprehension of programs.

Not only that, but for all kinds of good reasons – economic, environmental, energy, ethical, geopolitical – the future of hyperscale LLMs is by no means predictable. Folks grappling with reduced token limits and rapidly degrading performance with Anthropic’s newest models will hopefully have figured out by now that building workflows that depend heavily on hyperscale LLMs is building on quicksand.

Who are Acme Megacorp gonna hire – the dev who sits on their hands waiting for their token limit to reset, or the dev who can just carry on at roughly the same overall pace of delivery?

And we should be under no illusions that teams who’ve mastered the fundamentals of software delivery are routinely outperforming teams who haven’t – with or without AI. AI is clearly not the differentiator.

So, whether you’re going to apply these disciplines with Claude Code or Codex, or with IntelliJ or VS Code, they still matter – arguably more than ever.

And what are these disciplines? What is Essential Code Craft?

  • Specification By Example – build shared understanding and pin down requirements with testable specifications
  • Test-Driven Development – rapidly iterate working software designs with short delivery lead times and reliable releases
  • Continuous Integration – keep teams more in sync with their changes, merging and testing them many times a day to ensure a working, shippable-at-any-time product
  • Continuous Collaboration – keep teams on the same page by continuously communicating with practices like pair programming and teaming
  • Refactoring – reshape code to make change easier, while keeping it working and shippable at all times
  • Modular Design – optimise software architecture to localise the “blast radius” and minimise the cost of changes, while making rapid testing and smarter reuse easier
  • Continuous Inspection – minimise the bottleneck and the “LGTM” effect of downstream code review by making it a continuous and highly automated process
  • Continuous Delivery – combine these fundamentals in a delivery process that can get the proverbial peas from the farmer’s field to the kitchen table through rapid, reliable integration, build and deployment pipelines
  • Continuous Improvement – build development capability in an evidence-based way, learning what really works and what doesn’t as you build skills, automate tools and workflows, and explore and experiment with your approach – and that’s where I come in!

Workshops on Specification By Example and Test-Driven Development are already live and taking registrations. If there’s demand, more will follow.

The roadmap is to build a set of repeating individual workshops, rotating monthly, that will eventually cover all of these disciplines – some explicitly, others implicitly, like Continuous Integration and pair programming, which will be an integral part of most workshops.

Self-funders can pick and choose which to attend, and my hope is that they’ll be a bit like Pokemon cards – gotta collect ’em all!

Keep an eye on the Codemanship Ticket Tailor box office for details of upcoming workshops.

Also, details of new workshop dates and times will be posted here first, so subscribe to this blog if you’d like to be kept in the loop.

The Mouse That Roared

Psst. If your boss won’t invest in training you in Test-Driven Development, I’m running out-of-hours workshops on April 7 and 11 specifically for self-funding learners. £99 + UK VAT.

Once upon a time, in a land far, far away (just north of London Bridge), there was a medium-sized financial services company with a problem. A big problem.

For 9 months, they had been trying to deploy a new release of their core platform, and every single attempt had exploded in their – and their customers’ – faces.

Teams of testers worked round the clock trying to find the bugs. Developers worked 12-hour days, 6 days a week trying to fix them. But the software just kept getting more and more broken. (As did the teams.)

Because, while the developers were fixing the bugs, the changes they were making were introducing new bugs – along with reintroducing some old favourites – that the testers weren’t finding until weeks later, if they found them at all. Users were 2x more likely to find a bug than the testers, and the stability of the platform was so bad that every release ended up being rolled back within a couple of days.

They were beached.

I’ve mentioned before that the DevOps Research & Assessment classifications of software delivery performance need a new level below “Poor”, which I called “Catastrophically Bad”. This is when every deployment fails, problems can’t be fixed – just reverted – and lead times are effectively infinite. Nothing’s getting delivered – well, nothing that sticks – and there’s no light at the end of that tunnel.

In desperation, an army of consultants and coaches was brought in who – at vast expense – set about “fixing” the process. Backlogs were groomed. Daily meetings were attended. Velocities were tracked. Estimates were tweaked. Test plans were optimised. Graphs and charts were prominently displayed.

And at the end of all that, the consultants and coaches confidently concluded, “Yep, you’re basically f***ed.” They exited the building with… well, let’s just say there wasn’t a leaving card. At least now they had the graphs to prove it. Money well spent, I’m sure we’d all agree.


That’s where I came in. I politely declined to engage with management on their “agile process” – as I have for many years now with all my clients.

Instead, I asked the developers to change one thing about how they worked. I took them into a meeting room, and showed them how to write NUnit tests.

In the next couple of weeks, we wrote some system-level smoke tests that they could run a few times a day just to provide some basic assurance that the thing wasn’t totally borked. Which, for a long time, it was.

Then I instructed them to write an automated “unit” test before any change they made to the code. Fixing a bug? Write a test for it first. Making a change to the logic? Write a test for it first. Refactoring the architecture? Write a test for it first. (And if you can’t, then refactor it with careful manual testing until you can, and then write a test for it.)
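The team wrote their tests in NUnit, but the test-first rule works in any xUnit framework; here’s the same idea sketched in Python’s unittest for brevity. The `apply_discount` function and its rounding bug are hypothetical, purely for illustration: the test is written first, pinning down the expected behaviour, and the code is then fixed until it passes.

```python
import unittest

# Hypothetical production code, purely for illustration.
# Bug report: discounts were being truncated instead of rounded,
# overcharging customers by up to a penny.
def apply_discount(price_pennies, percent):
    # The fix: round to the nearest penny rather than truncating.
    return price_pennies - round(price_pennies * percent / 100)

class ApplyDiscountTests(unittest.TestCase):
    def test_discount_rounds_to_nearest_penny(self):
        # Written BEFORE the fix: 10% of 1006p is 100.6p, which
        # should round to 101p off, not truncate to 100p off.
        self.assertEqual(905, apply_discount(1006, 10))
```

Run with `python -m unittest`. The point isn’t the framework – it’s that every change, whether bug fix, new logic or refactoring, starts with a test that would fail without it.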

It took a couple-or-three months, but as the unit test coverage gradually increased – and in the areas of the code that were changing and breaking most often – the software stabilised enough to finally produce a release that stuck; a first foot forward in over a year.

The second foot forward – the next stable release – came a month after that, and a third a month after that. The platform – and the business built around it – was walking again.


This is where interesting things start to happen to the overall development process. With releases now happening fairly incident-free once a month instead of, well, never, the “agile process” adapted around that. Product managers were now thinking about next month more than they were thinking about next year. (Specifically, where they might be working next year.)

Again, I was invited to engage with them, and again I politely declined. I wasn’t done with this innermost feedback loop yet.

Tests covered about 40% of the code – basically, the code that had been changed in those 3 months – but that coverage was growing every day. By the time it got to around 80% six months later, release cycles had accelerated from monthly, still with a significant reliance on manual regression testing, to weekly, with a small amount of manual testing. And the backlog had been paid down to the point where lead times were in the order of a couple of weeks.


And while the test assurance grew, and the architecture became more modular to accommodate it – which significantly reduced the “blast radius” of changes and merge conflicts – lead times shrank even further.

And, again, the wider process adapted to this new reality, shrinking user feedback loops and becoming more willing to gamble and experiment, knowing that they were now placing much smaller bets.

The management process at this point looked dramatically different to how they were doing things when I arrived. And I hadn’t engaged with it at all, beyond recommending a couple of books about goals.

The organisation had evolved from plan-driven, big-bang releases (that always went bang), to iterative, goal-driven releases that enabled experimentation and learning.

And that all happened in response to shrinking lead times, enabled by accelerating release cycles, made possible by very fast, very frequent automated testing.

We didn’t need to ask for extra budget. We didn’t require software licenses. We didn’t need management to engage or give permission. We didn’t ask anybody outside of the dev teams to change anything. We just f***ing did it – this one little change to how the teams wrote code – and the entire organisation adapted around it.

In the years that followed, that innermost feedback loop got tightened even more, with more investment in skills and automation. And, of course, the wider processes changed, as did the makeup of the teams, but not because we demanded they should. The delivery cycle accelerated, and the business adapted around that. They became more goal-oriented, more feedback-driven, and more experimental, and the organisation started to reflect that.

They now release changes multiple times a day, testing one change in the market at a time, and rapidly feeding back what they learn. Or, as you may know it, actual agility.


Sure, it took a few years for them to get there. But consultants and coaches had spent a year effecting no change at all at a very high cost. I spent a few days a month with them for a couple of years.

They made the mistake that almost all agile transformations make: they tried to improve software delivery capability without actually engaging with it. You can’t plan or manage your way to daily releases.

You’re looking at the wrong feedback loop – the wrong cog in the clockwork. The big cogs aren’t driving the little cogs. The little cogs are driving the whole system.


A New DORA Performance Level – Catastrophically Bad

DORA (DevOps Research & Assessment) has 4 broad levels for dev team performance: Elite, High, Medium, and Low.

A Low-Performing team deploys less than once a month, has lead times for changes > 1 month, sees as many as half their deployments go boom, and takes more than a week to fix them.

An Elite team deploys changes multiple times a day, has lead times typically of < 1 day, failure rates < 15% (roughly fewer than 1 in 7 deployments go boom), and can fix failed releases in under an hour.

I picture a Low-Performing team walking a tightrope between two mountain peaks. It’s a long way to safety (working, shippable code), a long way down if they fall, and a long climb back up to try again.

I picture an Elite team as walking the same length tightrope, tied to wooden posts a few feet apart, just 3 feet off the ground. Safety’s never far away, and a fall’s no big deal. They can quickly recover.

I would like to propose a 5th performance level, one that I’ve seen for real more than once: Catastrophically Bad.

At this level, the tightrope has snapped.

I’ve seen teams stuck in a death spiral where no changes can be deployed because every release goes boom. One example was a financial services company here in London who’d been running themselves ragged trying to stabilise a release for almost a year.

Every deployment had to be rolled back, and every deployment cost them high six-figures in client compensation, not to mention loss of reputation.

You know the drill: testing was done 100% manually and took weeks. While the devs were fixing the bugs testing found, they were introducing all-new bugs (and reintroducing a few old favourites).

A change failure rate of 100% and a lead time of – effectively – infinity.

How does a Catastrophically Bad development process that delivers nothing turn into, at the very least, a Low-Performing process that delivers something? (And yes, this is the mythical “hyper-productivity” the Scrum folks told you about – and no, Scrum isn’t the answer.)

What we did with my client started with 2 fundamental changes:

  • Fast-running automated “smoke” tests, selected by analysing which features broke most often
  • CI traffic lights – a wrapper around version control that forced check-ins to go in single file, and blocked check-ins and check-outs of changes when the light wasn’t “green” (meaning, no check-in in progress and build is working).
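Nothing below is their actual tooling – it’s a minimal sketch, in Python and assuming a simple in-process model, of the traffic-light rule: check-ins go through in single file, and both check-ins and check-outs are blocked whenever the light isn’t green (a check-in in progress, or a broken build).

```python
class TrafficLight:
    """Gates a version control repository: green means no check-in
    is in progress and the last build passed."""

    def __init__(self):
        self.checkin_in_progress = False
        self.build_passing = True

    def is_green(self):
        return self.build_passing and not self.checkin_in_progress

    def begin_checkin(self):
        # Check-ins go in single file: the light stays red until
        # this one has built. (A real wrapper would let a designated
        # fix check-in through when the build is broken.)
        if not self.is_green():
            raise RuntimeError("Red light: wait before checking in")
        self.checkin_in_progress = True

    def complete_checkin(self, build_passed):
        # A failed build keeps the light red for everyone.
        self.build_passing = build_passed
        self.checkin_in_progress = False

    def checkout(self):
        # Nobody pulls an in-flight or broken build.
        if not self.is_green():
            raise RuntimeError("Red light: wait before checking out")
```

The design choice that matters is that the gate sits around version control itself, so nobody can opt out of it – no process change was asked of anyone outside the dev teams.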

It took 12 weeks to go from Catastrophically Bad to Medium-Performing. (Pro tip for new leaders and process improvement consultants – poorly-performing teams are a gift because they have such low-hanging fruit).

You can build from here. In this case, by showing teams how to safely change legacy code in ways that add more fast-running regression tests and gradually simplify and modularise the parts of the code that are changing the most.

(The title image is taken from a Catastrophically Bad agentic “team”.)