Fundamentals First, AI Second
We’re building with AI, and building AI into our products. But have we forgotten the fundamentals that make software actually work?
The tech world is buzzing with AI. Every startup pitch deck mentions “AI-powered this” or “machine learning that.” We’re using AI to write our code, generate our tests, and even architect our systems. Meanwhile, we’re embedding AI capabilities into nearly every product we build.
But here’s the thing: in all this excitement about artificial intelligence, we seem to have forgotten about human intelligence — specifically, the hard-won wisdom about building software that actually works.
I’ve been thinking a lot about Kent Beck’s Extreme Programming Explained. Published over two decades ago, it outlined practices that felt radical then but are now considered fundamental: user stories, test-driven development, continuous integration, pair programming, and iterative development. These weren’t just nice-to-haves; they were responses to the chaos of building complex software systems.
Today, as we navigate this AI era, that chaos has multiplied exponentially. We’re not just building software; we’re incorporating unpredictable AI models, training systems on vast datasets, and creating products whose behaviour can’t always be deterministically predicted.
That makes those XP fundamentals not just relevant but more critical than ever.
The New Chaos
Consider what we’re dealing with now:
- AI models that evolve: Your carefully crafted prompts might work differently tomorrow as the underlying model gets updated
- Non-deterministic behaviour: The same input might produce different outputs, making traditional testing approaches insufficient
- Data dependency: Your AI features are only as good as your training data, which can drift or become stale
- Emergent behaviours: Complex AI systems can exhibit unexpected interactions that no single developer anticipated
This isn’t your typical software complexity. This is complexity with uncertainty baked in.
Why User Stories (Specifications) Are Your Lifeline
In traditional software, we could get away with vague requirements. “Make it faster” or “improve the UX” might have been sufficient direction. With AI systems, that approach is a recipe for disaster.
AI models need explicit, measurable criteria to be useful. Your user stories need to capture not just what the system should do, but how it should behave when it’s uncertain, how it should handle edge cases, and what “good enough” looks like.
“As a user, I want the AI to categorise my expenses” is useless. “As a user, I want the AI to categorise my expenses with at least 85% accuracy on my personal spending patterns, gracefully handle ambiguous transactions by asking for clarification, and learn from my corrections over time” gives you something to build and test against.
The discipline of writing precise specifications forces you to think through the inherent uncertainty in AI systems before you build them.
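To make that concrete, here’s a minimal sketch of how the expense-categorisation story’s accuracy criterion could become an executable check. The names are hypothetical: `categorise_expense` stands in for your real model call, and the labelled sample for a slice of real user data.

```python
# Sketch: the user story's 85%-accuracy criterion as an automated check.
# `categorise_expense` and LABELLED_SAMPLE are hypothetical stand-ins
# for your real model call and a labelled slice of user data.

LABELLED_SAMPLE = [
    ("STARBUCKS #1234", "dining"),
    ("SHELL FUEL 0042", "transport"),
    ("NETFLIX.COM", "subscriptions"),
    # ...in practice, a few hundred real transactions
]

def categorise_expense(description: str) -> str:
    """Placeholder for the AI-backed categoriser under test."""
    raise NotImplementedError

def test_meets_accuracy_target():
    correct = sum(
        1 for description, expected in LABELLED_SAMPLE
        if categorise_expense(description) == expected
    )
    accuracy = correct / len(LABELLED_SAMPLE)
    assert accuracy >= 0.85, f"accuracy {accuracy:.0%} is below the 85% target"
```

The exact numbers matter less than the shift in mindset: the story now has a pass/fail gate instead of a vibe.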
Testing in an Unpredictable World
Traditional unit tests assume deterministic behaviour. Call a function with specific inputs, expect specific outputs. AI systems laugh at this assumption.
But this doesn’t mean testing is impossible — it means we need to evolve our approach. We need tests that verify behaviour within acceptable ranges, tests that check for catastrophic failures, and tests that validate the system’s uncertainty handling.
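As an illustration, a range-based test might look something like this sketch, where `summarise` stands in for any non-deterministic model call:

```python
import statistics

ARTICLE = "Text of a fixed test document goes here."  # fixture input

def summarise(text: str) -> str:
    """Placeholder for a non-deterministic AI call (e.g. an LLM)."""
    raise NotImplementedError

def test_summary_length_stays_in_range():
    # Sample the system repeatedly: any single run may vary,
    # but the distribution should stay inside acceptable bounds.
    lengths = [len(summarise(ARTICLE).split()) for _ in range(20)]
    assert statistics.mean(lengths) <= 120  # typical behaviour
    assert max(lengths) <= 250, "catastrophic failure: runaway output"
```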
Automated testing becomes even more crucial when you can’t manually verify every possible AI interaction. You need comprehensive test suites that can catch when your AI starts behaving unexpectedly, regression tests that ensure model updates don’t break existing functionality, and integration tests that verify the entire AI pipeline.
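One simple shape for such a regression test is a golden-set gate: a frozen set of representative cases whose pass rate must not drop when the model changes. This sketch assumes a hypothetical `run_pipeline` hook and a stored baseline score from the currently shipped model.

```python
import json

# Sketch: a regression gate for model updates. All names are illustrative.
GOLDEN_CASES = [
    {"input": "STARBUCKS #1234", "expected": "dining"},
    # ...frozen, representative cases that must keep working
]
TOLERANCE = 0.02  # allow small metric noise between model versions

def run_pipeline(case: dict) -> bool:
    """Placeholder: run the full AI pipeline on one golden case."""
    raise NotImplementedError

def test_model_update_does_not_regress():
    passed = sum(run_pipeline(case) for case in GOLDEN_CASES)
    pass_rate = passed / len(GOLDEN_CASES)
    with open("eval/baseline.json") as f:  # last shipped model's score
        baseline = json.load(f)["pass_rate"]
    assert pass_rate >= baseline - TOLERANCE, (
        f"regression: {pass_rate:.0%} vs baseline {baseline:.0%}"
    )
```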
The XP principle of “test everything that could possibly break” is now “test everything that could possibly behave differently.”
Continuous Everything: CI/CD/CE
Extreme Programming emphasised continuous integration: constantly merging code to catch integration issues early. In the AI era, this expands to continuous deployment and continuous evaluation.
Your AI models need continuous monitoring. Their performance can degrade silently as real-world data drifts from training data. You need automated systems that constantly evaluate your AI’s performance in production and alert you when something’s off.
This is CI/CD/CE: Continuous Integration, Continuous Deployment, Continuous Evaluation. That third pillar — continuous evaluation — is what keeps your AI systems honest in production.
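As a sketch of what that third pillar might look like in practice, here’s a scheduled evaluation job. `fetch_recent_predictions` and `send_alert` are hypothetical hooks into your own stack; the point is the shape: score recent production traffic against ground truth and page someone when quality drops.

```python
import logging
import statistics

ACCURACY_FLOOR = 0.80  # the quality bar this system must hold in production

def fetch_recent_predictions() -> list[tuple[str, str]]:
    """Placeholder: (model_output, user_corrected_label) pairs
    from the last evaluation window."""
    raise NotImplementedError

def send_alert(message: str) -> None:
    """Placeholder: page the on-call channel."""
    raise NotImplementedError

def evaluate_production_window() -> None:
    # Run on a schedule (e.g. hourly) as the "CE" leg of the pipeline.
    pairs = fetch_recent_predictions()
    accuracy = statistics.mean(out == label for out, label in pairs)
    logging.info("window accuracy: %.1f%%", accuracy * 100)
    if accuracy < ACCURACY_FLOOR:
        send_alert(f"AI accuracy dropped to {accuracy:.0%} "
                   f"(floor {ACCURACY_FLOOR:.0%})")
```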
Iterative Development as Risk Management
XP’s emphasis on short iterations and frequent releases becomes risk management in AI development. You can’t predict how an AI feature will behave in the real world until real users interact with it.
Small, frequent releases let you gather feedback quickly, adjust your approach, and catch problems before they scale. This is especially important with AI, where the feedback loop between development and real-world performance can teach you things no amount of offline testing could predict.
A Side Note: Pair Programming with AI
Pair programming used to mean two humans at one computer. Now, many of us are “pairing” with AI coding assistants daily.
This changes the dynamic completely. Your AI pair never gets tired, never gets frustrated, and has access to vast amounts of code. But it also doesn’t understand your business context, can’t reason about edge cases, and might confidently suggest completely wrong approaches.
The key is treating AI as the junior developer in the pair — good at generating boilerplate and exploring possibilities, but requiring constant review and guidance from the human senior developer. The human stays in the driver’s seat, making architectural decisions and ensuring the code serves the actual business need.
This partnership can be incredibly productive, but only if we maintain the discipline to review, test, and validate everything the AI suggests.
Why This Matters Now
We’re at an inflection point. The companies that will succeed in this AI era aren’t necessarily those with the most sophisticated models or the largest datasets. They’re the ones that can reliably ship AI-powered features that actually solve real problems.
That reliability comes from applying solid engineering practices to inherently unreliable systems. It comes from the discipline to specify, test, and iterate even when (especially when) the technology feels magical.
The fundamentals haven’t become obsolete; they’ve become more important. Because in a world where the technology itself is unpredictable, our processes need to be rock solid.
Kent Beck’s insights about managing complexity in software development are more relevant today than ever. We’re just applying them to a new kind of complexity — one that includes artificial intelligence.
The question isn’t whether AI will transform how we build software. It already has. The question is whether we’ll remember the lessons that help us build software that actually works.
This is the first in a series exploring how fundamental engineering practices apply to AI-era development. Coming up: deep dives into specification techniques for AI systems, testing strategies for non-deterministic behaviour, and building effective CI/CD/CE pipelines.