← All writing

Infrastructure · 6 min read

The five abilities

Most of my philosophy is about building teams for product. Infrastructure sits a little outside that, so let me lean in on how I think about it.

When it comes to infrastructure, I build teams that align to three cornerstones: developer productivity, cost effectiveness, and what I call the five abilities. Together they keep platform engineering pragmatic and execution-focused, rather than a pursuit of technical ambition for its own sake.

Three cornerstones

Developer productivity

Faster iteration, less friction. Invest in tooling and automation that reduce cognitive load, make releases effortless, and remove the roadblocks that slow engineers down.

Cost effectiveness

Infrastructure should scale with business needs, not just technical ambition. Right-size cloud spend, avoid waste without sacrificing reliability, and make ROI-driven decisions.

The five abilities

The foundations that decide whether a platform can be trusted and can grow. Five of them, and they’re the rest of this piece.

The five abilities

Define success this way and engineering stops being a cost center. Productivity and cost are directly measurable, so infrastructure becomes a business function. And the principles hold whether you’re a startup or an enterprise.

Beyond Google’s SRE

People assume “SRE” means the Google book, but that book is just one take, an idealised, Google-scale, Google-specific model. My approach overlaps with it and then departs in a few ways that matter in the real world.

It’s built for constraints, not just scale. Google assumes you’re scaling to billions of users with near-infinite resources. Most organizations are juggling budget, staffing, and legacy systems instead, so I optimise for practical trade-offs, not ideals. Error budgets don’t solve everything when you’re short-staffed and the business has priorities.

It’s broader than reliability and availability. I treat maintainability, observability, and scalability as first-class concerns, not afterthoughts. It’s vendor-agnostic, AWS, hybrid clouds, whatever fits, rather than assuming one internal toolchain. And it’s experience-driven: I was running platforms before this discipline was called SRE, often in places that never had a dedicated SRE team and relied on engineers wearing several hats.

Most of all, it’s business-aligned. Google framed SRE as a technical discipline; I treat it as a business function, weighing cost, team bandwidth, and long-term sustainability in every call, managed services versus self-hosted, automating the toil that actually hurts rather than chasing a perfect automation model.

Google wrote the book. I’ve lived the trade-offs they theorised about, and that’s why my approach goes a little beyond their definitions.

Scaling your platform?

If you want a pragmatic partner for infrastructure and reliability, let’s talk.

Start a conversation