
I’ve run this stuff in production. I know where it breaks.
I’ve been working in the cloud native ecosystem for a long time — as a contributor, not just a user. Kubernetes releases, Docker, Tinkerbell at Equinix Metal. Enough production incidents to know where things quietly go wrong.
If your platform is scaling past what your current setup was designed for, or your Kubernetes migration left you with something nobody fully understands, let’s talk.
What I work on
Kubernetes — at depth. Not setup and onboarding. The hard parts: multi-cluster architecture, workload reliability, resource management, upgrade strategies, security posture. I contributed to Kubernetes releases. I know what changed between versions and why it matters for what you’re running.
Site Reliability Engineering. Error budgets, SLOs, SLIs, on-call culture, incident response, post-mortems that actually change behavior. SRE isn’t a job title you sprinkle on your team — it’s an operational discipline. I help teams build it properly, or diagnose why the version they have isn’t working.
Observability. Metrics, tracing, structured logging — the full picture. Not dashboards for dashboards’ sake, but instrumentation that tells you something useful when things go wrong at 2am. I’ve built this layer from scratch and I’ve inherited the broken versions. Both experiences matter.
Cloud infrastructure and automation. AWS, GCP, bare metal, hybrid. CI/CD pipelines, infrastructure as code, developer platforms that reduce the surface your team has to operate manually. The goal is always: less toil, more signal.
Distributed systems. Services that span nodes, availability zones, and failure domains. I contributed to Tinkerbell at Equinix Metal, and bare metal provisioning at scale is about as distributed as it gets. I understand what breaks at the network layer, the storage layer, and the coordination layer.
How we work together
01 —
Infrastructure Review · Entry point
I look at what you have and tell you what I actually see.
Not a generated report. Not a checklist. I go through your platform — architecture, Kubernetes setup, observability, reliability practices, automation — and give you a direct assessment of what’s working, what’s fragile, and where the risk is. Fixed scope, fixed fee. Standalone if that’s all you need.
02 —
Hands-on engagement
I work alongside your team until the problem is solved.
Embedded, hands-on. I don’t hand over a document and disappear. I stay in the work with your engineers — writing code, reviewing PRs, making architecture decisions together — until the situation is resolved and your team owns the outcome.
03 —
Ongoing advisory
A senior technical voice as your platform grows.
Architecture calls, technology choices, scaling decisions. A standing engagement for teams that want experienced judgment available without a full-time senior hire.
Who this is for
Teams with real infrastructure already running in production. You have engineers who understand the system, a specific set of problems that have outlasted your internal attempts to fix them, and no interest in being sold a platform.
You want someone who has operated at this level before — not as a consultant who reads your docs, but as an engineer who has shipped this kind of work.
Start with the Infrastructure Review.
Tell me what you’re running, what’s giving you trouble, and what you’ve already tried.
info@shippingbytes.com · subject: "Cloud Native Advisory — [one line about your stack]"
I respond personally. No forms, no account managers.