#95 - From Platform Theater to Golden Guardrails with Steve Wade

March 23, 2026·45 min

Episode Description from the Publisher

Is your Kubernetes stack bloated, slow, and hard to explain? Steve Wade shares simple checks—the hiring treadmill, onboarding time, and the acronym test—to spot platform theater fast. What would a 30-day deletion sprint cut, save, and secure? We are always happy to answer any questions, hear suggestions for new episodes, or hear from you, our listeners. DevSecOps Talks podcast LinkedIn page DevSecOps Talks podcast website DevSecOps Talks podcast YouTube channel Summary In this episode of DevSecOps Talks, Mattias and Paulina speak with Steve Wade, founder of Platform Fix, about why so many Kubernetes and platform initiatives become overcomplicated, expensive, and painful for developers. Steve has helped simplify over 50 cloud-native platforms and estimates he has removed around $100 million in complexity waste. The conversation covers how to spot a bloated platform, why "free" tools are never really free, how to systematically delete what you don't need, and why the best platform engineering is often about subtraction rather than addition. Key Topics Steve's Background: From Complexity Creator to Strategic Deleter Steve introduces himself as the founder of Platform Fix — the person companies call when their Kubernetes migration is 18 months in, millions over budget, and their best engineers are leaving. He has done this over 50 times, and he is candid about why it matters so much to him: he used to be this problem. Years ago, Steve led a migration that was supposed to take six months. Eighteen months later, the team had 70 microservices, three service meshes (they kept starting new ones without finishing the old), and monitoring tools that needed their own monitoring. Two senior engineers quit. The VP of Engineering gave Steve 90 days or the team would be replaced. Those 90 days changed everything. The team deleted roughly 50 of the 70 services, ripped out all the service meshes, and cut deployment time from three weeks of chaos to three days, consistently. Six months later, one of the engineers who had left came back. That experience became the foundation for Platform Fix. As Steve puts it: "While everyone's collecting cloud native tools like Pokemon cards, I'm trying to help teams figure out which ones to throw away and which ones to keep." Why Platform Complexity Happens Steve explains that organizations fall into a complexity trap by continuously adding tools without questioning whether they are actually needed. He describes walking into companies where the platform team spends 65–70% of their time explaining their own platform to the people using it. His verdict: "That's not a team, that's a help desk with infrastructure access." People inside the complexity normalize it. They cannot see the problem because they have been living in it for months or years. Steve identifies several drivers: conference-fueled recency bias (someone sees a shiny tool at KubeCon and adopts it without evaluating the need), resume-driven architecture (engineers choosing tools to pad their CVs), and a culture where everyone is trained to add but nobody asks "what if we remove something instead?" He illustrates the resume-driven pattern with a story from a 200-person fintech. A senior hire — "Mark" — proposed a full stack: Kubernetes, Istio, Argo, Crossplane, Backstage, Vault, Prometheus, Loki, Tempo, and more. The CTO approved it because "Spotify uses it, so it must be best practice." Eighteen months and $2.3 million later, six engineers were needed just to keep it running, developers waited weeks to deploy, and Mark left — with "led Kubernetes migration" on his CV. When Steve asked what Istio was actually solving, nobody could answer. It was costing around $250,000 to run, for a problem that could have been fixed with network policies. He also highlights a telling sign: he asked three people in the same company how many Kubernetes clusters they needed and got three completely different answers. "That's not a technical disagreement. That's a sign that nobody's aligned on what the platform is actually for." The AI Layer: Tool Fatigue Gets Worse Paulina observes that the same tool-sprawl pattern is now being repeated with AI tooling — an additional layer of fatigue on top of what already exists in the cloud-native space. Steve agrees and adds three dimensions to the AI complexity problem: choosing which LLM to use, learning how to write effective prompts, and figuring out who is accountable when AI-written code does not work as expected. Mattias notes that AI also enables anyone to build custom tools for their specific needs, which further expands the toolbox and potential for sprawl. How Leaders Can Spot a Bloated Platform One of the most practical segments is Steve

AI Summary coming soon

Get Free Summaries →

Free forever for up to 3 podcasts. No credit card required.