MLOps.community

Agents are Just While Loops

May 15, 2026·41 min
Episode Description from the Publisher

Hamza Tahir, co-founder of ZenML, joins the show to cut through the hype around long-running agents — arguing that at the end of the day, an agent is just a while loop that talks to a model, calls a tool, and writes to a file system. He covers the architecture of agent harnesses (inner and outer), what durable execution actually guarantees (and what it doesn't), and why the ML pipeline paradigm is a cleaner mental model than transactions for most agent workloads.Hamza also announces Kitaru — ZenML's new open-source execution runtime for async Python agents — built on five years of running ML workloads in enterprise environments.What we get into:Agents are while loops: The surprising simplicity under all the tooling: a brain (LLM), hands (tool calls), and a file system, stacked recursivelyInner harness vs outer harness: Why Pydantic AI owns the inner loop while production deployment needs a separate runtime layerWhat "long-running" actually means: Why the infrastructure we need to build is about extrapolating the future, not defining a time window todayDurable execution demystified: What checkpointing actually guarantees (infra failures, pod death, network drops) vs. what it never will (external state, bad LLM outputs, Snowflake rollbacks)ML pipelines vs transactions: Why bursty containers in Kubernetes map more naturally to agent workloads than microsecond-latency queue workers — and why Hamza argues against the complexity taxAnthropic opening the harness: Why letting other models run Claude Cowork is a "boss move," and what it means for the one-harness vs one-model debateHuman-in-the-loop, done right: The pod-kill-and-resume pattern, and why warm pools matter less when your agent runs for daysKitaru: ZenML's new open source durable execution runtime: zero-config local, Kubernetes/SageMaker/Vertex in production, built on Pydantic AI integrationArguing with Claude about Temporal: Hamza's story of spending hours getting an LLM to admit ZenML and Temporal solves the same problemIf you're architecting agents for production, picking between Pydantic AI, LangGraph, and Temporal, or just want to understand what "durable execution" actually means — this is the episode.// LINKS & RESOURCESKitaru on GitHub: https://github.com/zenml-io/kitaruKitaru launch blog post: https://www.zenml.io/blog/kitaru-launchKitaru on Hacker News: https://news.ycombinator.com/item?id=47520115Hamza Tahir on LinkedIn: https://www.linkedin.com/in/hamzatahirofficial/ZenML: https://www.zenml.io/ Timestamps[00:00] While Loop Checkpointing[00:24] Long-Running Agents Explained[01:28] Agent Harness Model Definitions[06:30] Durability and State Recovery[11:03] Agent Systems Layers[18:45] Durability in Agent Systems[22:07] ML Pipeline vs Transactions[29:23] Durability vs Guarantees[33:13] Durability vs Chaos Engineering[39:50] Kitaru Naming and Purpose[40:38] Wrap up#AIAgents #DurableExecution #OpenSource

Podzilla Summary coming soon

Sign up to get notified when the full AI-powered summary is ready.

Get Free Summaries →

Free forever for up to 3 podcasts. No credit card required.

Listen to This Episode

Get summaries like this every morning.

Free AI-powered recaps of MLOps.community and your other favorite podcasts, delivered to your inbox.

Get Free Summaries →

Free forever for up to 3 podcasts. No credit card required.