
In this episode James and Frank dive into running AI coding models locally versus in the cloud—BYOK/Open Router, VS Code’s chat/agent harness, model runners (Olama, vLLM), and the practicality of 27B models on a 3090 using 4‑bit quantization. They share hands-on takeaways—how recent engineering (MT/MTPLX) boosts inference to usable token rates, when auto model selection makes sense, cost and hardware trade‑offs, and why local models can liberate your workflow while still needing smarter, unified tooling. Follow Us Frank: Twitter, Blog, GitHub James: Twitter, Blog, GitHub Merge Conflict: Twitter, Facebook, Website, Chat on Discord Music : Amethyst Seer - Citrine by Adventureface ⭐⭐ Review Us ⭐⭐ Machine transcription available on http://mergeconflict.fm
Podzilla Summary coming soon
Sign up to get notified when the full AI-powered summary is ready.
Free forever for up to 3 podcasts. No credit card required.

515: Mini-LED, IPS, and the Great Monitor Rabbit Hole

513: Agents Over Chat: The Future of Developer Workflows

512: Does Matter Really Matter?

511: Terminals, Remote Sessions, No More Watches!
Free AI-powered recaps of Merge Conflict and your other favorite podcasts, delivered to your inbox.
Free forever for up to 3 podcasts. No credit card required.