Run Local LLMs on Mac to Cut Claude Costs
A practical hybrid workflow that uses costly LLM APIs for planning and local models (via Ollama + OpenCode) for execution, guarded by deterministic evals.
We recorded Warp traffic to see what gets sent back to the home base. Spoiler: It's everything.
AI coding adoption is high and trust is dropping. A testing pyramid for agents, plus reproducible production context that grounds AI in real behavior.
SaaS AI fails when agents need continuous access to your codebase and internal APIs. Here's why BYOC is the only deployment model that works at scale.
LLMs have collapsed the cost of custom internal tools. Here's the startup distribution problem I've watched kill companies — and how I vibe-coded my way out.
Production AI spend gets attention. Non-prod LLM calls in development, CI, and load tests often do not. Simulation fixes that.
AI-generated code is moving fast—but without behavioral validation, you're gambling with production stability. See how Proxymock changes the equation.
Fast mode or deep mode? Haiku or Opus? Cursor or Claude Code? The decision fatigue from AI coding tools is killing the productivity they promised.
How we built an AI agent that implements Jira tickets, creates merge requests, and monitors them autonomously—and the iterative journey to get there.