The Compounding Agent

Digital Thoughts: AI From the Trenches

Reading the leaked Claude Code source, swapping a 35B model's brain for a 4.4x speedup, and writing the beginner's guide I wish I had six months ago.

Pawel Jozefiak

Apr 11, 2026

Episode four: what happens when hobbyist AI starts growing up into production AI, and how the lessons compound if you pay attention.

First, a rare look inside the pros’ toolbox. Claude Code’s source got leaked. Instead of treating it like drama, I treated it like a free masterclass in tool permission gating, risk classification, blocking budgets, memory management, multi-agent coordination, and feature flags like autoDream and KAIROS. Most people building agents today are reinventing patterns that professional teams have already worked out. You learn more from reading one real production codebase than from ten tutorial posts.
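To make the first two patterns on that list concrete, here is a minimal sketch of risk-based tool permission gating. It is not taken from the leaked source. The tool names, the three risk levels, and the `gate` function are all hypothetical, chosen only to illustrate the idea that every tool call gets classified by risk before it is allowed to run.

```python
from enum import Enum


class Risk(Enum):
    LOW = 1     # read-only, no side effects
    MEDIUM = 2  # changes local state
    HIGH = 3    # destructive or hard to undo


# Hypothetical tool names and risk levels, for illustration only.
TOOL_RISK = {
    "read_file": Risk.LOW,
    "write_file": Risk.MEDIUM,
    "run_shell": Risk.HIGH,
}


def gate(tool: str, approved: bool = False) -> str:
    """Return 'allow', 'ask', or 'deny' for a requested tool call."""
    risk = TOOL_RISK.get(tool)
    if risk is None:
        return "deny"  # unknown tools are rejected by default
    if risk is Risk.LOW:
        return "allow"
    # Anything riskier than LOW needs explicit human approval first.
    return "allow" if approved else "ask"
```

In this sketch, `gate("read_file")` is allowed immediately, `gate("run_shell")` comes back as `"ask"` until a human approves it, and any tool missing from the table is denied. Defaulting to deny for unknown tools is a design choice of the sketch, not a claim about how Claude Code behaves.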

Then, applying those lessons to my own stack. My $599 Mac Mini M4 runs a 35 billion parameter model at 17.3 tokens per second. That alone is surprising. Then I swapped the brain of the classification tier to Gemma 4, and classification went from 8.5 seconds down to 1.9 seconds. A 4.4x speedup. I also disabled chain-of-thought on simple classification calls and got 30x faster results with identical accuracy. Production AI isn’t one giant model doing everything. It’s the right model for the right job, and most jobs don’t need the biggest one.

Finally, handing the wisdom forward. After six months of running this thing daily, I wrote a beginner’s guide to building your first agent. Folder structure is the architecture. The nine common mistakes people make early. Model routing across Haiku, Sonnet, and Opus tiers. Progressive permissions. The context window trap. Why overnight automation is where the real leverage lives. Not a hype piece. A map for the person walking in the door behind me.
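Model routing is the easiest of those ideas to show in code. The sketch below assumes a simple lookup from task kind to tier. The task kinds, the `ROUTES` table, and the `pick_tier` function are hypothetical, and the tier labels are plain strings that you would map to whatever model IDs you actually use.

```python
# Hypothetical routing table: task kind -> model tier.
# Cheap, repetitive work goes to the smallest tier; planning goes to the largest.
ROUTES = {
    "classify": "haiku",
    "extract": "haiku",
    "summarize": "sonnet",
    "plan": "opus",
}


def pick_tier(task_kind: str, default: str = "sonnet") -> str:
    """Return the tier for a task kind, falling back to a middle tier."""
    return ROUTES.get(task_kind, default)
```

Here `pick_tier("classify")` returns `"haiku"`, and a task kind that is not in the table falls back to `"sonnet"`. Falling back to the middle tier is an assumption of this sketch: it keeps unknown work off the most expensive model without sending it to the weakest one.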

The thread: compounding expertise. Study how the pros build. Optimize your own stack with those patterns. Teach the next person who walks in. The gap between hobbyist AI and production AI is closing, and the fastest way to cross it is learning from real systems instead of tutorials.

Posts discussed in this episode:

- Claude Code’s Source Got Leaked. Here’s What’s Actually Worth Learning (https://thoughts.jock.pl/p/claude-code-source-leak-what-to-learn-ai-agents-2026)

- My $600 Mac Mini Runs a 35B AI Model. Yesterday I Swapped Its Brain (https://thoughts.jock.pl/p/local-llm-35b-mac-mini-gemma-swap-production-2026)

- How to Build Your First AI Agent (Basics) (https://thoughts.jock.pl/p/how-to-build-your-first-ai-agent-beginners-guide-2026)
