Episode Summary
What are the current techniques being employed to improve the performance of LLM-based systems? How is the industry shifting from post-training towards context engineering and multi-agent orchestration? This week on the show, Jodie Burchell, data scientist and Python Advocacy Team Lead at JetBrains, returns to discuss the current AI coding landscape.
In our last conversation, Jodie covered how LLMs were approaching the limits of scaling laws. This time, we recap last year’s big focus on reasoning models and a post-training method called “reinforcement learning from verifiable rewards” (RLVR). We also cover test-time compute, where models spend more time reasoning through steps and considering multiple approaches to solve a problem.
We touch on Agent Context Protocol (ACP), agent orchestration layers, and context engineering. We also share some concerns about the hype cycle, maintaining all that code being generated, and running local models.
Course Spotlight: Vector Databases and Embeddings With ChromaDB
Learn how to use ChromaDB, an open-source vector database, to store embeddings and give context to large language models in Python.
Topics:
00:00:00 – Introduction
00:02:02 – Build a Language-Learning Agent course
00:02:55 – Update on the past six months of LLMs
00:05:32 – Reinforcement Learning From Verifiable Rewards
00:07:32 – Test-Time Compute
00:08:36 – 2025 and the rise of agents
00:14:24 – Benchmarks shifting
00:15:23 – Andrej Karpathy and jagged intelligence
00:19:16 – Not evolving or growing animals but summoning ghosts
00:23:34 – Diminishing gains in newer models
00:24:23 – Context Engineering
00:35:01 – Multi-agent systems and diversity of models
00:36:56 – Video Course Spotlight
00:38:34 – Current generation of coding agents
00:44:00 – Fast vs deep reasoning
00:45:18 – Agent Context Protocol
00:50:19 – Working through the hype cycle
00:55:43 – Open-source contribution pollution
00:57:21 – Local models
00:58:36 – Rick Beato comparing how the music industry failed
01:08:41 – LLMs are an amazing development
01:11:33 – Keynote talk on AI summers and winters
01:12:45 – PyCon US and EuroPython
01:14:11 – Thanks and goodbye
Show Links:
AI Agent Course - Build a Language‑Learning Agent with OpenAI, LangGraph, Ollama & MCP - YouTube
Episode #264: Large Language Models on the Edge of the Scaling Laws
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs
Reinforcement learning with verifiable rewards (RLVR)
What is test-time compute and how to scale it?
Overfitting - Wikipedia
2025 LLM Year in Review - karpathy
Animals vs Ghosts - karpathy
Agent Context Protocols Enhance Collective Inference
