Semantic layer, lineage, and rich metadata propagation
Episode Summary
In this episode, Michael Toy, co-creator of Malloy, talks about rethinking how we work with data beyond SQL. Michael shares the origins of Malloy from his and Lloyd Tabb’s experience at Looker, why SQL’s mental model often fights human problem solving, and how Malloy aims to be a composable, maintainable language that treats SQL as the assembly layer rather than something humans should write. He explores Malloy’s core ideas: semantic modeling tightly coupled with a query language, hierarchical data as the default mental model, and preserving context so analysis stays interactive and open-ended. He also digs into the developer experience and ecosystem: Malloy’s TypeScript implementation, VS Code integration, CLI, emerging notebook support, and how Malloy can sit alongside or replace parts of existing transformation workflows. Michael discusses practical trade-offs in language design, the surprising fit for LLM-generated queries, and near-term roadmap areas like dimensional filtering, better aggregation strategies across levels, and closing gaps that still require escaping to SQL. He closes with an invitation to contribute to the open-source project and help shape its evolution.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed: flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming: Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.

Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.

You’re a developer who wants to innovate; instead, you’re stuck fixing bottleneck
