Episode Summary

What are the advantages of using Polars for your Python data projects? When should you use the lazy or eager APIs, and what are the benefits of each? This week on the show, we speak with Jeroen Janssens and Thijs Nieuwdorp about their new book, Python Polars: The Definitive Guide.

Jeroen and Thijs describe how they were introduced to Polars while working at Xomnia. They were converting a large data project to Python and saw surprising speed increases using the new library.

We discuss converting projects from pandas to Polars, getting away from indexes, consistent syntax, and using lazy vs eager APIs. Along the way, Jeroen and Thijs offer tips for getting the most out of Polars in your code.

We dig into the process of writing a definitive guide and the advantages of working collaboratively on a book project. They also share resources for practicing data wrangling and building visualizations with Pydy Tuesday.

Course Spotlight: Working With Python Polars

Welcome to the world of Polars, a powerful DataFrame library for Python. In this video course, you’ll get a hands-on introduction to Polars’ core features and see why this library is catching so much buzz.

Topics:

00:00:00 – Introduction
00:02:47 – Polars start at Xomnia
00:04:08 – Putting Polars into production
00:07:18 – Realizing the speed differences
00:08:49 – Converting the project from R to Python
00:14:34 – How did Polars improve the project?
00:16:34 – Making the code more ergonomic and readable
00:19:21 – Only grabbing the data that is needed
00:20:37 – Titling and deciding to write the book
00:24:40 – Advantages to collaboration
00:29:34 – What were you excited to include in the book?
00:31:55 – Working with different engines and Nvidia’s Cuda
00:35:05 – Defining a Polars expression
00:36:11 – Transitioning from pandas to Polars
00:37:34 – Not needing an index
00:39:56 – What inspired the syntax?
00:45:01 – Defining lazy vs eager workflows
00:49:16 – Examples covered in first chapter preview
00:51:51 – Video Course Spotlight
00:53:14 – Data formats and Arrow
00:55:41 – Working with NaN, null, or None
00:58:11 – Measuring performance through a benchmark
00:59:12 – Advantages to working with the Discord community
01:02:32 – Code examples and applying the techniques
01:03:34 – Pydy Tuesday
01:05:47 – What are you excited about in the world of Python?
01:09:21 – What do you want to learn next?
01:13:26 – What’s the best way to follow your work online?
01:14:14 – Thanks and goodbye

Survey:

Listener Survey - Help Shape the Future of the Real Python Podcast

Show Links:

Python Polars: The Definitive Guide
Janssens & Nieuwdorp - What we learned by converting a large codebase from Pandas to Polars - YouTube
Polars — DataFrames for the new era
polars · PyPI
Xomnia - Home Page
Episode #140: Speeding Up Your DataFrames With Polars
Data Science at the Command Line