Data Engineering Podcast


Latest Episodes

An Exploration Of The Expectations, Ecosystem, and Realities Of Real-Time Data Applications
August 21, 2022

An interview with Shruti Bhat about the state of the ecosystem for real-time data applications and the motivating factors for when and how to build them.

Bringing Automation To Data Labeling For Machine Learning With Watchful
August 13, 2022

An interview with Shayan Mohanty about the challenges of building repeatable data labeling processes and how Watchful is building a platform to let domain experts codify their knowledge for automated

Collecting And Retaining Contextual Metadata For Powerful And Effective Data Discovery
August 13, 2022

An interview with Shinji Kim about the challenges of collecting contextual metadata for your information assets and how to organize it to power effective data discovery for everyone in the business

Useful Lessons And Repeatable Patterns Learned From Data Mesh Implementations At AgileLab
August 06, 2022

An interview with Paolo Platter about the experience that he and his team at AgileLab have had implementing Data Mesh strategies at multiple organizations and the repeatable patterns that they have bu

Optimize Your Machine Learning Development And Serving With The Open Source Vector Database Milvus
August 06, 2022

An interview with Frank Liu about the open source vector database Milvus and how its native storage of vector embeddings reduces the friction involved in building and deploying machine learning models

What "Data Lineage Done Right" Looks Like And How They're Doing It At Manta
July 31, 2022

An interview with Ernie Ostic about the Manta platform and how it approaches the collection and processing of metadata to build a comprehensive view of data lineage across your various data systems

Interactive Exploratory Data Analysis On Petabyte Scale Data Sets With Arkouda
July 31, 2022

An interview with David Bader about the Arkouda framework for exploratory data analysis at interactive speeds across massive data sets and how it supports operating from a single laptop to multiple se

Writing The Book That Offers A Single Reference For The Fundamentals Of Data Engineering
July 24, 2022

An interview with Joe Reis and Matt Housley about their experience and insights gained while writing the book "Fundamentals of Data Engineering" and the inherent challenges of offering a sin

Re-Bundling The Data Stack With Data Orchestration And Software Defined Assets Using Dagster
July 24, 2022

An interview with Nick Schrock about the role of the data orchestration engine in making sense of the modern data stack and how Dagster's support for software-defined assets simplifies the work of bui

Making The Total Cost Of Ownership For External Data Manageable With Crux
July 17, 2022

An interview with Mark Etherington, CTO of Crux, about the cost and complexity involved in external data integration and how their platform is engineered to make it manageable for organizations of all