Building Effective AI-Powered Data Pipelines

Free Lesson

Part of Featured Lightning Lessons

Hosted by Shreya Shankar

2,123 students

What you'll learn

Architect Semantic Data Pipelines

Transform unstructured documents into structured data by chaining semantic operators and classifying data at scale.

Optimize via Semantic Rewrites

Improve accuracy by decomposing complex operators into simpler steps (e.g., rewriting a single "Map" as "Split-Map-Reduce").

Slash Costs with Task Cascades

Route easy queries to cheaper models and hard ones to expensive models without sacrificing quality.
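
To make the "Split-Map-Reduce" rewrite concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not DocETL's API: `split_document`, `map_chunk`, `reduce_results`, and `split_map_reduce` are hypothetical names, and a simple capitalization check stands in for the LLM call that a real semantic map would make on each chunk.

```python
from typing import Callable, List


def split_document(text: str, chunk_size: int) -> List[str]:
    """Split a long document into chunks of at most chunk_size words."""
    words = text.split()
    return [
        " ".join(words[i : i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def map_chunk(chunk: str) -> List[str]:
    """Stand-in for an LLM extraction call (assumption: a real pipeline
    would prompt a model here). Returns the capitalized words in the chunk."""
    return [w.strip(".,") for w in chunk.split() if w[:1].isupper()]


def reduce_results(partials: List[List[str]]) -> List[str]:
    """Merge per-chunk results, dropping duplicates in first-seen order."""
    seen = set()
    merged = []
    for partial in partials:
        for item in partial:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


def split_map_reduce(
    text: str,
    chunk_size: int,
    mapper: Callable[[str], List[str]] = map_chunk,
) -> List[str]:
    """Run the mapper on small chunks instead of on the whole document."""
    chunks = split_document(text, chunk_size)
    return reduce_results([mapper(chunk) for chunk in chunks])


if __name__ == "__main__":
    print(split_map_reduce("Alice met Bob in Paris. Then Alice left.", 3))
```

The design idea is that each mapper call sees only a short chunk, which is the part of the rewrite intended to help accuracy on long documents; the reduce step then reconciles the partial answers.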
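
The routing idea behind a cascade can be sketched as follows. This is a simplified, hypothetical example rather than the method shown in the lesson: `cheap_model` and `expensive_model` are toy keyword classifiers standing in for real LLM calls, and the word-count confidence rule and the 0.9 threshold are assumptions chosen only for illustration.

```python
from typing import Callable, List, Tuple

# A model is any callable that returns (label, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cheap_model(doc: str) -> Tuple[str, float]:
    """Hypothetical cheap classifier: confident only on short documents."""
    label = "complaint" if "refund" in doc.lower() else "other"
    confidence = 0.95 if len(doc.split()) <= 8 else 0.5
    return label, confidence


def expensive_model(doc: str) -> Tuple[str, float]:
    """Hypothetical expensive classifier: handles more phrasing."""
    text = doc.lower()
    label = "complaint" if ("refund" in text or "broken" in text) else "other"
    return label, 0.99


def cascade(
    doc: str, models: List[Model], threshold: float = 0.9
) -> Tuple[str, int]:
    """Try models from cheapest to most expensive.

    Accepts the first answer whose confidence meets the threshold and
    returns (label, index of the model that answered). The last model
    in the list always answers if no earlier one was confident.
    """
    for index, model in enumerate(models[:-1]):
        label, confidence = model(doc)
        if confidence >= threshold:
            return label, index
    label, _ = models[-1](doc)
    return label, len(models) - 1


if __name__ == "__main__":
    tiers = [cheap_model, expensive_model]
    print(cascade("I want a refund", tiers))
    print(cascade("The device arrived broken and I have been waiting for weeks now", tiers))
```

In this sketch the short request is answered by the cheap tier and the longer one falls through to the expensive tier; in practice the threshold would need to be tuned against labeled examples so that the cheap tier only answers when it is reliably right.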

Why this topic matters

The true potential of LLMs lies not only in chatbots but also in reasoning over massive amounts of unstructured data such as legal transcripts and reports. Shreya Shankar demonstrates how to architect pipelines that extract, classify, and aggregate structured data from that unstructured text accurately and efficiently.

You'll learn from

Shreya Shankar

ML Systems Researcher Making AI Evaluation Work in Practice

Shreya builds open-source systems for AI-powered data processing. She is a final-year PhD student at UC Berkeley and the creator of DocETL, an open-source system for analyzing unstructured text at scale. DocETL has been deployed across journalism, law, medicine, policy, finance, and urban planning. Her research has been published at top computer science venues including VLDB, SIGMOD, and UIST (including a Best Paper award). Before her PhD, Shreya worked as a machine learning and data engineer at startups. She holds a BS in Computer Science from Stanford University.
