Using Apache Beam to get data into your data lake? In an agile company you don’t want to recompile your ingestion pipeline every time a sprint finishes. In this talk we go over the mechanisms and building blocks you need to make dynamic pipelines really work.
We’ll see why schemas are so important, how to get these schemas into our pipelines, and which methods protect us from data corruption and incompatible schema evolution.
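To make "incompatible schema evolution" concrete, here is a minimal sketch in plain Python of a compatibility check that could run before a new schema version is accepted. It does not use the Beam API. The `Field` class, the `incompatibilities` function, and the three rules in it (types must not change, added fields need a default, dropped fields are tolerated) are assumptions chosen for illustration, loosely in the spirit of backward-compatibility checks, and are not taken from the talk.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Field:
    """Hypothetical field description: a type name and whether a default exists."""
    type: str
    has_default: bool = False


# A schema is modeled as a mapping from field name to field description.
Schema = Dict[str, Field]


def incompatibilities(old: Schema, new: Schema) -> List[str]:
    """List reasons why records written with `old` cannot be read with `new`.

    Assumed rules (illustrative only):
      1. A field present in both schemas must keep its type.
      2. A field added in `new` must have a default value.
      3. A field dropped in `new` is tolerated (the reader ignores it).
    """
    problems = []
    for name, new_field in new.items():
        old_field = old.get(name)
        if old_field is None:
            if not new_field.has_default:
                problems.append(f"added field '{name}' has no default")
        elif old_field.type != new_field.type:
            problems.append(
                f"field '{name}' changed type {old_field.type} -> {new_field.type}"
            )
    return problems


v1 = {"id": Field("int64"), "name": Field("string")}
# Adds an optional field: old records can still be read.
v2_ok = {**v1, "email": Field("string", has_default=True)}
# Changes a type and adds a required field: old records become unreadable.
v2_bad = {"id": Field("string"), "name": Field("string"), "phone": Field("string")}

ok_problems = incompatibilities(v1, v2_ok)
bad_problems = incompatibilities(v1, v2_bad)
```

A pipeline that loads schemas at runtime could run a check like this when a new version arrives and reject the version if the list is non-empty, instead of discovering the mismatch as corrupted rows in the data lake.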
New features such as schema-aware PCollections get a thorough deep dive, and finally we go over real-world examples and position Apache Beam in the new PLT (Push Load Transform) world.
Driving dynamic Beam pipelines
Alex Van Boxel
September 13, 2019