FeatherCast

The voice of The Apache Software Foundation

Driving dynamic Beam pipelines Alex Van Boxel

September 13, 2019
timothyarthur

Using Apache Beam to get data into your data lake? In an agile company you don’t want to recompile your ingestion pipeline every time a sprint finishes. In this talk we go over all the mechanisms and building blocks you need to make dynamic pipelines really work.
We’ll see why schemas are so important, how to get these schemas into our pipelines, and which methods protect us from data corruption and incompatible schema evolution.
New features such as schema-aware PCollections get a thorough deep dive, and finally we go over real-world examples and position Apache Beam in the new PLT (Push Load Transform) world.
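
To make "incompatible schema evolution" concrete, here is a minimal sketch in plain Python. It is not taken from the talk and does not use the Beam API. The `Field` type, the `incompatible_changes` helper, and the three rules it applies (a removed field, a changed type, or a newly added required field counts as breaking) are all assumptions chosen for illustration; a real pipeline would follow the rules of its own schema registry or serialization format.

```python
from typing import Dict, List, NamedTuple


class Field(NamedTuple):
    """Hypothetical field description: a type name plus a required flag."""
    type_name: str
    required: bool


# Hypothetical schema representation: field name -> Field.
Schema = Dict[str, Field]


def incompatible_changes(old: Schema, new: Schema) -> List[str]:
    """List the changes from `old` to `new` that this sketch treats as breaking.

    Assumed rules (illustrative only):
      * removing an existing field is breaking,
      * changing the type of an existing field is breaking,
      * adding a new *required* field is breaking,
      * adding a new optional field is allowed.
    """
    problems: List[str] = []
    for name, field in old.items():
        if name not in new:
            problems.append(f"field removed: {name}")
        elif new[name].type_name != field.type_name:
            problems.append(
                f"type changed: {name} {field.type_name} -> {new[name].type_name}"
            )
    for name, field in new.items():
        if name not in old and field.required:
            problems.append(f"new required field: {name}")
    return problems


if __name__ == "__main__":
    v1 = {"id": Field("int64", True), "name": Field("string", False)}
    v2 = {**v1, "email": Field("string", False)}  # adds an optional field only
    print(incompatible_changes(v1, v2))
```

A dynamic ingestion pipeline could run a check like this whenever a new schema version arrives and refuse to load data if the returned list is not empty, rather than finding out about the mismatch after the data has landed in the lake.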
