FeatherCast

The voice of The Apache Software Foundation

Running Apache Flink and Apache Beam on Kubernetes Micah Wylde

September 13, 2019
timothyarthur

Access to real-time data is increasingly important for many organizations. At Lyft, we process millions of events per second in real-time to compute prices, balance marketplace dynamics, detect fraud, among many other use cases. To do so, we run dozens of Apache Flink and Apache Beam pipelines. Flink provides a powerful framework that makes it easy for non-experts to write correct, high-scale streaming jobs, while Beam extends that power to our large base of Python programmers.
Historically, we have run Flink clusters on bare, custom-managed EC2 instances. In order to achieve greater elasticity and reliability, we decided to rebuild our streaming platform on top of Kubernetes. In this session, I’ll cover how we designed and built an open-source Kubernetes operator for Flink and Beam, some of the unique challenges of running a complex, stateful application on Kubernetes, and some of the lessons we learned along the way.

Leave a Reply

Required fields are marked *.

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Blog at WordPress.com.
%d bloggers like this: