The voice of The Apache Software Foundation

Building Data Platform for your Next Meetup Event with Apache Foundation on Cloud

Chengzhi Zhao

September 12, 2019

One of the challenges at Meetup is how to build a scalable, reliable, and efficient data platform that helps our ML team build models that recommend events that fit your interests. With the emergence of sophisticated batch and streaming frameworks and cloud solutions, our data platform has gone through massive changes in the past two years. In this talk, I’ll discuss how the Meetup data platform has evolved in its use of Apache-based data systems, including Sqoop, Hive, Flume, Spark, Flink, Beam, and Airflow. I’ll talk about the architecture changes to our batch and stream pipeline solutions and the pros and cons of moving the data platform 100% to the cloud. I’ll also share some lessons we learned and best practices for building distributed systems for a data platform, and how the data platform collaborates with the machine learning and data science teams.
