The voice of The Apache Software Foundation

Building Data Platform for your Next Meetup Event with Apache Foundation on Cloud

Chengzhi Zhao

September 12, 2019

One of the challenges at Meetup is building a scalable, reliable, and efficient data platform that helps our ML team build models that recommend events that fit your interests. With the emergence of sophisticated batch and streaming frameworks and cloud solutions, our data platform has gone through massive changes in the past two years. In this talk, I’ll discuss how the Meetup data platform has evolved in its use of Apache-based data systems, including Sqoop, Hive, Flume, Spark, Flink, Beam, and Airflow. I’ll talk about the architecture changes to our batch and streaming pipeline solutions and the pros and cons of moving the data platform 100% to the cloud. I’ll also share some lessons we learned, best practices for building distributed systems for a data platform, and how the data platform collaborates with the machine learning and data science teams.
