FeatherCast

The voice of The Apache Software Foundation

Data Movement & Integration at PayPal & LinkedIn using Apache Gobblin

Jay Sen, Sudarshan Vasudevan

September 13, 2019
timothyarthur

Data replication at PayPal drives a range of business use cases, from fraud detection, user behavioral analysis, and credit checks to many other offline business decisions. In this talk, we present how Apache Gobblin powers data movement and integration at PayPal, in partnership with LinkedIn, and showcase recent features as well as the planned roadmap for the platform. Apache Gobblin is a distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization, and lifecycle management for both streaming and batch data ecosystems. In the second half of the presentation, we cover recent additions to Gobblin, including: 1. a new declarative approach for defining data pipelines using Gobblin-as-a-Service, and 2. real-world experiences running hybrid batch and streaming pipelines using Gobblin.
