Apache Big Data Seville 2016 – Mining and Identifying Security Threat Using Spark SQL, HBase and Solr – Manidipa Mitra

This presentation describes how to design a highly effective, scalable, and performant distributed system that detects identity theft and fraud by mining billions of shareholding records for a leading financial organization. It also discusses how terabytes of data can be migrated from Oracle to Hadoop, stored in Parquet format, processed in a distributed computing framework with Spark DataFrames, and pushed to different service layers (HBase, Impala, Solr, HDFS) depending on the query and access pattern. The design also sheds light on how frequent transactions were handled and how data was pre-processed at the end of each day to meet a response-time SLA of seconds, producing thousands of reports by mining millions of records within minutes.

More information about this talk
