FeatherCast

The voice of The Apache Software Foundation

Apache Big Data Seville 2016 – Mining and Identifying Security Threat Using Spark SQL, HBase and Solr – Manidipa Mitra

December 5, 2016
asfinfra


This presentation describes how to design a scalable, high-performance distributed system that detects identity theft and fraud by mining billions of share-holding records for a leading financial organization. It also covers how terabytes of data can be migrated from Oracle to Hadoop, stored in Parquet format, processed in a distributed computing framework with Spark DataFrames, and pushed to different service layers (HBase, Impala, Solr, HDFS) depending on the query and access pattern. Finally, the design shows how frequent transactions were handled and how data was pre-processed at the end of each day to meet a response-time SLA measured in seconds, while generating thousands of reports from millions of records within minutes.
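To make the idea of choosing a service layer by access pattern more concrete, here is a minimal Python sketch. It is not taken from the talk: the access-pattern categories, the mapping of each pattern to a store, and the dataset names are all assumptions chosen for illustration, based only on how these systems are commonly used (HBase for key lookups, Solr for text search, Impala for interactive SQL, HDFS for batch scans).

```python
from enum import Enum
from typing import Dict, List


class AccessPattern(Enum):
    """Hypothetical access-pattern categories (not from the talk)."""
    KEY_LOOKUP = "key_lookup"    # fetch a single record by its key
    TEXT_SEARCH = "text_search"  # free-text or faceted search
    AD_HOC_SQL = "ad_hoc_sql"    # interactive analytical queries
    BATCH_SCAN = "batch_scan"    # full scans by batch jobs


# Assumed mapping from access pattern to service layer. The talk names
# these four stores but does not spell out which pattern goes where.
SERVICE_LAYER: Dict[AccessPattern, str] = {
    AccessPattern.KEY_LOOKUP: "HBase",
    AccessPattern.TEXT_SEARCH: "Solr",
    AccessPattern.AD_HOC_SQL: "Impala",
    AccessPattern.BATCH_SCAN: "HDFS",
}


def choose_service_layer(pattern: AccessPattern) -> str:
    """Return the store assumed to serve the given access pattern."""
    return SERVICE_LAYER[pattern]


def plan_exports(datasets: Dict[str, AccessPattern]) -> Dict[str, List[str]]:
    """Group dataset names by the service layer each would be pushed to."""
    plan: Dict[str, List[str]] = {}
    for name, pattern in datasets.items():
        plan.setdefault(choose_service_layer(pattern), []).append(name)
    return plan


if __name__ == "__main__":
    # Dataset names below are hypothetical placeholders.
    example = {
        "holder_profiles": AccessPattern.KEY_LOOKUP,
        "transaction_notes": AccessPattern.TEXT_SEARCH,
        "daily_reports": AccessPattern.AD_HOC_SQL,
        "raw_history": AccessPattern.BATCH_SCAN,
    }
    for layer, names in plan_exports(example).items():
        print(layer, names)
```

In a real pipeline, the processing step (for example a Spark job over the Parquet data) would run before a plan like this is applied, and each store would need its own writer. The sketch only illustrates the routing decision.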

More information about this talk
