FeatherCast

The voice of The Apache Software Foundation

Ozone: Evolving HDFS Scalability to new heights & built-in GDPR Compliance - Dinesh Chitlangia

September 12, 2019
timothyarthur

Apache Hadoop Ozone is a robust, distributed key-value object store for Hadoop with a layered architecture and strong consistency. It separates namespace management from the block and node management layer, which allows users to scale independently on both axes. Ozone is interoperable with the Hadoop ecosystem: it provides OzoneFS (a Hadoop-compatible file system API), data locality, and plug-and-play deployment alongside HDFS, since it can be installed in an existing Hadoop cluster and can share storage disks with HDFS. Ozone solves the scalability challenges of HDFS by being size agnostic. Consequently, it allows users to store trillions of files in Ozone and access them as if they were on HDFS. Ozone plugs into existing Hadoop deployments seamlessly, and programs like YARN, MapReduce, Spark, and Hive work without any modifications. In an era of increasing need for data privacy and regulation, Ozone also aims to provide built-in support for GDPR compliance, with a strong focus on the Right to be Forgotten, i.e., data erasure.

At the end of this presentation, the audience will understand:
1. Overview of current challenges with HDFS scalability
2. How Ozone's architecture solves these challenges
3. Overview of GDPR
4. Built-in support for GDPR in Ozone
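
The separation of namespace management from block management described in the abstract can be pictured with a small toy model. This is only an illustrative sketch, not Ozone's actual code or API: the class names `BlockLayer` and `NamespaceLayer`, their methods, and the 4-byte block size are all hypothetical. The point is that the namespace layer holds only key names and block IDs, while the block layer holds only block IDs and bytes, so in principle each side can grow independently of the other.

```python
from dataclasses import dataclass, field


@dataclass
class BlockLayer:
    """Hypothetical block layer: knows block IDs and bytes, never key names."""
    blocks: dict = field(default_factory=dict)
    next_id: int = 0

    def allocate(self, data: bytes) -> int:
        # Store one chunk of data and hand back an opaque block ID.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = data
        return block_id

    def read(self, block_id: int) -> bytes:
        return self.blocks[block_id]

    def free(self, block_id: int) -> None:
        self.blocks.pop(block_id, None)


@dataclass
class NamespaceLayer:
    """Hypothetical namespace layer: maps key names to lists of block IDs."""
    block_layer: BlockLayer
    keys: dict = field(default_factory=dict)

    def put(self, name: str, data: bytes, block_size: int = 4) -> None:
        # Overwriting a key first releases the blocks it used before.
        self.delete(name)
        self.keys[name] = [
            self.block_layer.allocate(data[i:i + block_size])
            for i in range(0, len(data), block_size)
        ]

    def get(self, name: str) -> bytes:
        # Reassemble the value by reading each block in order.
        return b"".join(self.block_layer.read(b) for b in self.keys[name])

    def delete(self, name: str) -> None:
        # Remove the name and release every block it pointed to.
        for block_id in self.keys.pop(name, []):
            self.block_layer.free(block_id)
```

In this sketch, deleting a key removes its name from the namespace and releases its blocks in the block layer, so a delete has to reach both layers before the data is actually gone.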
