The voice of The Apache Software Foundation

Apache Big Data Seville 2016 – Smart Storage Management: Towards Higher HDFS Storage Efficiency – Wei Zhou

December 9, 2016

Smart Storage Management: Towards Higher HDFS Storage Efficiency – Wei Zhou

Data volumes of all kinds have increased dramatically in recent years, and new storage devices (NVMe SSD, flash SSD, etc.) can be used to improve data access performance. HDFS provides mechanisms such as HDFS Cache, Heterogeneous Storage Management (HSM) and Erasure Coding (EC) to support this, but defining and adjusting storage strategies for different data in a dynamic environment remains a big challenge.

To overcome this challenge and improve the storage efficiency of HDFS, we will introduce a comprehensive solution, Smart Storage Management (SSM), in Apache Hadoop. SSM collects HDFS operation data and system state information from the cluster. From these metrics it can extract "data access patterns", and based on those patterns it automatically combines these storage features to optimize HDFS storage efficiency.
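As a rough illustration of the idea in the abstract above (collect access metrics, derive a pattern, pick a storage action), here is a minimal Python sketch. It is not SSM's actual design or API: the function names, the thresholds, and the action labels (`cache`, `erasure_code`, `keep`) are all hypothetical and chosen only to show the shape of such a rule.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical thresholds; the abstract gives no concrete numbers.
HOT_MIN_READS = 100   # files read at least this often are treated as hot
COLD_MAX_READS = 1    # files read at most this often are treated as cold


def count_reads(access_log: Iterable[str]) -> Counter:
    """Count read events per file path from a stream of accessed paths."""
    return Counter(access_log)


def choose_action(reads: int) -> str:
    """Map a read count to a (hypothetical) storage action label."""
    if reads >= HOT_MIN_READS:
        return "cache"          # candidate for HDFS Cache or faster media
    if reads <= COLD_MAX_READS:
        return "erasure_code"   # candidate for Erasure Coding to save space
    return "keep"               # leave the current storage strategy alone


def plan(access_log: Iterable[str], all_files: Iterable[str]) -> Dict[str, str]:
    """Build a per-file action plan from the observed access pattern."""
    reads = count_reads(access_log)
    # Counter returns 0 for paths that never appear in the log.
    return {path: choose_action(reads[path]) for path in all_files}
```

A real system would look at more than raw read counts (for example file age, size, and recency of access), and would re-evaluate the plan as the access pattern changes.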

More information about this talk

ApacheCon NA 2016: Zhe Zhang – HDFS

April 21, 2016

Zhe Zhang will be speaking at ApacheCon North America in Vancouver in just about 2 weeks. He’ll be speaking about HDFS Erasure Coding in data storage for Apache Hadoop and other projects.


