Apache Big Data Seville 2016 – Smart Storage Management: Towards Higher HDFS Storage Efficiency – Wei Zhou

Data volumes of all kinds have increased dramatically in recent years, and new storage devices (NVMe SSD, flash SSD, etc.) can be used to improve data access performance. HDFS provides mechanisms such as HDFS Cache, Heterogeneous Storage Management (HSM), and Erasure Coding (EC) to take advantage of them, but it remains a big challenge to define and adjust storage strategies for different data in a dynamic environment.

To overcome this challenge and improve the storage efficiency of HDFS, we will introduce a comprehensive solution, Smart Storage Management (SSM), in Apache Hadoop. SSM collects HDFS operation data and system state information from the cluster. From these metrics it extracts "data access patterns", and based on those patterns it automatically applies HDFS Cache, HSM, and EC to optimize HDFS storage efficiency.
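
The idea of turning collected metrics into a storage decision can be illustrated with a small sketch. This is not SSM's actual algorithm, which the abstract does not describe: the function names, the `(path, operation)` log format, and the read-count thresholds are all hypothetical. Only the policy names (`ALL_SSD`, `HOT`, `COLD`) are built-in HDFS storage policies.

```python
from collections import Counter

# Hypothetical thresholds, chosen for illustration only;
# they do not come from the talk or from SSM.
HOT_MIN_READS = 100
WARM_MIN_READS = 10


def count_reads(access_log):
    """Count read events per path from (path, operation) tuples."""
    return Counter(path for path, op in access_log if op == "read")


def suggest_policy(read_count):
    """Map a read count to the name of a built-in HDFS storage policy."""
    if read_count >= HOT_MIN_READS:
        return "ALL_SSD"  # frequently read: keep all replicas on SSD
    if read_count >= WARM_MIN_READS:
        return "HOT"      # moderately read: default policy, replicas on disk
    return "COLD"         # rarely read: archival storage


def plan(access_log, all_paths):
    """Suggest a storage policy for every known path.

    Paths that never appear in the log get a read count of 0.
    """
    reads = count_reads(access_log)
    return {path: suggest_policy(reads[path]) for path in all_paths}
```

A suggested policy could then be applied by hand with `hdfs storagepolicies -setStoragePolicy -path <path> -policy <policy>`. A real system would presumably weigh more signals than raw read counts, such as file age, file size, and available capacity per storage type.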

