FeatherCast

The voice of The Apache Software Foundation

Apache Big Data Seville 2016 – Hive 2.0 SQL, Speed, Scale – Alan Gates

Hive 2.0 SQL, Speed, Scale – Alan Gates

Apache Hive is the most commonly used SQL interface for Hadoop. To meet users' data warehousing needs, it must scale to petabytes of data, provide the necessary SQL, and perform in interactive time. The Hive community has produced a 2.0 release of Hive that includes significant improvements. These include:

* LLAP, a daemon layer that enables sub-second response time.
* HBase to store Hive's metadata, resulting in significantly reduced planning time.
* Using Apache Calcite to build a cost-based optimizer.
* Adding procedural SQL.
* Improvements in using Spark as an engine for Hive execution.

This talk will cover the use cases these changes enable, the architectural changes being made in Hive as part of building these features, and share performance test results on how these improvements are speeding up Hive.


Apache Big Data: Keynote: Training Our Team in the Apache Way – Alan Gates

Keynote: Training Our Team in the Apache Way – Alan Gates

Hortonworks contributes to a number of Apache projects. When we started, we depended on our many experienced Apache community members to train their fellow Hortonworkers in the Apache Way. But we grew quickly, and this approach started to break down. So we have instituted training for our teams in what Apache is, how it works, their responsibilities as part of Apache and how those mesh with their responsibilities as Hortonworkers, and a practical list of dos and don’ts. This talk will share some thoughts on the need for this training, give an overview of the content, and review some early results.

