FeatherCast

The voice of The Apache Software Foundation

What’s Surprising about Apache Drill and Why That’s a Challenge

Ellen Friedman

September 12, 2019
timothyarthur

‘Apache Drill has some very surprising characteristics and, more importantly, it enables Drill users to do some surprising things. It’s no longer surprising to be able to do standard SQL in a highly distributed and large scale system – there is an entire class of modern tools that do this including Apache Hive, Presto or Spark SQL. But Drill has other capabilities that are surprising and make it stand apart from its class.

For one thing, Drill provides an extraordinary degree of flexibility in several ways, including:

- Support for a wide variety of file formats including semi-structured and nested data (such as Parquet, JSON, Avro) and non-file data sources – not just data access but ability to fully use these data sources with high performance
- Schema discovery – a capability that opens up data exploration in unexpected ways for Drill users and allows progressive data modelling
- Easy extensibility with high performance – you don’t have to trade one for the other

These are valuable if surprising capabilities. Why, then, is that a challenge? Because people don’t expect them, they also may not come looking for a tool that can do these things. The challenge comes in how to make potential users aware of the opportunities that Drill offers.

This talk will explore some of Drill’s surprising capabilities, how it’s able to do these things and what impact that has for Drill users. In addition, we will open a discussion about how best to inform and engage a broader user community. This latter issue is not only important for Drill but for other Apache projects as well.’
