Hadoop Player MapR Wraps in SQL and JSON with Apache Drill 1.6

by Ostatic Staff - Apr. 07, 2016

MapR Technologies, which offers a popular distribution of Apache Hadoop that integrates web-scale enterprise storage and real-time database capabilities, announced the availability of Apache Drill 1.2 in its distribution back in October of last year. The company also announced a new Data Exploration Quick Start Solution, and had previously wrapped Apache Spark into its platform. Steadily, primarily through embracing cutting-edge open source tools, MapR has been building out what it refers to as a fully "converged data platform."

Now, the company has announced the availability of Apache Drill 1.6 as the unified SQL layer for the MapR Converged Data Platform via tighter integration with MapR-DB.  Users can leverage reporting and analytics on JSON data stored in MapR-DB tables, potentially realizing faster time-to-value with insights gleaned from operational data. 

According to Hadoop Weekly, “The Apache Drill project has one of the fastest release velocities in the Hadoop ecosystem with a new release nearly every month.” Version 1.6 of Apache Drill, which is now available on the MapR Converged Data Platform, offers a new MapR-DB document database plugin, enhanced performance and scale, and an optimized experience with Tableau and other BI tools.

When we talked with MapR officials a few months back, they discussed Drill:

"Drill’s unique value comes from its capability to query data without requiring pre-defined schemas. This not only allows for instant querying on newly-ingested data in Hadoop but also avoids the constant maintenance associated with evolving schema requirements for diverse data types. No ETL process or DBA intervention is required at any stage of the data lifecycle.  That said, Drill can also leverage any defined schema in the Hive metastore."

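As a rough illustration of the schema-free querying described above, the following Python sketch prepares a SQL statement for Drill's REST interface, which accepts a JSON body with `queryType` and `query` fields. The host, port, file path, and field names are assumptions made up for this example and do not come from MapR; the query is only sent if `run_query` is called against a running Drillbit.

```python
import json
import urllib.request

# Assumption: a Drillbit running locally with its web port at the usual default.
DRILL_REST_URL = "http://localhost:8047/query.json"

# Hypothetical JSON file and field names. No table definition or schema is
# declared beforehand; the nested field is addressed through the table alias.
EXAMPLE_SQL = (
    "SELECT t.customer.name AS name, t.amount "
    "FROM dfs.`/data/orders.json` AS t "
    "WHERE t.amount > 100"
)


def build_drill_request_body(sql: str) -> bytes:
    """Encode a SQL statement as the JSON body for Drill's REST query endpoint."""
    payload = {"queryType": "SQL", "query": sql}
    return json.dumps(payload).encode("utf-8")


def run_query(sql: str, url: str = DRILL_REST_URL) -> dict:
    """POST the query to Drill and return the parsed JSON response."""
    request = urllib.request.Request(
        url,
        data=build_drill_request_body(sql),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `run_query(EXAMPLE_SQL)` would submit the statement to the assumed local Drillbit; the point of the sketch is that the JSON file is queried in place, with no schema registered first.
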
Thousands of users have downloaded Drill and numerous organizations have it in production. MapR also claims that 6,000 BI analysts and developers worldwide have completed Drill training courses through a free On-Demand Training program from the company.

“Apache Drill is a game changer for us,” said Edmon Begoli, CTO of PYA Analytics. “Most recently, we have been able to query, in under 60 seconds, two years' worth of flat PSV files of claims, billing, and clinical data from commercial and government entities, such as the Centers for Medicaid and Medicare Services. Drill has allowed us to bypass the traditional approach of ETL and data warehousing, convert flat files into efficient formats such as Parquet for improved performance, and use plain SQL against very large volumes of files.”

Highlights of Drill 1.6 include:

Flexible and operational analytics on NoSQL – The new MapR-DB document database plugin allows analysts to perform SQL queries directly on JSON data stored in MapR-DB tables. There are a variety of pushdown capabilities available with this plugin to provide optimal interactive experience.

Enhanced query performance – Provides better query performance on data in Hadoop and NoSQL systems via numerous query planning improvements, such as partition pruning, metadata caching and other optimizations. Delivers gains of up to 10-60X in query planning performance compared with previous releases of Drill.

Better memory management – Delivers greater stability and scale, which enables customers to run both larger and more numerous SQL workloads on a MapR cluster.

Improved integration with visualization tools like Tableau – Offers metadata query performance improvements and introduces client impersonation for end-to-end security from the visualization tool to data in Hadoop.  Version 1.6 also provides enhanced SQL Window functions.
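
To give a sense of the kind of query a BI tool might issue when combining SQL on JSON documents with window functions, here is a small Python sketch that assembles a top-N-per-group statement using `ROW_NUMBER() OVER`. The table path `/tables/sales` and the column names are hypothetical, and the helper function is an illustration only, not part of Drill or of any MapR tooling.

```python
def top_n_per_group_sql(table, columns, partition_col, order_col, n):
    """Build a SQL statement returning the top n rows per partition_col,
    ranked by order_col in descending order, using a window function."""
    inner_cols = ", ".join(f"t.{c}" for c in columns)
    outer_cols = ", ".join(f"s.{c}" for c in columns)
    return (
        f"SELECT {outer_cols} FROM ("
        f"SELECT {inner_cols}, "
        f"ROW_NUMBER() OVER (PARTITION BY t.{partition_col} "
        f"ORDER BY t.{order_col} DESC) AS rn "
        f"FROM {table} AS t"
        f") AS s WHERE s.rn <= {int(n)}"
    )


# Hypothetical table path and columns, chosen only for this example.
sql = top_n_per_group_sql(
    table="dfs.`/tables/sales`",
    columns=["region", "rep", "total"],
    partition_col="region",
    order_col="total",
    n=3,
)
print(sql)
```

The generated statement ranks rows within each region and keeps the three highest totals, which is a common reporting pattern for dashboards.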

“Operational analytics on document databases such as MapR-DB is a rapidly growing use case,” said Neeraja Rentachintala, senior director, Product Management, MapR Technologies. “For the first time, there is a stack that allows BI developers and business analysts to store and query data in native formats without cumbersome ETL or transformation, providing end-to-end flexibility and scale.”

You can learn more about the Drill 1.6 product and its key features here.