AtScale Benchmark Quantifies Business Intelligence Queries on Hadoop

by Ostatic Staff - Feb. 26, 2016

How long are users willing to wait for answers to the queries they throw at today's Big Data tools? Not long, according to AtScale, which provides business users with Business Intelligence solutions on Hadoop. The company has released the results of a comprehensive Business Intelligence benchmark for SQL-on-Hadoop engines. The full benchmark results can be viewed for free at www.atscale.com/benchmark.

The benchmark tested the industry’s top SQL-on-Hadoop engines against key Business Intelligence (BI) use case queries. It rates the strengths and weaknesses of each engine and identifies which ones are best suited to various scenarios.

“We used real-world enterprise experience to produce a document that every technical evaluator can use as part of their evaluation process,” says Josh Klahr, VP of Product Management at AtScale.

Some surprising findings that surfaced include:

While Hive is generally a default for SQL-on-Hadoop, it doesn’t perform well on its own in every scenario.

While Cloudera Impala is known as a strong player when it comes to SQL-on-Hadoop, the benchmark study found “winners” varied depending on the type of query, size of data and other factors. Each engine has its own ‘sweet spot’ and the study reveals which engine is best for different scenarios.

The upgrades to Spark announced recently made a big difference in performance on smaller data sets.

“This benchmark will provide a useful data point for those assessing business intelligence workloads on Hadoop,” said Tom Pringle, Head of Applications Research at Ovum. “We’ve seen an increase in adoption of Hadoop, and most often the focus has been on storage and scale-out capabilities of the new platform. As more organizations consider analytical workloads on Hadoop, it will be important that they assess the capabilities of SQL-on-Hadoop solutions.”

As indicated in the latest Hadoop Maturity Survey, Business Intelligence is now a top workload for Hadoop, ahead of Data Science and ETL. The BI-on-Hadoop benchmark paper details the methodology and framework used for the study. The full document can be viewed for free at www.atscale.com/benchmark.

At one point, the Big Data trend--sorting and sifting large data sets with new tools to surface meaningful angles on stored information--was an enterprise-only story, but now businesses of all sizes are evaluating tools that can help them glean insights from the data they store. AtScale recently came out of stealth mode with its focus on BI and Hadoop, and you can find out more about the company here.