Apache Spark Uniquely Powers Splice Machine Database for In-Memory Tasks

by Ostatic Staff - Nov. 18, 2015

Splice Machine has announced the 2.0 version of its RDBMS, which it bills as "the first hybrid in-memory RDBMS powered by Hadoop and Spark." Apache Spark has rocketed to success as an in-memory data processing framework that is now widely used with Hadoop, and it's also becoming a hub around which other data-processing tools work.

Splice Machine’s version 2.0 aims to be a database solution that combines the scalability of Hadoop, ANSI SQL, ACID transactions, and the in-memory performance of Spark. Splice claims that its new database lets businesses run OLAP and OLTP workloads simultaneously and delivers 10-20X the performance of traditional database systems, such as Oracle and MySQL, at one-fourth the cost.

As Computerworld notes:

"Splice Machine originally made a name for itself as a replacement for multiterabyte workloads on conventional ACID RDBMS solutions like Oracle. The company claimed it enabled workloads for one former Oracle customer to run an order of magnitude faster, and Hadoop's native scale-out architecture meant the solution could grow with the size of workloads at a lower cost than with a conventional RDBMS."

Unlike in-memory-only databases, the Splice Machine RDBMS aims not to force companies to keep all of their data in memory, which can become prohibitively expensive as data volume grows. It uses in-memory computation to materialize the intermediate results of long-running queries but relies on HBase to store and access data at scale.
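To make that division of labor concrete, here is a minimal toy sketch in Python. It is not Splice Machine's code or API: the `HybridStore` class, its method names, and the `orders` table are all hypothetical, and Python's built-in `sqlite3` module merely stands in for the scale-out store (HBase in the article). The only point it illustrates is that rows live in the backing store while just the intermediate results of a query are held in memory.

```python
import sqlite3


class HybridStore:
    """Toy model (hypothetical): rows live in a backing store, and only
    intermediate query results are kept in memory."""

    def __init__(self, path=":memory:"):
        # sqlite3 is a stand-in for the scale-out store; pass a file path
        # for on-disk storage (":memory:" is used here only for the demo).
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS orders (region TEXT, amount REAL)"
        )
        # In-memory materialized intermediate results, keyed by name.
        self.intermediate = {}

    def insert(self, region, amount):
        # Transactional write goes straight to the backing store.
        with self.db:
            self.db.execute(
                "INSERT INTO orders VALUES (?, ?)", (region, amount)
            )
        # Any cached intermediate result is now stale, so drop it.
        self.intermediate.clear()

    def totals_by_region(self):
        # Materialize the aggregate once and reuse it from memory.
        if "totals" not in self.intermediate:
            rows = self.db.execute(
                "SELECT region, SUM(amount) FROM orders GROUP BY region"
            )
            self.intermediate["totals"] = dict(rows)
        return self.intermediate["totals"]

    def top_region(self):
        # A later query step that reuses the in-memory intermediate result.
        totals = self.totals_by_region()
        return max(totals, key=totals.get)


store = HybridStore()
store.insert("east", 10.0)
store.insert("west", 25.0)
store.insert("east", 5.0)
print(store.totals_by_region())
print(store.top_region())
```

In this sketch the full data set never has to fit in memory; only the per-region totals do. A real system would of course need far more careful invalidation and concurrency control than clearing a dictionary on every write.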

“By delivering an affordable, fully operational platform that is designed to support OLTP and OLAP workloads concurrently, Splice Machine 2.0 offers a unique and powerful way for businesses to perform real-time analytics and operational queries together without sacrificing performance or breaking the bank,” said Charles Zedlewski, Vice President, Products at Cloudera. “As more customers start to run Spark on Cloudera’s platform, Splice Machine’s integration complements the analytical capabilities of our enterprise data hubs, enabling customers across a variety of industries to handle all types of workloads with greater efficiency.”

Among other things, this new database offering is evidence of just how much influence Apache Spark has. IBM has made a multi-billion-dollar commitment to Spark, and the future of Hadoop appears to be tied to Spark as well.