IBM Makes Giant Commitment to Apache Spark
Folks everywhere in the Big Data and Hadoop communities are becoming increasingly interested in Apache Spark, an open source cluster computing framework for data analytics originally developed in the AMPLab at UC Berkeley. We covered updates to Spark just last week. Now, IBM has announced a major commitment to Apache Spark, billing it as "potentially the most important new open source project in a decade that is being defined by data."
Need evidence of Big Blue's commitment? The company plans to embed Spark into its Analytics and Commerce platforms, and to offer Spark as a service on IBM Cloud. IBM will also put more than 3,500 IBM researchers and developers to work on Spark-related projects at more than a dozen labs worldwide; donate its IBM SystemML machine learning technology to the Spark open source ecosystem; and educate more than one million data scientists and data engineers on Spark. Wow.
IBM is tying its Spark initiatives to the rise of the Internet of Things (IoT), too. As data and analytics are embedded into all kinds of objects and apps as part of the IoT push, IBM claims that "Spark brings essential advances to large-scale data processing." The company says Spark dramatically improves the performance of data-dependent apps and radically simplifies the process of developing intelligent, data-driven apps.
"IBM has been a decades long leader in open source innovation. We believe strongly in the power of open source as the basis to build value for clients, and are fully committed to Spark as a foundational technology platform for accelerating innovation and driving analytics across every business in a fundamental way," said Beth Smith, General Manager, Analytics Platform, IBM Analytics. "Our clients will benefit as we help them embrace Spark to advance their own data strategies to drive business transformation and competitive differentiation."
Interestingly, IBM, NASA, and the SETI Institute are collaborating to analyze terabytes of complex deep space radio signals using Spark's machine learning capabilities in a hunt for patterns that might betray the presence of intelligent extraterrestrial life. You can find out more about that effort and what IBM's partners are doing with Spark here.
IBM's announcement looks like one of the biggest commitments to Spark yet seen, and the whole ecosystem surrounding Spark is likely to advance thanks to the money Big Blue is spending and the development work it is promising.