Q&A: MapR's Jack Norris on the Impact of Microservices
It’s easy to underestimate the impact of microservices on today’s technology infrastructure. The evolution of microservices started as a bit of a backlash against the complexity of SOA and ESB practices. Over the last decade, there has been a strong movement toward a more flexible style of building large systems. The idea behind microservices is simple: larger systems can be built by decomposing their functions into relatively simple, single-purpose services that communicate via lightweight and simple techniques.
MapR SVP Jack Norris is an expert on microservices. We caught up with him for an interview. Here are his thoughts.
Some businesses are using microservices to prevent large-scale failures by isolating problems. How does such an approach work, and what are the benefits to the business?
A microservices approach aligns well with a typical big data deployment. In the past, bigger and faster servers were the core of the typical architecture paradigm. These days, you gain cost-effective scaling by deploying solutions across many commodity servers. In this configuration, you also gain modularity and extensive parallelism. Microservices offer the same benefits. They are more cost-effective because they are easier to build and maintain, and they also promote modularity that can help prevent large-scale failures. The modularity not only makes problems easier to identify, but it also helps to avoid single points of failure.
This is important for businesses today because they are realizing that as they mature in their big data journey, more focus turns to getting the infrastructure right. This starts with how data is stored and managed. This means that core principles around performance, scale, reliability, and ease-of-use will dominate the priorities. Microservices complement these priorities with efficiency advantages.
What are some other key ways that microservices are impacting big data architectures?
Microservices also promote agility and faster time to value. Big data should be put to work across an expanding number of use cases, so it’s important to have a strategy that lowers the effort of deploying new applications. A microservices architecture lets you more easily process and analyze data in parallel pipelines, which lets you investigate new opportunities while keeping your current system in production. You can continually make changes to your environment without disrupting current operations and more quickly identify new value from your data.
How has the move toward streaming data, and away from batch processing, influenced approaches to managing data and infrastructure?
Many organizations are embracing a “stream first” paradigm in which new data initiatives are based on real-time streaming data. Both the Lambda Architecture and the Kappa Architecture are early examples of the popularity of this type of data management strategy. The use of streams continues to evolve. Not only are streams good for real-time processing, but they lead to more flexible ways to manage big data. Microservices are a perfect example. In such an environment, a publish-subscribe framework acts as the lightweight communications system between the componentized applications. Organizations are also using streams as the system of record so that different materialized views of data can be created from a single “golden copy” of immutable data. If any of the views need to be updated due to business changes or even coding errors, the master stream is used to recreate the view. Streams are also an ideal way to handle many machine learning environments where new models are continually developed and augmented on streaming data, and then tested in parallel against existing models.
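As an illustration of the "golden copy" idea, here is a minimal Python sketch, not anything taken from MapR's products. It assumes the master stream can be modeled as an immutable tuple of events; the names EVENTS, build_view, balance, and activity_count are hypothetical. Each materialized view is derived by replaying the same events, so a view can be rebuilt at any time without touching the source data.

```python
# Hypothetical "golden copy": an immutable, ordered record of events.
EVENTS = (
    {"account": "a", "amount": 100},
    {"account": "b", "amount": 40},
    {"account": "a", "amount": -30},
)


def build_view(events, apply):
    """Replay every event, in order, into a fresh materialized view."""
    view = {}
    for event in events:
        apply(view, event)
    return view


def balance(view, event):
    """View logic 1: running balance per account."""
    view[event["account"]] = view.get(event["account"], 0) + event["amount"]


def activity_count(view, event):
    """View logic 2: number of events seen per account."""
    view[event["account"]] = view.get(event["account"], 0) + 1


# Two different views derived from the same unmodified stream.
balances = build_view(EVENTS, balance)
counts = build_view(EVENTS, activity_count)
```

If the balance logic turned out to contain a coding error, the fix would be to correct the function and call build_view again; the events themselves are never edited.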
MapR talks frequently about “message-driven architectures.” What does that mean specifically?
“Message-driven architectures” are described in several ways, such as “streams-first” and “event-driven.” An event-driven architecture views data as a collection of ordered data points that make up a stream. The ideal way of handling streaming data is with a publish-subscribe framework in which data points are queued in streams by “publishers” and then can be read in order by any number of “consumers.” For an event-driven architecture to support a broad range of applications, the publish-subscribe system needs to be fast, scale easily, replicate globally, and ensure reliability. We build stream processing into a converged data platform. This dramatically simplifies development and makes possible real-time applications that rely on complex flows of data with automated adjustments.
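To make the publisher and consumer roles concrete, the following is a small in-memory Python sketch of the publish-subscribe pattern described above. It is an assumption-laden toy, not the MapR Streams API: the Stream class, its publish and poll methods, and the consumer names are all hypothetical, and a real system would add persistence, partitioning, and replication.

```python
from collections import defaultdict


class Stream:
    """Toy append-only stream; a hypothetical stand-in for a real pub-sub system."""

    def __init__(self):
        self._log = []                    # ordered, append-only events
        self._offsets = defaultdict(int)  # next read position per consumer

    def publish(self, event):
        """Append an event and return its position in the stream."""
        self._log.append(event)
        return len(self._log) - 1

    def poll(self, consumer, max_events=10):
        """Return the next events for this consumer, in publish order."""
        start = self._offsets[consumer]
        batch = self._log[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch


stream = Stream()
for reading in (21.5, 22.0, 22.4):
    stream.publish({"sensor": "s1", "temp": reading})

# Each consumer tracks its own position, so neither affects the other.
alerts = stream.poll("alerting")
archive_first = stream.poll("archiver", max_events=2)
archive_rest = stream.poll("archiver")
```

Because reading does not remove events from the log, any number of consumers can process the same data at their own pace, which is what lets new services be added without disturbing existing ones.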
What technologies in the market should IT professionals investigate for microservices deployments?
The key components start with a fast, scalable, and reliable publish-subscribe system. A distributed store is necessary for enabling persistence, especially for advanced, stateful microservices. And access to a wide variety of compute engines is important for identifying the right tool for a given microservice’s main purpose. MapR provides all these components in a single platform with MapR Streams for the streamed data, MapR-FS and MapR-DB for persistence, and the ecosystem of tools in Apache Hadoop and Apache Spark for large-scale processing. MapR is working with technology partners like Cask and StreamSets to further enhance microservices-based deployments with productivity tools that make it easier to build, share and deploy microservices.
What are some industries that benefit from a microservices approach?
Microservices will benefit any industry because they promote the agility and time to value that all organizations seek from their big data. This includes telecommunications, financial services, retail and ad/media, among others. Solution categories that especially benefit from a microservices architecture include any SaaS deployment, machine learning environments, predictive analytics, and real-time analytics.
About Jack Norris, SVP, Data & Applications, MapR:
Jack drives understanding and adoption of new applications enabled by data convergence. With over 20 years of enterprise software marketing experience, he has demonstrated success from defining new markets for small companies to increasing sales of new products for large public companies. Jack’s broad experience includes launching and establishing analytic, virtualization, and storage companies and leading marketing and business development for an early-stage cloud storage software provider. Jack has also held senior executive roles with EMC, Rainfinity (now EMC), Brio Technology, SQRIBE, and Bain and Company. Jack earned an MBA from UCLA Anderson and a BA in Economics with honors and distinction from Stanford University.