This month marks the two-year anniversary of the Apache Hadoop community’s decision to decouple YARN from MapReduce and promote it as a separate sub-project of Apache Hadoop. YARN effectively opens up the Hadoop platform to new applications and modes of processing beyond MapReduce.
Hadoop 1 featured HDFS as its data-storage layer and MapReduce as its data-processing layer. Although MapReduce has proven to be extremely powerful, there are a number of use cases for which this batch-oriented technology is insufficient. YARN is the centerpiece of Hadoop 2: a generic resource-management and distributed-application framework that allows multiple data-processing technologies, such as Apache Spark, Storm, Solr, and Cloudera Impala, to run side by side with MapReduce in a cluster.
Adoption of YARN is growing quickly. During a keynote session at the Hadoop Summit in June 2013, Bruno Fernandez-Ruiz, Senior Fellow & VP Platforms at Yahoo, noted that many of Yahoo’s production applications were already running on YARN. In a keynote presentation at the 2014 Hadoop Summit this summer, Hortonworks cofounder Arun Murthy estimated that more than 50% of Hortonworks’ installed-base customers are running YARN.
We’re seeing tremendous enthusiasm for YARN among our customers and prospects in a wide range of industries and across the spectrum of Hadoop distros. The rapidly expanding set of use cases enabled by YARN is leading companies to run more diverse workloads on their clusters. While this is an extremely positive development, it also increases resource contention, which can lead to missed SLAs.
At Pepperdata, we’re excited to support YARN and to meet the growing demand for reliable production clusters running Hadoop 2 with YARN. As the summer winds down and we move into fall, we’ll continue to provide updates and perspectives (both in this blog and at the Meetups we frequently host in the Bay Area) on how YARN is evolving in production environments and what that means for Hadoop operators.