I joined Pepperdata to help scale the company in both the technology and business dimensions. Pepperdata’s early success with enterprises deploying Hadoop is proof that the market for systems that deliver predictable application performance for distributed computing is beginning to take off.

Organizations now have access to more data than ever before, from an increasing number of sources, and big data has fundamentally changed the way that information is managed. The promise of big data is the ability to make sense of these many sources by using real-time and ad hoc analysis to derive time-critical business insights, enabling organizations to become smarter about their customers, operations, and overall business. As volumes of business data increase, organizations are rapidly adopting distributed systems to store, manage, process, and serve big data for use in analytics, business intelligence, and decision support.

Beyond the world of big data, the use of distributed systems for other kinds of applications has also grown dramatically over the past several decades. Examples include physics simulations for automobile and aircraft design, computer graphics rendering, and climate simulation.

Unfortunately, fundamental performance limitations of distributed systems can prevent organizations from achieving the predictability and reliability needed to realize the promise of large-scale distributed applications in production, especially in multi-tenant and multi-workload environments. Predictable performance is more critical than ever because, for most businesses, information is the competitive edge needed to survive in today’s data-driven economy.

Even in the largest organizations, computing resources are finite, and as computational demands increase, bottlenecks can arise in many areas of the distributed system environment. The timely completion of a job requires a sufficient allocation of CPU, memory, disk, and network to every component of that job. As the number and complexity of jobs grow, so does the contention for these limited computing resources. Furthermore, the availability of individual computing resources can vary wildly over time, radically increasing the complexity of scheduling jobs. Business-visible symptoms of the resulting performance problems include under- or over-utilized hardware (sometimes both at the same time), jobs completing late or unpredictably, and even cluster crashes.

While manual performance tuning will always have a role, many challenges encountered in today’s distributed system environments are far too complex to be solved by people power alone. A more sophisticated solution, based on software and machine learning, is necessary to achieve predictability and reliability.

Pepperdata meets this need; its software actively governs and guarantees consistent peak performance of distributed systems in production. By guaranteeing stable and reliable cluster performance, Pepperdata allows enterprises to realize untapped value from existing distributed infrastructures and finally apply big data to more use cases to meet business objectives.