This book uses Hadoop as an example of a multi-tenant distributed system. Hadoop is an ideal example because it is broadly adopted across a variety of industries, from healthcare to finance to transportation. Because it is open source and backed by a robust ecosystem of supporting applications, its adoption is increasing among small and large organizations alike.
Hadoop is also an ideal example because it runs in highly multi-tenant production deployments (with jobs from many hundreds of developers) and is often used to run large batch jobs, real-time stream processing, interactive analysis, and customer-facing databases simultaneously. As a result, it suffers from all of the performance challenges described in this book.