“Queries are a significant portion of our customers’ big data workloads, so we know the performance of these workloads is critical. IT and applications teams can now get visibility into their Hive and Impala queries in one place, compare the runs of their queries and take advantage of the recommendations Query Spotlight provides,” says Ash Munshi, CEO, Pepperdata. “We’re confident Query Spotlight can increase the performance of their Impala queries while helping them decrease overall costs.”

Are your Apache Impala queries running slowly and falling short of peak performance? Given Impala’s complexity, troubleshooting can be very difficult, and optimizing query performance is nearly impossible without the right tools. Good news: Pepperdata Query Spotlight now supports Apache Impala.

Query Spotlight makes it easy for operators and developers to understand the detailed performance characteristics of their Hive queries and workloads, together with the infrastructure-wide issues that affect them. With the addition of Impala support, Impala query workloads can now be tuned, debugged, and optimized in the same way for better performance and reduced costs.

The Apache Impala Advantage

Apache Impala is an open-source MPP (massively parallel processing) SQL query engine built to process large volumes of data. Impala delivers high performance and low latency compared with other popular SQL engines for Hadoop.

Impala utilizes standard Hadoop components including HBase, HDFS, YARN, Sentry, and the Hive Metastore. This integration gives Impala users familiar SQL support along with the flexibility and scalability of Apache Hadoop. With Impala, you can query data stored in HDFS at low latency using traditional SQL knowledge. You can also access data stored in Amazon S3 and HBase, even without Java knowledge.

Query Spotlight: Visibility into Apache Impala

Query Spotlight for Impala gives developers and operators the big picture of their platform performance and helps them reduce operating costs. It surfaces detailed statistics, query plans, a breakdown of each query’s duration, and more. Query Spotlight also provides visibility into Impala databases and tables. The recommendation engine offers both system-level and query-level recommendations, including recommendations for joins.

In addition to visualizing detailed query information on resource utilization and database views, Query Spotlight enables Impala users to create and receive alerts about queries, remediate issues, and optimize query performance.

Query Spotlight enables developers to:

  • See Impala-specific SQL query planning and execution information.
  • Rapidly ascertain query plan problems.
  • Analyze Impala query performance.
  • Identify bottlenecks that contribute to slow queries.
  • Speed up time to resolution.

Operators can quickly narrow down problematic queries in a multi-user environment and use query performance insights to optimize cluster resources and improve productivity. To summarize, Impala support in Query Spotlight brings the following benefits:

  • Visibility into all your Hive and Impala queries in one place in a similar format
  • Recommendations to improve the performance of your queries
  • Comparison of query runs with chargeback reports

More than one third of IT spend goes to troubleshooting, performance, and availability. On top of that, 80% of organizations exceed their big data budgets. Inefficient queries are a big part of the problem, causing missed SLAs and slowing database performance. Query Spotlight for Impala helps address these problems.