Code Analyzer uses a code-centric approach that presents the developer with a code block and correlates it with the timeline of cluster resources consumed during execution. This enables developers to pinpoint specific segments of code and stages that require optimization.
For example, if an application consumes a lot of CPU while two stages run in parallel, the Apache Spark UI alone cannot tell you which stage is responsible. Because Code Analyzer overlays a time-series view of resource consumption on the parallel stages, it becomes almost trivial to see which stage is driving the high CPU usage.
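The attribution problem can be sketched outside of Spark entirely. The following plain-Python illustration (threads standing in for parallel stages, an assumption for demonstration only, not Code Analyzer's implementation) shows why an aggregate CPU number cannot say which parallel workload is expensive, while a per-worker breakdown, analogous to a per-stage time series, can:

```python
# Two CPU-bound "stages" running in parallel: the process-wide CPU
# total cannot be attributed to either one, but per-thread CPU time
# (analogous to per-stage metrics) makes the culprit obvious.
import threading
import time

def busy(n):
    # CPU-bound loop standing in for a stage's work
    total = 0
    for i in range(n):
        total += i * i
    return total

per_thread = {}

def run_stage(name, n):
    start = time.thread_time()  # CPU time consumed by this thread only
    busy(n)
    per_thread[name] = time.thread_time() - start

t1 = threading.Thread(target=run_stage, args=("stage_a", 2_000_000))
t2 = threading.Thread(target=run_stage, args=("stage_b", 8_000_000))
proc_start = time.process_time()  # CPU time for the whole process
t1.start(); t2.start(); t1.join(); t2.join()
proc_total = time.process_time() - proc_start

# proc_total alone cannot say which "stage" was expensive;
# the per-thread breakdown can.
print(f"process CPU: {proc_total:.2f}s")
for name, cpu in sorted(per_thread.items()):
    print(f"{name}: {cpu:.2f}s")
```

Here `stage_b` does four times the work of `stage_a`, so the per-thread view immediately identifies it, while the process total is just one opaque number.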
With the Apache Spark UI or other tools, it is very difficult to understand variance in application runtime performance, because none of them provides the full context of what else is running on the cluster. Code Analyzer shows the “cluster weather” so that developers can tell whether a performance variation was caused by their application or by the environment in which it ran.