Can you believe that humanity has created 90% of today’s data within the last two years alone, at a rate of 2.5 quintillion bytes per day? While this data comes mainly from the internet, including social media, web searches, text messages, and media files, IoT devices and sensors are becoming a significant source as well. These are the key drivers of the global big data market, which reached $42 billion in 2018 and is forecast by Statista/Wikibon to grow to $64 billion by 2021.

Big Data market size, based on revenue, from 2011 to 2027 ($U.S. billion)

Source: Wikibon; SiliconANGLE, Statista, 2019

Big data is a key driver of overall growth in stored data. According to Cisco’s latest Global Cloud Index report, the volume of stored big data will reach 403 exabytes (EB) by 2021, up almost 8-fold from 51 EB in 2016. Big data alone will represent 30 percent of data stored in data centers by 2021, up from 18 percent in 2016. Big data is defined here as data deployed in a distributed processing and storage environment, such as Hadoop, Spark, or NoSQL clusters. SQL-on-Hadoop engines from vendors like IBM and Oracle, which can support massive databases, have grown in popularity, adding to the growth of the big data ecosystem.
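As a quick sanity check on the figures above, the short Python sketch below works out the growth factor between the two quoted volumes. The implied annual growth rate is our own illustrative derivation from those two data points, not a number taken from the Cisco report.

```python
# Stored big data volumes quoted in the text (in exabytes).
stored_2016_eb = 51
stored_2021_eb = 403
years = 2021 - 2016

# Overall growth factor across the five-year period.
growth_factor = stored_2021_eb / stored_2016_eb

# Implied compound annual growth rate, assuming smooth year-over-year growth
# (an assumption made for illustration only).
implied_annual_growth = growth_factor ** (1 / years) - 1

print(f"Growth factor: {growth_factor:.1f}x")
print(f"Implied annual growth: {implied_annual_growth:.0%}")
```

A factor of roughly 7.9 is consistent with the "almost 8-fold" description.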

So big data is, well, bigger than ever. The world is powered by big data and will be for the foreseeable future. But the relentless growth in big data is a double-edged sword. While businesses are deriving tremendous insights from analyzing large data sets, development teams are dealing with ever more resource-hungry workloads. Because of these pressures, application performance management (APM) has become an essential element in today’s big data ecosystem. An examination of big data deployments in key verticals shows us what’s keeping developers busy and why APM is a must-have.

Financial services

The banking industry is leading the growth in big data and business analytics services, whose revenues will surpass $205 billion by 2020, according to the European Banking Federation. Banks are embracing data scientists and integrating advanced tools running on artificial intelligence and cloud computing to structure data more efficiently. The list of financial services that benefit from data analytics is seemingly endless: lending, saving, investing, compliance, fraud monitoring, due diligence, customer insights, anti-money laundering, credit scoring, and payments. For example, because credit cards produce so much data and can quickly fall into the wrong hands, fraud has become rampant. Big data and machine learning are helping to police this illegal activity, often stopping fraud before it starts. In many cases, you’ll receive a notification on your smartphone asking whether you are indeed the one making the purchase. If not, you can halt the transaction and start the process of regaining your financial privacy and security.

Healthcare

Controlling health care costs while striving to improve the quality of care is the greatest challenge facing the healthcare sector. A recent survey from Black Book Research found that 93% of hospital and physician financial executives are actively seeking ways to use big data analytics to link care with patient outcomes. To that end, many are using big data analytics as a foundation for clinical documentation improvement (CDI). CDI is the process of enhancing medical data collection to maximize claims reimbursement revenue and improve care quality. It’s considered a cornerstone of data quality, accurate reporting, fraud reduction, and robust public health information tracking. Organizations like the College of Healthcare Information Management Executives (CHIME) and the Healthcare Information and Management Systems Society (HIMSS) continue to focus on achieving better health care through information and technology.

Retail

We all have a digital footprint and, believe it or not, almost everything we do online can be analyzed, quantified, and used to track consumer trends and behaviors and to develop insights that help retailers reach out to us on an engaging, personal level. By drawing on big data-based insights into customer habits, retailers can understand which of their products and services are most in demand and which ones they should potentially stop offering. This not only helps reduce overhead, but also shows retailers where to invest and helps them give consumers exactly what they want. Trend forecasting algorithms are also helping retailers make key market predictions and anticipate consumer trends. With access to insights on real-time customer transactions, retailers get a better understanding of which prices yield the best results on particular products. They are also better able to manage their supply-chain logistics. Retail giant Walmart has reaped the rewards of real-time merchandising. Big data technology can also be used for “markdown optimization”, that is, determining when prices on particular items should be dropped.

More Data = More Challenges

While these trends reflect advances in data analytics and remarkable growth in adoption, they also mean greater challenges for development teams responsible for managing applications, scheduling jobs, and driving analytics initiatives for their organizations. Developers need to clearly understand the performance metrics of their applications to meet SLAs, avoid failures, improve efficiency, and monitor resource capacity.
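To make the idea of tracking performance against SLAs concrete, here is a minimal Python sketch that flags job runs that have exceeded, or are close to exceeding, an agreed runtime. The job names, runtimes, and the 90% warning threshold are all hypothetical values chosen for illustration, and the code does not represent the API of any particular APM product.

```python
from dataclasses import dataclass


@dataclass
class JobRun:
    name: str          # hypothetical job name
    runtime_s: float   # observed wall-clock runtime, in seconds
    sla_s: float       # agreed maximum runtime, in seconds


def flag_sla_risks(runs, warn_ratio=0.9):
    """Return (job name, status) pairs for runs that breach or approach their SLA."""
    flagged = []
    for run in runs:
        ratio = run.runtime_s / run.sla_s
        if ratio > 1.0:
            flagged.append((run.name, "breach"))
        elif ratio >= warn_ratio:
            flagged.append((run.name, "at risk"))
    return flagged


# Made-up sample data, for illustration only.
runs = [
    JobRun("nightly_etl", runtime_s=3400, sla_s=3600),
    JobRun("fraud_scoring", runtime_s=700, sla_s=600),
    JobRun("weekly_report", runtime_s=1200, sla_s=3600),
]

flagged = flag_sla_risks(runs)
for name, status in flagged:
    print(f"{name}: {status}")
```

In practice, the runtimes would come from cluster or scheduler metrics rather than a hard-coded list, and the warning threshold would be tuned to how much headroom each workload needs.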

Pepperdata Application Spotlight™ is a self-service APM solution that provides developers with a holistic and real-time view of their applications in the context of the entire big data cluster, allowing them to quickly identify and fix problems (failed Spark applications, for instance) to improve application runtime, predictability, performance and efficiency. Application Spotlight also provides automatic tuning for applications, delivers job-specific recommendations, and enables users to set up alerts on specific behaviors and outcomes to avoid the risk of failure.