What is Scalability in Cloud Computing

Our previous post, Empower Shared Services Chargeback Models to Generate Better Business Outcomes, argues that deep, timely, accurate, and accessible platform usage data empowers shared services chargeback models to generate better business outcomes. This post goes a level deeper on cost controls and data placement. It describes the challenges of multi-temperature data management as they apply to financial services, and how to take a simple next step: reviewing your data temperature today.

What is Data Temperature?

Data temperature is an expressive way to describe how often data is used. Hot data is accessed most frequently and cold data is seldom used, while warm and cool data fall in between. Fast data solutions are expensive, so priorities need to be set across enterprise-wide big data platforms based on how hot or cold the data is. The purpose of multi-temperature data management is to ensure the most cost-effective allocation of primary storage (fast) and secondary storage (not so fast).
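
To make the idea concrete, here is a minimal sketch of temperature-based tiering in Python. The dataset names, access-frequency thresholds, and storage mapping are assumptions invented for illustration, not prescriptions from any particular platform.

```python
from dataclasses import dataclass

@dataclass
class Dataset:
    name: str
    reads_per_day: float  # average accesses over a trailing window

def temperature(ds: Dataset) -> str:
    """Bucket a dataset by access frequency (hypothetical cutoffs)."""
    if ds.reads_per_day >= 1_000:
        return "hot"
    if ds.reads_per_day >= 100:
        return "warm"
    if ds.reads_per_day >= 1:
        return "cool"
    return "cold"

# Route each tier to the cheapest storage that still meets its latency needs.
TIER_TO_STORAGE = {
    "hot": "in-memory / NVMe (primary)",
    "warm": "SSD (primary)",
    "cool": "HDD (secondary)",
    "cold": "object or archive storage (secondary)",
}

for ds in [Dataset("tick_data", 50_000),
           Dataset("treasury_eom_report", 2),
           Dataset("customer_archive", 0.01)]:
    tier = temperature(ds)
    print(f"{ds.name}: {tier} -> {TIER_TO_STORAGE[tier]}")
```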

Hot data in financial services requires low-latency network and storage solutions (e.g., in-memory computing and time series databases). This is a critical component of algorithmic (algo) trading in capital markets, where trading decisions are calculated in real time on streaming market data. Hot data is also needed in personal banking scenarios, like payments decisioning, where fraud must be mitigated without inconveniencing the customer.
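
As a hypothetical illustration of why the hot path lives in memory, consider a fraud check that must answer in milliseconds: each account's recent activity is kept in an in-memory structure rather than fetched from disk-backed storage. All names and thresholds below are invented for the sketch.

```python
import time

recent_activity = {}  # account_id -> list of (timestamp, amount)

def record_payment(account_id: str, amount: float) -> None:
    recent_activity.setdefault(account_id, []).append((time.time(), amount))

def looks_fraudulent(account_id: str, amount: float) -> bool:
    """Illustrative rule: flag if spend in the last 60 seconds exceeds a cap."""
    now = time.time()
    recent = [amt for ts, amt in recent_activity.get(account_id, [])
              if now - ts < 60]
    return sum(recent) + amount > 10_000  # hypothetical velocity cap

record_payment("acct-42", 4_000)
print(looks_fraudulent("acct-42", 7_000))  # True: 11,000 in the window
```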

On the other hand, end-of-month treasury reporting is a cooler data access scenario, while the requirement to archive seven years of customer data gets pretty cold.

A holistic view is needed, one that considers a variety of architectures, like microservices, in-memory computing, and HDFS storage best practices, even though those categories can’t be compared apples to apples. Effective multi-temperature data management is only possible if you have detailed knowledge about how data is used across your enterprise. That in turn requires deep, timely, and accurate platform utilization data that is accessible to decision makers.
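
On the HDFS side, the filesystem ships with built-in storage policies (HOT, WARM, and COLD, among others) that control which volume types hold a path’s blocks. Below is a sketch of applying a policy per temperature; the paths and the temperature-to-policy mapping are assumptions for illustration.

```python
import subprocess

TEMPERATURE_TO_HDFS_POLICY = {
    "hot": "HOT",    # all replicas on DISK volumes
    "warm": "WARM",  # replicas split between DISK and ARCHIVE
    "cold": "COLD",  # all replicas on ARCHIVE volumes
}

def set_policy(path: str, temperature: str) -> None:
    policy = TEMPERATURE_TO_HDFS_POLICY[temperature]
    subprocess.run(
        ["hdfs", "storagepolicies", "-setStoragePolicy",
         "-path", path, "-policy", policy],
        check=True,
    )

set_policy("/data/markets/ticks/current", "hot")
set_policy("/data/archive/customer", "cold")
```

Note that setting a policy governs future block placement; the HDFS Mover tool migrates existing blocks to conform to the new policy.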

The Challenges of Multi-Temperature Data Management

It is pretty straightforward to allocate hot data to primary storage and cold data to secondary storage. The challenge of evaluating data storage economics comes when functions overlap, and when warm and cool temperatures change, whether abruptly or over time.

In the algo trading example above, not all of the data an algo uses is necessarily hot. Decisioning on real-time market conditions could refer to an economic data model that is rarely used elsewhere. This is a functional overlap: the reference model is cold, yet it is still important. Identifying a high-value application is not enough because, in this case, both hot and cool data are used.
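
A sketch of why temperature has to be tracked at the dataset level rather than the application level follows; the hypothetical algo-trading app below touches both a hot streaming feed and a cold reference model, so labeling the whole application “hot” would misplace the reference data.

```python
from collections import defaultdict

# app -> dataset -> access count; all names are invented for the sketch
access_counts = defaultdict(lambda: defaultdict(int))

def record_access(app: str, dataset: str, n: int = 1) -> None:
    access_counts[app][dataset] += n

record_access("algo-trader", "market_ticks", 1_000_000)  # streaming, hot
record_access("algo-trader", "econ_reference_model", 3)  # rarely read, cold

for dataset, count in access_counts["algo-trader"].items():
    label = "hot" if count > 10_000 else "cold"  # hypothetical cutoff
    print(f"{dataset}: {count} accesses -> {label}")
```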

With regard to changing data temperatures, consider an economic crisis. High-priority data used to generate revenue can temporarily become secondary to mitigating risk. What was cool yesterday is hot today, but it will revert when the economy stabilizes again.
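
One way to handle this, sketched below with an assumed one-week window and invented thresholds, is to compute temperature over a trailing window, so a dataset that suddenly becomes critical is promoted quickly and decays back once access patterns normalize.

```python
from collections import deque
import time

WINDOW_SECONDS = 7 * 24 * 3600  # one trailing week (assumed)

class RollingTemperature:
    def __init__(self) -> None:
        self.accesses: deque = deque()  # timestamps of reads

    def record(self, ts: float) -> None:
        self.accesses.append(ts)

    def temperature(self, now: float) -> str:
        while self.accesses and now - self.accesses[0] > WINDOW_SECONDS:
            self.accesses.popleft()  # expire accesses outside the window
        reads = len(self.accesses)
        return "hot" if reads >= 1_000 else "cool" if reads >= 10 else "cold"

tracker = RollingTemperature()
now = time.time()
for _ in range(5_000):           # crisis hits: risk data is read heavily
    tracker.record(now)
print(tracker.temperature(now))  # "hot" -- promote to primary storage
```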

To facilitate quick, sound decision making, you need a complete and current picture of how data is being utilized across your big data platform. Below, I refer to Pepperdata