Six Key Takeaways from Spark Summit and DataWorks Summit

It’s been an extremely busy, but very productive, couple of weeks. Pepperdata sponsored two critically important events attended by many of our clients and target customers.

Thousands of developers, data engineers, data scientists and business professionals attended each event. Spark Summit is dubbed the world’s largest event for the Apache Spark community, and DataWorks Summit is one of the industry’s premier big data community events. Combined, these events offer the opportunity to participate in hundreds of formal presentations as well as ...

Solving Performance Bottlenecks for Developers—Introducing Pepperdata Code Analyzer for Spark

Two years ago, IBM made a $300 million commitment to Spark. The investment shows how Spark has grown from its humble beginnings at UC Berkeley as an AMPLab project to a production-grade open source project. More importantly, it demonstrates an urgency around Spark as it sees continued enterprise adoption.

I witnessed Spark’s momentum firsthand as one of its earliest adopters. I first came across Spark back in 2011 as a founder at ...

The Big Data Insiders: Q & A with Ian O’Connell

As discussed in our last blog, Apache Spark has become the big thing in Big Data. As the largest open source community in Big Data, Spark enables flexible in-memory data processing and advanced analytics on the Hadoop platform. The roster of companies that have adopted Spark and also are project committers includes some of tech’s biggest names, such as Facebook, Netflix, Yahoo, IBM and Intel.

In the first installment of “The Big Data Insiders” — interviews with Big Data experts and ...

Pepperdata Digs into Production Spark with the Experts

Apache Spark has quickly become an open source framework with widespread appeal for large-scale data processing. Last year, more than 1,000 organizations were using Spark in production. Many run Spark on clusters of thousands of nodes and up to petabytes of data, according to the project’s FAQ. Another indicator of its growth: according to Indeed, Spark-related job postings are up more than 10x over the past two years.

But beyond the impressive numbers, what users say about production Spark ...

Performance is the Difference Between Business Critical and Business Useless – Introducing Pepperdata Application Profiler

Big Data systems are, by their very nature, complex entities because everything about them is super-sized: the data volume, the number of machines, and the velocity of the data. As a result, building production Big Data systems is difficult. It requires developing new intuition about both development and operations. It also requires the realization that performance is paramount.

Over the past several years, Pepperdata has been helping our customers integrate a deep understanding of ...

Introducing the Pepperdata Technology Advisory Board

Today we announced our all-star Technology Advisory Board, an exclusive network of forward-thinking experts coming together to help shape the future direction of Pepperdata. The Board is made up of four industry and academic luminaries whom I greatly respect and admire. Our new board members not only understand Big Data, but also know how to build systems at scale.

Bringing backgrounds in designing, building and managing systems at scale for enterprises such as BlackRock, Credit Suisse, Stripe, Twitter and VMware, as ...

We just made Amazon EMR up to 4x faster (for the same amount of money)

The popularity of Amazon Elastic MapReduce (EMR) has soared over the last couple of years thanks to its ability to simplify big data processing. EMR provides a managed Hadoop framework that makes it easy, fast, and cost-effective for customers to distribute and process vast amounts of data – but all good things can be improved upon.

Today we are announcing a new offering that enables customers using Amazon EMR to run jobs up to four times faster ...

Get Control of Your Cluster

We’ve heard it time and time again: one of the biggest challenges for Hadoop operators is too much time spent troubleshooting. Whether it takes way too long to put out figurative fires on your Hadoop cluster, or you spend forever trying to determine the root cause of performance issues, or you can’t for the life of you work out why jobs are running slowly – troubleshooting is a constant drain on time and ...

Join Comcast, Forrester, and others to learn how Big Data is transforming the face of Financial Services

Delivering superior customer service requires a colossal magnitude of data coupled with a vast array of internal and external data sources in order to support key initiatives like compliance reporting, security fraud detection, and “next action” scenarios for customer service. The most innovative leaders in financial services are seeking ways to leverage big data to identify new revenue streams and maintain crucial procedures, while also keeping the operational costs of distributed computing low. Pepperdata has worked with several Fortune ...

Doctor’s Orders: Get a Hadoop Health Check and Stop Dealing with Cluster Flux Today!

As the experts in distributed computing performance, Pepperdata is keenly aware of the challenges all organizations face when running Hadoop in production. Getting your cluster to run optimally is no easy feat, which is why it’s imperative to understand the underlying cause of performance issues so that they can be corrected, or better yet, averted altogether.

Feeling fluxed?

If you’re wondering whether or not your organization is experiencing a bad case of what we like to call ...
