Performance Optimization of Hadoop Using InfiniBand RDMA

DK Panda

“The Hadoop framework has become the most popular open-source solution for Big Data processing. Traditionally, Hadoop communication calls are implemented over sockets and do not deliver best performance on modern clusters with high-performance interconnects. This talk will examine opportunities and challenges in optimizing performance of Hadoop with Remote DMA (RDMA) support, as available with InfiniBand, RoCE (RDMA over Converged Enhanced Ethernet) and other modern interconnects.”

New Survey from GE and Accenture Finds Growing Urgency for Big Data Analytics

A new global study, “Industrial Internet Insights for 2015,” from GE (NYSE: GE) and Accenture (NYSE: ACN) reveals there is a growing urgency for organizations to embrace big data analytics to advance their Industrial Internet strategy.

Apache Spark Beats the World Record for Fastest Processing of Big Data

Databricks, the company founded by the creators of the popular open-source Big Data processing engine Apache Spark, announced today that it has broken the world record for GraySort, a third-party industry benchmarking competition for sorting large on-disk datasets.

LexisNexis Launches HPCC Systems® Developer Contest

LexisNexis® Risk Solutions has announced its inaugural HPCC Systems Developer Contest. Developers and other technical professionals have the opportunity to demonstrate how they leveraged HPCC Systems to solve either a Big Data or Complex Query problem.

Types of In-Memory Computing

In this installment we’ll set the stage for in-memory computing technology in terms of its current state as well as its next stage of evolution. We’ll begin with a discussion of the capabilities of in-memory databases (IMDBs) and in-memory data grids (IMDGs), and show how they differ. We’ll finish up the section by demonstrating how neither one is sufficient for a company’s strategic move to IMC; instead, we will explain why a comprehensive in-memory data platform is needed.

Predictive Modeling and Production Deployment

Using predictive analytics involves understanding and preparing the data, defining the predictive model, and following the predictive process. Predictive models can assume many shapes and sizes, depending on their complexity and the application for which they are designed. The first step is to understand what questions you are trying to answer for your organization.

Spark Panel Discussion with Cloudera, MapR & Pivotal

This panel discussion video comes from the Los Angeles Spark Users Group. The talk is a lively discussion of Spark’s initial goals, where it came from, and what the future holds for it. Many leading Big Data vendors are introducing Spark’s capabilities into their architectures. The panel brings together the top Hadoop distribution vendors: Cloudera, MapR, and Pivotal.

Prelert Introduces Real-Time Analysis of Complex Anomalies in Big Data Sets

Prelert, the anomaly detection company, has announced a new feature of its Anomaly Detective machine learning engine that enables multidimensional analysis to be conducted on large volumes of data at speeds never before possible. This new feature, Stats Reduce, dramatically shrinks data transfer sizes, making it possible to perform complex behavioral analysis on terabytes of data per hour.

Alteryx Secures $60 Million in Funding for Data Blending and Advanced Analytics

Alteryx, Inc., the leader in data blending and advanced analytics, today announced a $60 million investment led by Insight Venture Partners with participation from existing investors SAP Ventures and Toba Capital. This new investment is in response to the significant growth Alteryx has experienced in the last year, with over 200 percent growth in its customer base.

Credit Scoring and Back Trading/Testing

This article is the third in an editorial series that aims to give enterprise thought leaders direction on leveraging big data technologies in support of analytics proficiencies designed to work more independently and effectively in today’s climate of increasing the value of corporate data assets.