Ask a Data Scientist: The Bias vs. Variance Tradeoff

Monica Martinez-Canales, PhD. Intel

This week’s Ask a Data Scientist question is from a reader who wants an explanation of the “bias vs. variance tradeoff in statistical learning.” Intel’s Dr. Monica Martinez-Canales is this week’s guest Data Scientist.

Enterprise Grade Lustre in the Clouds

With the release of Intel® Cloud Edition for Lustre software in collaboration with key cloud infrastructure providers like Amazon Web Services (AWS), commercial customers have an ideal opportunity to employ a production-ready version of Lustre—optimized for business HPDA—in a pay-as-you-go cloud environment.

Big Data for Finance – Security and Regulatory Compliance Considerations

This article is the fifth and last in an editorial series intended to give enterprise thought leaders direction on leveraging big data technologies to support analytics proficiencies that work more independently and effectively as organizations seek to increase the value of their corporate data assets.

Ask a Data Scientist: Curse of Dimensionality

Welcome back to our series of articles sponsored by Intel – “Ask a Data Scientist.” Once a week you’ll see reader-submitted questions of varying levels of technical detail answered by a practicing data scientist – sometimes by me and other times by an Intel data scientist. This week’s question is from a reader who wants to know more about the “curse of dimensionality.”

The Analytics Frontier of the Hadoop Eco-System

Ted Wilkie

“The Hadoop MapReduce framework grew out of an effort to make it easy to express and parallelize simple computations that were routinely performed at Google. It wasn’t long before libraries, like Apache Mahout, were developed to enable matrix factorization, clustering, regression, and other more complex analyses on Hadoop. Now, many of these libraries and their workloads are migrating to Apache Spark because it supports a wider class of applications than MapReduce and is more appropriate for iterative algorithms, interactive processing, and streaming applications.”

Adopting Big Data for Finance

This article is the fourth in an editorial series intended to give enterprise thought leaders direction on leveraging big data technologies to support analytics proficiencies that work more independently and effectively as organizations seek to increase the value of their corporate data assets.

Lustre 101

This week’s Lustre 101 article looks at the history of Lustre and the typical configuration of this high-performance, scalable storage solution for big data applications.

Credit Scoring and Back Trading/Testing

This article is the third in an editorial series intended to give enterprise thought leaders direction on leveraging big data technologies to support analytics proficiencies that work more independently and effectively as organizations seek to increase the value of their corporate data assets.

Interview: Replacing HDFS with Lustre for Maximum Performance

Gabriele Paciucci

“When organizations operate both Lustre and Apache Hadoop within a shared HPC infrastructure, there is a compelling use case for using Lustre as the file system for Hadoop analytics, as well as HPC storage. Intel Enterprise Edition for Lustre includes an Intel-developed adapter which allows users to run MapReduce applications directly on Lustre. This optimizes the performance of MapReduce operations while delivering faster, more scalable, and easier to manage storage.”

Intel Steps up with Enterprise Edition for Lustre Software

Brent Gorda

In this video from the 2014 Lustre Administrators and Developers Conference, Brent Gorda of Intel describes how the company is adding enterprise features to the Lustre file system.