ESG Lab Validation Report – NetApp Open Solution for Hadoop


Learn why NetApp Open Solution for Hadoop is better than clusters built on commodity storage. This ESG lab report details the reasons why NetApp’s use of direct-attached storage for Hadoop improves performance, scalability and availability compared to typical internal hard drive Hadoop deployments.

$16.1 Billion Big Data Market: 2014 Predictions From IDC And IIA


As 2013 draws to a close, it is time for industry analysts and pundits to present their assessments of the year in order to predict where Big Data is headed in 2014. Forbes recently came out with a useful summary of predictions from IDC and IIA which could serve as a balanced road map for […]

Machine Learning Research at arXiv


I’d like to acquaint you with a tremendous resource for keeping current with the latest research in the field of machine learning. Informally known as the pre-print server, arXiv is the global repository for the fields of Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance and Statistics.

Free 2013 Data Miner Survey


As 2013 draws to a close, a number of year-end surveys are coming out to assess the progress in our industry. I enjoy going through these results to get a pulse of big data and how it’s being received in the business community. Here is one valuable survey published annually for free: Rexer Analytics 6th Data Miner Survey for 2013.

Meet the Researcher: Yann LeCun


On Tuesday Facebook announced it hired machine learning pioneer Yann LeCun to run its newly created artificial intelligence lab. Scooping up one of the biggest names in the field is a major move for the company, but it’s not a surprising one. If anything, Facebook is late to enter the data science arms race that’s underway in Silicon Valley and the country as a whole.

The State of Big Data: What the Surveys Say


As a data scientist, I should believe in the value of surveys and other data collection mechanisms. In the case of IT industry surveys, though, I’m not convinced that respondents report their reality accurately while rushing through online surveys. So, taking the results with a grain of salt, I found an intriguing article in Forbes: The State of Big Data: What The Surveys Say.

SciDB – How Linear Algebra Operations Scale


I found a very interesting technical report that shows SciDB’s usefulness for machine learning applications: “SciDB – How Linear Algebra Operations Scale.”

Paper Shows Big Data Fostering Serendipity


Can Big Data be used to foster serendipity? That’s the premise of an award-winning paper in the 2013 Semantic Web Challenge. Entitled “Fostering Serendipity through Big Linked Data,” the paper was written by Muhammad Saleem, Maulik R. Kamdar, Aftab Iqbal, Shanmukha Sampath, Helena F. Deus and Axel-Cyrille Ngonga Ngomo. The amount of bio-medical data available […]

Latest Trends in High Performance Computing Usage and Spending Identified by IDC


The latest International Data Corporation (IDC) worldwide study of high performance computing (HPC) end-user sites is now available. The 2013 study included sites representing 905 HPC systems, nearly double the 488 systems profiled in the previous version of the study.

RESEARCH: MLI – An API for Distributed Machine Learning


MLI is an Application Programming Interface (API) designed to address the challenges of building machine learning algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, distributed algorithms. A new research paper is available on the arXiv pre-print server which describes the new API. MLI […]