FlexPod Select with Hadoop


FlexPod Select with Hadoop delivers enterprise-class Hadoop with validated, pre-configured components for fast deployment, higher reliability, and smoother integration with existing applications and infrastructure. These technical reference architectures optimize storage, networking, and servers with Cloudera and Hortonworks distributions of Hadoop.

A Mathematical Model for Murder


In a recent paper published in PLoS ONE, two UC Irvine mathematicians, Dominik Wodarz and Natalia Komarova, describe an elegant mathematical model to answer just these questions.

How Much to Raise Using Crunchbase Data


Raising capital for a shiny new start-up company is a daunting task: you have to shore up interest from funding sources like angels and VCs, produce a compelling “pitch deck,” and stack your management team with the right people. But the elephant in the room is always the same: how much to raise? Entrepreneur Jamie Davidson recently put some science (data science, that is) behind this very important question.

Predicting the Popularity of a Tweet


As social media becomes increasingly important as a data source for machine learning, a brand new method for analyzing the Twitter microblogging platform is very compelling. Tauhid Zaman, assistant professor at MIT’s Sloan School of Management, developed a probabilistic model for the spread of an individual tweet in the Twitterverse.

NetApp – Forrester Total Economic Impact Study


NetApp commissioned Forrester Consulting to examine the total economic impact and potential return on investment (ROI) enterprises may realize by deploying the NetApp Distributed Content Repository solution running StorageGRID software with E-Series hardware.

Enterprise Strategy Group Evaluates Hadapt

Enterprise Strategy Group (ESG), a leading analyst firm, recently performed a hands-on evaluation of the Hadapt Adaptive Analytical Platform for big data.

Information Visualization


Information visualization is an increasingly important element of big data, as it is the technology best able to convey the message emanating from the data. Here is a nice paper on the topic, “Infovis and Statistical Graphics: Different Goals, Different Looks” (pdf), by Andrew Gelman (Professor of Statistics at Columbia University) and Antony Unwin.

ESG Lab Validation Report – NetApp Open Solution for Hadoop


Learn why NetApp Open Solution for Hadoop is better than clusters built on commodity storage. This ESG lab report details the reasons why NetApp’s use of direct-attached storage for Hadoop improves performance, scalability, and availability compared to typical internal hard drive Hadoop deployments.

$16.1 Billion Big Data Market: 2014 Predictions From IDC And IIA


As 2013 draws to a close, it is time for industry analysts and pundits to present their assessments of the year in order to predict where Big Data is headed in 2014. Forbes recently came out with a useful summary of predictions from IDC and IIA which could serve as a balanced road map for […]

Machine Learning Research at arXiv.org


I’d like to acquaint you with a tremendous resource for keeping current with the latest research in the field of machine learning. Informally known as the pre-print server, arXiv.org is the global repository for the fields of Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance, and Statistics.