A Mathematical Model for Murder


In a recent paper published in PLoS ONE, two UC Irvine mathematicians, Dominik Wodarz and Natalia Komarova, describe an elegant mathematical model to answer just these questions.

How Much to Raise Using Crunchbase Data


Raising capital for a shiny new start-up company is a daunting task, what with shoring up interest from funding sources like angels and VCs, producing a compelling “pitch deck,” and stacking your management team with the right people. But the big elephant in the room is always: how much to raise? Entrepreneur Jamie Davidson recently put some science (data science, that is) behind this very important question.

Predicting the Popularity of a Tweet


As social media becomes an increasingly important data source for machine learning, a brand new method for analyzing the Twitter microblogging platform is very compelling. Tauhid Zaman, assistant professor at MIT’s Sloan School of Management, developed a probabilistic model for the spread of an individual tweet in the Twitterverse.

NetApp – Forrester Total Economic Impact Study


NetApp commissioned Forrester Consulting to examine the total economic impact and potential return on investment (ROI) enterprises may realize by deploying the NetApp Distributed Content Repository solution running StorageGRID software with E-Series hardware.

Enterprise Strategy Group Evaluates Hadapt

Enterprise Strategy Group (ESG), a leading analyst firm, recently performed a hands-on evaluation of the Hadapt Adaptive Analytical Platform for big data.

Information Visualization


Information visualization is an increasingly important element of big data as it is the technology best able to convey the message emanating from the data. Here is a nice paper “Infovis and Statistical Graphics: Different Goals, Different Looks” (pdf) by Andrew Gelman (Professor of Statistics at Columbia University) and Antony Unwin that discusses the topic of information visualization.

ESG Lab Validation Report – NetApp Open Solution for Hadoop


Learn why NetApp Open Solution for Hadoop is better than clusters built on commodity storage. This ESG lab report details the reasons why NetApp’s use of direct attached storage for Hadoop improves performance, scalability and availability compared to typical internal hard drive Hadoop deployments.

$16.1 Billion Big Data Market: 2014 Predictions From IDC And IIA


As 2013 draws to a close, it is time for industry analysts and pundits to present their assessments of the year in order to predict where Big Data is headed in 2014. Forbes recently came out with a useful summary of predictions from IDC and IIA which could serve as a balanced road map for […]

Machine Learning Research at arXiv.org


I’d like to acquaint you with a tremendous resource for keeping current with the latest research in the field of machine learning. Informally known as the pre-print server, arXiv.org is the global repository for the fields of Physics, Mathematics, Computer Science, Quantitative Biology, Quantitative Finance and Statistics.

Free 2013 Data Miner Survey


As 2013 draws to a close, a number of year-end surveys are coming out to assess the progress in our industry. I enjoy going through these results to get a pulse on big data and how it’s being received in the business community. Here is one valuable survey published annually for free: Rexer Analytics 6th Data Miner Survey for 2013.