Hadoop Buyer’s Guide

Get your complimentary copy of the Hadoop Buyer’s Guide, from Robert D. Schneider, the author of Hadoop for Dummies.

Data Science 101: Interview with John Chambers

Dr. John Chambers, the father of the S language, which ultimately became R, sits down with Professor Trevor Hastie of the Stanford University Statistics Department to discuss the long and fascinating history of the R language.

Johns Hopkins Data Science Specialization


The quantity and quality of data science education resources just took a step forward with the announcement of the new Johns Hopkins University Data Science Specialization series on the Coursera platform. The series consists of 9 free courses. The first course starts in April 2014.

The Future of Computer Science


I am convinced we’re at an important inflection point in the timeline of the discipline of computer science. When compared to other disciplines like mathematics, physics and biology, computer science is a very young field, starting around 1964. But something is happening now, in 2014, that is propelling the field into a new evolutionary period.

Data Science 101: Machine Learning, Part 5

The “How Machine Learning Works” lecture series concludes by developing some machine learning Python code from scratch. We use real-valued numbers sampled from two different Gaussians with different priors.
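
As a rough sketch of the kind of data described above (not the lecture's actual code), the following draws real-valued numbers from two Gaussians with different priors. The priors, means, standard deviations, and the function name `sample_mixture` are illustrative assumptions.

```python
import random


def sample_mixture(n, priors=(0.3, 0.7), means=(-2.0, 3.0), stds=(1.0, 1.5), seed=0):
    """Draw n (value, label) pairs from a two-class Gaussian mixture.

    All default parameters are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        # Pick the class according to its prior probability.
        label = 0 if rng.random() < priors[0] else 1
        # Draw a real-valued number from that class's Gaussian.
        value = rng.gauss(means[label], stds[label])
        samples.append((value, label))
    return samples


data = sample_mixture(1000)
frac_class0 = sum(1 for _, y in data if y == 0) / len(data)
print(f"fraction of samples from class 0: {frac_class0:.2f}")
```

The observed class fraction should land near the assumed prior for class 0, which is a quick sanity check that the priors are doing their job.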

Machine Learning for Display Advertising

Machine learning technologies have made many inroads into the advertising industry, primarily to enable more intelligent buys and placements that deliver a brand message to a selected audience. Here are some compelling slides from a lecture at the New York University Stern School of Business by Foster Provost, Professor of Information Systems: “Machine Learning for Display Advertising.”

Building a Production Machine Learning Infrastructure

The recent Data Day Texas 2014 featured a session given by Josh Wills, Cloudera’s Senior Director of Data Science: “From The Lab To The Factory: Building A Production Machine Learning Infrastructure.”

Data Science 101: Machine Learning, Part 4

The “How Machine Learning Works” lecture series continues by building on top of the Bayesian classifier developed in Part 3 of the series. We’ll build an expectation-maximization (EM) algorithm that locally maximizes the likelihood function.
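
For readers who want a feel for the idea before watching, here is a minimal sketch of EM for a two-component, one-dimensional Gaussian mixture. It is not the lecture's code: the synthetic data, the crude initialization, and the names `gaussian_pdf` and `em_two_gaussians` are all assumptions made for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def em_two_gaussians(xs, iters=100):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Returns the fitted parameters and the log-likelihood at each iteration.
    """
    # Crude initialization (an assumption): equal weights, means at the extremes.
    pi = 0.5
    mu0, mu1 = min(xs), max(xs)
    s0 = s1 = (max(xs) - min(xs)) / 4
    log_liks = []
    for _ in range(iters):
        # E-step: responsibility of component 0 for each point, via Bayes' rule.
        resp = []
        ll = 0.0
        for x in xs:
            p0 = pi * gaussian_pdf(x, mu0, s0)
            p1 = (1 - pi) * gaussian_pdf(x, mu1, s1)
            ll += math.log(p0 + p1)
            resp.append(p0 / (p0 + p1))
        log_liks.append(ll)
        # M-step: re-estimate weight, means, and standard deviations.
        n0 = sum(resp)
        n1 = len(xs) - n0
        pi = n0 / len(xs)
        mu0 = sum(r * x for r, x in zip(resp, xs)) / n0
        mu1 = sum((1 - r) * x for r, x in zip(resp, xs)) / n1
        s0 = math.sqrt(sum(r * (x - mu0) ** 2 for r, x in zip(resp, xs)) / n0)
        s1 = math.sqrt(sum((1 - r) * (x - mu1) ** 2 for r, x in zip(resp, xs)) / n1)
    return (pi, mu0, s0, mu1, s1), log_liks


# Synthetic, unlabeled data from two Gaussians (hypothetical parameters).
rng = random.Random(0)
xs = [rng.gauss(-2.0, 1.0) for _ in range(300)] + [rng.gauss(3.0, 1.5) for _ in range(700)]

(pi, mu0, s0, mu1, s1), log_liks = em_two_gaussians(xs)
print(f"weight={pi:.2f}, means=({mu0:.2f}, {mu1:.2f}), stds=({s0:.2f}, {s1:.2f})")
```

The log-likelihood recorded at each iteration should never decrease, which is the sense in which EM climbs to a local maximum; a different initialization can end at a different one.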

Stanford Statistical Learning


The new StatLearning course from Stanford University begins today. The free massive open online course (MOOC) is an excellent way to get up to speed with state-of-the-art machine learning, taught by two of the foremost experts in the field: professors Trevor Hastie and Robert Tibshirani.

Data Science 101: Machine Learning, Part 3

The “How Machine Learning Works” lecture series continues by building on Bayes’ rule, which was taught last time. We’ll define training and testing data sets and build a Bayesian classifier.
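
To make the train/test idea concrete, here is a small sketch of a Gaussian Bayesian classifier for one-dimensional data. It is an illustration under stated assumptions rather than the lecture's implementation: the synthetic data, the 80/20 split, and the helper names `fit` and `predict` are all hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def fit(train):
    """Estimate (prior, mean, std) for each class from (value, label) pairs."""
    params = {}
    for label in {y for _, y in train}:
        xs = [x for x, y in train if y == label]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        params[label] = (len(xs) / len(train), mean, math.sqrt(var))
    return params


def predict(params, x):
    """Pick the class with the largest posterior.

    By Bayes' rule the posterior is proportional to prior * likelihood;
    the evidence term is the same for every class, so it can be dropped.
    """
    return max(params, key=lambda c: params[c][0] * gaussian_pdf(x, params[c][1], params[c][2]))


# Synthetic labeled data from two Gaussians (hypothetical parameters).
rng = random.Random(0)
data = [(rng.gauss(-2.0, 1.0), 0) for _ in range(300)]
data += [(rng.gauss(3.0, 1.5), 1) for _ in range(700)]
rng.shuffle(data)

# Hold out 20% of the data for testing (an assumed split).
train, test = data[:800], data[800:]
params = fit(train)
accuracy = sum(predict(params, x) == y for x, y in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

Evaluating only on the held-out test set gives an estimate of how the classifier behaves on data it did not see while estimating its parameters.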