How Companies are Using Spark


The video below comes to us from the Strata Conference 2014: How Companies are Using Spark, and Where the Edge in Big Data Will Be. While the first big data systems made a new class of applications possible, organizations must now compete on the speed and sophistication with which they can draw value from data. […]

Strata Conference 2014 Slides and Videos

Last week offered fresh evidence of the big data industry's steamroller effect as the Strata Conference 2014 in Santa Clara came and went. With thousands of attendees, an abundance of informative presentations, and a very healthy exhibitor ecosystem, the show defined the current state of the art for all things big data. If you missed the big event, O’Reilly Media has graciously made available the slides and videos for some of the presentations.

Interview: Inktank Joins Forces with Open Source Mainstays Red Hat and OpenStack


Inktank Ceph, the open source software-defined storage system, has expanded its offerings and its customer base by supporting new Red Hat and OpenStack products. To get the specifics, we caught up with Ross Turk, Vice President of Community at Inktank.

Hadoop Buyers Guide

Get your complimentary copy of the Hadoop Buyer’s Guide, from Robert D. Schneider, the author of Hadoop for Dummies.

Data Science 101: Interview with John Chambers

Dr. John Chambers, the father of the S language, which ultimately became R, sits down with Professor Trevor Hastie of the Stanford University Statistics Department to discuss the long and fascinating history of the R language.

Johns Hopkins Data Science Specialization


The quantity and quality of data science education resources just took a step forward with the announcement of the new Johns Hopkins University Data Science Specialization series on the Coursera platform. The series consists of 9 free courses. The first course starts in April 2014.

The Future of Computer Science


I am convinced we’re at an important inflection point in the timeline of the discipline of computer science. When compared to other disciplines like mathematics, physics and biology, computer science is a very young field, starting around 1964. But something is happening now, in 2014, that is propelling the field into a new evolutionary period.

Data Science 101: Machine Learning, Part 5

The “How Machine Learning Works” lecture series concludes by developing some machine learning Python code from scratch. We use real-valued numbers sampled from two different Gaussians with different priors.
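For readers who want a feel for that setup before watching, here is a minimal sketch of the general idea, not the lecture's actual code. It assumes two one-dimensional Gaussian classes with made-up priors (0.7 and 0.3), means (0.0 and 4.0), and unit standard deviations; all function names are hypothetical. It samples labeled data, estimates each class's prior, mean, and standard deviation, and classifies new points by comparing log prior plus log likelihood.

```python
import math
import random


def sample_mixture(n, priors, means, stds, rng):
    """Draw n (value, label) pairs; label 0 is chosen with probability priors[0]."""
    data = []
    for _ in range(n):
        label = 0 if rng.random() < priors[0] else 1
        data.append((rng.gauss(means[label], stds[label]), label))
    return data


def fit(data):
    """Estimate (prior, mean, std) for each of the two classes from labeled data."""
    params = []
    for k in (0, 1):
        xs = [x for x, label in data if label == k]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        params.append((len(xs) / len(data), mean, math.sqrt(var)))
    return params


def log_gaussian(x, mean, std):
    """Log density of a normal distribution evaluated at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def predict(x, params):
    """Pick the class with the larger log prior + log likelihood."""
    scores = [math.log(prior) + log_gaussian(x, mean, std)
              for prior, mean, std in params]
    return 0 if scores[0] >= scores[1] else 1


# Illustrative values only; none of these numbers come from the lecture.
PRIORS, MEANS, STDS = (0.7, 0.3), (0.0, 4.0), (1.0, 1.0)
rng = random.Random(0)

train = sample_mixture(2000, PRIORS, MEANS, STDS, rng)
params = fit(train)

test = sample_mixture(1000, PRIORS, MEANS, STDS, rng)
accuracy = sum(predict(x, params) == label for x, label in test) / len(test)
print("estimated (prior, mean, std) per class:", params)
print("held-out accuracy:", accuracy)
```

Because the priors differ, the decision boundary sits closer to the less likely class than the midpoint between the two means would suggest; setting both priors to 0.5 in the sketch moves it back to the midpoint.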

Machine Learning for Display Advertising

Machine learning technologies have made many inroads into the advertising industry, primarily to enable more intelligent buys and placements that deliver a brand message to a selected audience. Here are some compelling slides from a lecture at the New York University Stern School of Business by Foster Provost, Professor of Information Systems: “Machine Learning for Display Advertising.”

Building a Production Machine Learning Infrastructure

The recent Data Day Texas 2014 featured a session given by Josh Wills, Cloudera’s Senior Director of Data Science: “From The Lab To The Factory: Building A Production Machine Learning Infrastructure.”