Wondering what to cook for Thanksgiving? This is a question asked all across the United States this time of year, but the answer often depends on where you live. Big data analysts at AllRecipes.com and Tableau Software parsed over 78 million recipe page views dating back to last Thanksgiving to uncover the most popular foods by state.
Kaggle competitions built around machine learning techniques have become a fascination for data scientists worldwide. In the video below, Jeremy Howard presents to the Melbourne R meetup group, giving a brief overview of his Data Scientist’s Toolbox and using a few Kaggle competitions as practical examples.
One of the closely watched conundrums of online education is how best to approximate traditional on-premises teaching methods. Grading multiple-choice and short-answer exam questions is straightforward, but how do you grade student-submitted essays in a way that lets the online platform scale to tens of thousands of students? Enter machine learning.
I ran across a Tweet recently that pointed me to a discussion over on Professor Andrew Gelman’s blog, “Statistics is the least important part of data science.” Dr. Gelman is a Professor of Statistics and Political Science at Columbia University and the former Ph.D. adviser of Rachel Schutt, author of Doing Data Science, which I reviewed earlier this month.
Principal Component Analysis (PCA) is a very important technique in unsupervised machine learning and in dimensionality reduction. But PCA is difficult to understand without its fundamental mathematical underpinnings. The two instructional videos below (Parts 1 and 2) demonstrate PCA at an introductory level to provide an appreciation for this powerful tool used in big data applications.
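For readers who want to see the mechanics before watching, here is a minimal sketch of PCA in plain Python. It is not taken from the videos. It mean-centers a small data set, builds the sample covariance matrix, and uses power iteration to approximate the first principal component. The toy data and all function names are assumptions made for illustration, and power iteration is only one of several ways to find the component; it assumes the largest eigenvalue is strictly larger than the rest.

```python
import math

def mean_center(rows):
    """Subtract each column's mean so the data is centered at the origin."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    return [[r[j] - means[j] for j in range(dims)] for r in rows]

def covariance_matrix(centered):
    """Sample covariance matrix (divides by n - 1) of mean-centered rows."""
    n = len(centered)
    dims = len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
            for i in range(dims)]

def first_principal_component(cov, iterations=200):
    """Power iteration: repeatedly multiply by cov and renormalize.

    Returns an approximation of the unit eigenvector with the largest
    eigenvalue, together with that eigenvalue (the variance along it).
    """
    dims = len(cov)
    v = [1.0 / math.sqrt(dims)] * dims  # arbitrary unit starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
    eigenvalue = sum(v[i] * cov_v[i] for i in range(dims))
    return v, eigenvalue

# Hypothetical toy data: two strongly correlated measurements per row.
data = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
        [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]

centered = mean_center(data)
cov = covariance_matrix(centered)
component, variance = first_principal_component(cov)

# Share of the total variance captured by the first component.
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_ratio = variance / total_variance

# Dimensionality reduction: each 2-D row becomes a single coordinate.
projected = [sum(r[j] * component[j] for j in range(len(component)))
             for r in centered]

print("first component:", component)
print("explained variance ratio:", explained_ratio)
```

Because the two columns rise and fall together, most of the spread lies along a single direction, so one coordinate per row preserves most of the information. In practice a library routine based on eigendecomposition or SVD would normally replace the hand-rolled power iteration.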
Apparently the MOOC (Massive Open Online Course) ecosystem is practicing what it preaches. As part of the Artificial Intelligence in Education (AIED 2013) conference this past July, a special workshop called the “moocshop” was held. It included representatives from many of the high-flying MOOCs, including Coursera and edX, as well as researchers from top universities.
Amazon Mechanical Turk (AMT) is a clearinghouse for performing Human Intelligence Tasks, i.e. things best done by humans equipped with the most powerful computer of all: the brain. Machine learning developers use AMT to get results from Mechanical Turk workers, and it allows average human computers (you and me) to earn a small stipend for each classification completed.