Walking Then Running

In this special guest feature, Jesse Anderson from Cloudera writes about how many new companies, like the ones we see popping up in the Hadoop ecosystem, too quickly move from crawling to running, a process that sometimes leads to failure.

Big Data Humor: A Poor Man’s Recommender System

A recommendation engine for the rest of us. Sign up for the free insideBIGDATA newsletter.

Big Data Humor: Captain Statistics Saves the Day

For those true Data Scientists out there, here is the latest from your favorite super-hero and mine: Captain Statistics!

Big Data Humor: Fault Tolerance

Big Data: creating a whole new world order!

Data Science vs. Statistics – One in the Same?

I recently ran across a thought-provoking post on the USC Annenberg Innovation Lab blog – “Why Do We Need Data Science when We’ve Had Statistics for Centuries.” With all the debate of late surrounding the relatively new “data science” term, I’ve been thinking a lot about this question, so I thought I’d analyze this notion […]

Big Data Humor: Hard Facts!

And here is a view of visualizations on steroids; try plopping this down on the boardroom table!

Big Data Humor: Unicorns Unite!

Unicorns finally standing up! If you expect ONE person to be a data science “team” then ya gotta cough up the extra $!

Stephen Hawking: Machine Learning is Scary

An eye-catching piece appearing in today’s edition of The Independent featured the thoughts of luminaries from the scientific world – renowned physicist Stephen Hawking, U.C. Berkeley computer-science professor Stuart Russell, and MIT physics professors Max Tegmark and Frank Wilczek – about the potential perils of artificial intelligence.

Machine Learning Identifies Fake Research Papers

Unsupervised machine learning techniques have proven useful in identifying fake research papers submitted to the arXiv preprint server. Approximately 500 preprints are received daily by the automated repository arXiv, but they are not pre-screened by humans. As a result, many nonsense papers generated by software such as SCIgen and Mathgen have been found in the most popular repository used by scientists to share research results.

Big Data Humor: Algorithms for Love & Safety

Any algorithm is as strong as its weakest feature variable!