The new StatLearning course from Stanford University begins today. The free massive open online course (MOOC) is an excellent way to get up to speed on state-of-the-art machine learning, taught by two of the foremost experts in the field: professors Trevor Hastie and Robert Tibshirani.
“Big data techniques offer a way to analyze data pooled across many patients: their specific disease mutations, biological markers, the treatments, and outcomes — in order to identify unexpected ways that existing therapies can be applied and combined to create personalized treatments that dramatically improve the chances of survival.”
The “How Machine Learning Works” lecture series continues, building on Bayes’ rule from the previous lecture. We’ll define training and testing data sets and build a Bayesian classifier.
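For readers who want a concrete picture before watching, here is a minimal sketch of the idea: split labeled data into training and testing sets, fit a classifier on the training set, and score it on the held-out testing set. It is not taken from the lecture. It assumes a Gaussian naive Bayes model (one of several kinds of Bayesian classifier), uses only the Python standard library, and runs on made-up toy data; every function name below is hypothetical.

```python
import math
import random
from collections import defaultdict


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit(train):
    """Estimate P(class) and a per-class mean/variance for each feature."""
    by_class = defaultdict(list)
    for features, label in train:
        by_class[label].append(features)
    model = {}
    for label, samples in by_class.items():
        prior = len(samples) / len(train)
        stats = []
        for column in zip(*samples):
            mean = sum(column) / len(column)
            var = sum((x - mean) ** 2 for x in column) / len(column)
            stats.append((mean, var + 1e-9))  # small floor avoids division by zero
        model[label] = (prior, stats)
    return model


def log_gaussian(x, mean, var):
    """Log density of a normal distribution with the given mean and variance."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)


def predict(model, features):
    """Apply Bayes' rule in log space and return the most probable class.

    The score is log P(class) + sum_i log P(x_i | class); the evidence term
    P(x) is the same for every class, so it can be dropped.
    """
    def score(label):
        prior, stats = model[label]
        return math.log(prior) + sum(
            log_gaussian(x, m, v) for x, (m, v) in zip(features, stats)
        )
    return max(model, key=score)


def accuracy(model, test):
    """Fraction of held-out rows whose predicted label matches the true one."""
    return sum(predict(model, f) == y for f, y in test) / len(test)


def make_toy_data(n_per_class=100, seed=1):
    """Synthetic two-class data (an assumption for illustration only)."""
    rng = random.Random(seed)
    rows = []
    for label, center in (("a", 0.0), ("b", 4.0)):
        for _ in range(n_per_class):
            rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    return rows


if __name__ == "__main__":
    train, test = train_test_split(make_toy_data())
    model = fit(train)
    print("held-out accuracy:", accuracy(model, test))
```

The model never sees the testing rows during fitting, which is what makes the accuracy figure an honest estimate of performance on new data.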
“Map-D uses multiple NVIDIA GPUs to interactively query and visualize big data in real-time. Map-D is an SQL-enabled column store that generates 70-400X speedups over other in-memory databases. This talk discusses the basic architecture of the system, the advantages and challenges of running queries on the GPU, and the implications of interactive and real-time big data analysis in the social sciences and beyond.”
In this slidecast, Kevin Murray from IBM introduces the company’s new X6 line of servers. Able to accommodate solid-state Flash drives in their DIMM memory slots, the new systems are designed to deliver significant improvements in the performance and economics of x86-based systems for analytics and the cloud.
“Predictive lead targeting enables you to tap into the social conversations going on among individuals within your targeted companies, including job listings, news and more,” said Leadspace co-founder and VP Products Amnon Mishor. “Based on your Ideal Customer Profile, our automated scoring algorithm identifies the specific organizations that are likely the most open to hearing about your solution, thereby significantly increasing conversions.”
“IBM appears to be seeing about the same results as other companies pushing big data technology, such as Hadoop. There’s some money coming in, but it’s not yet a billion-dollar business. There’s potential for really big deals, but it probably means slogging through long proofs of concept and deployment cycles.”
The “How Machine Learning Works” lecture series continues by building on fundamental definitions from statistics, which are needed for any rigorous analysis of models or machine learning algorithms.
“With this release, C++ developers now can easily integrate ScaleOut’s IMDG into their applications to provide scalable performance, as well as parallel query and integrated real-time analytics for applications written in C++,” said Bill Bain, ScaleOut’s CEO. “The added capabilities in Version 5.1 also introduce significant enhancements to ScaleOut StateServer’s features and performance and broaden its availability in the cloud.”