In the presentation below, courtesy of the San Francisco Machine Learning Meetup group, Xiangrui Meng introduces Spark and shows how to use it to build fast, end-to-end machine learning workflows.
IBM (NYSE: IBM) has announced the availability of a cognitive-infused Watson Explorer, a powerful combination of data exploration and content analytics capabilities. Watson Explorer equips users with information and analytics capabilities that can help them deliver better performance and real-time results.
In this presentation and interactive demo, you’ll learn about data mining workflows, the architecture and benefits of Spark, and practical use cases for the framework.
From the SciPy2013 conference, here is a compelling talk “Data Agnosticism: Feature Engineering Without Domain Expertise” by Nicholas Kridler of Accretive Health in Chicago.
In the presentation below, Hadoop luminary Doug Cutting gives us some of his perspectives on the big data industry as well as a high-level overview of the Hadoop technology stack.
“The Hadoop MapReduce framework grew out of an effort to make it easy to express and parallelize simple computations that were routinely performed at Google. It wasn’t long before libraries, like Apache Mahout, were developed to enable matrix factorization, clustering, regression, and other more complex analyses on Hadoop. Now, many of these libraries and their workloads are migrating to Apache Spark because it supports a wider class of applications than MapReduce and is more appropriate for iterative algorithms, interactive processing, and streaming applications.”
“In this talk we summarize the results of the BIG project, including analysis of foundational Big Data research technologies, technology and strategy roadmaps to enable businesses to understand the potential of Big Data technologies across different sectors, together with the necessary collaboration and dissemination infrastructure to link technology suppliers, integrators, and leading user organizations.”