Data Science 101: Scalable Machine Learning with Apache Spark

In the presentation below, courtesy of the SF Machine Learning Meetup group in San Francisco, Xiangrui Meng introduces Spark and shows how to use it to build fast, end-to-end machine learning workflows.

Data Science 101: Mining Big Data with Apache Spark

In this presentation and interactive demo, you’ll learn about data mining workflows, the architecture and benefits of Spark, as well as practical use cases for the framework.

Apache Spark Beats the World Record for Fastest Processing of Big Data

Databricks, the company founded by the creators of the popular open-source Big Data processing engine Apache Spark, announced today that it has broken the world record for the GraySort, a third-party industry benchmarking competition for sorting large on-disk datasets.

Spark Panel Discussion with Cloudera, MapR & Pivotal

The panel discussion video below comes from the Los Angeles Spark Users Group. The talk fosters a lively discussion of Spark’s initial goals, where it came from, and what the future holds for it. Many leading Big Data vendors are introducing Spark’s capabilities into their architectures. The panel brings together the top Hadoop distribution vendors: Cloudera, MapR, and Pivotal.

Guavus And Databricks Announce Reflex Platform Now a Certified Spark Distribution

Guavus, Inc., a leading provider of big data analytics solutions for operational intelligence, has announced that its Reflex 2.0 platform has been designated a Certified Spark Distribution by Databricks, the company founded by the creators of Apache Spark.

Databricks and O’Reilly Media Launch Spark Developer Certification Program

Databricks, the company founded by the creators of the popular open-source Big Data processing engine Apache Spark, and O’Reilly Media, a leading voice in Data Science, have announced the launch of the first global Apache Spark Developer Certification program.

An Exciting Year for Spark

Apache Spark has had an amazing year, and the people behind the open-source, large-scale data processing engine have pulled some data to show just how fast it has grown in the last 12 months. Databricks, which spun out of UC Berkeley's AMPLab, where Spark was created, produced the infographic below, which highlights some of that data as Spark catches fire across the industry.

Where There’s Spark There’s Fire: The State of Apache Spark in 2014

In this special guest feature, Matei Zaharia, CTO of Databricks and creator of Apache Spark, explores open-source Apache Spark’s status in the Hadoop community.

Alteryx and Databricks to Lead Development of Apache SparkR for Scalable Hadoop Analytics

Alteryx and Databricks have announced that they are collaborating to put the value of Apache Hadoop and Spark into the hands of everyday analysts. The two companies will become the primary committers to SparkR, a subset of the overall Spark framework.

Databricks Unveils Spark-Based Cloud Platform To Simplify Big Data Processing

Databricks, the company founded by the creators of Apache Spark—the powerful open-source processing engine that provides blazingly fast and sophisticated analytics—announced today the launch of Databricks Cloud, a cloud platform built around Apache Spark. In addition to this launch, the company is announcing the close of $33 million in series B funding led by New Enterprise Associates (NEA) with follow-on investment from Andreessen Horowitz.