Hadoop YARN (Yet Another Resource Negotiator) is a resource-management platform responsible for managing compute resources in clusters and scheduling user applications on them. YARN was added as part of Hadoop 2.0. Over the past several months of going to conferences like Hadoop Summit, attending big data Meetup groups like LA Big Data Users […]
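The core idea of YARN's ResourceManager can be pictured as granting "containers" of memory from a fixed cluster pool and reclaiming them when applications finish. The following is a minimal pure-Python sketch of that idea, not the YARN API; the `Cluster` class and its method names are hypothetical, for illustration only.

```python
# Hypothetical sketch of YARN-style resource scheduling (NOT the YARN API):
# a central manager grants memory "containers" from a fixed cluster pool.

class Cluster:
    def __init__(self, total_memory_mb):
        self.free_mb = total_memory_mb
        self.containers = []          # (app_id, memory_mb) grants

    def request_container(self, app_id, memory_mb):
        """Grant a container if enough memory remains, else reject."""
        if memory_mb <= self.free_mb:
            self.free_mb -= memory_mb
            self.containers.append((app_id, memory_mb))
            return True
        return False

    def release(self, app_id):
        """Return an application's containers to the pool."""
        kept = []
        for cid, mem in self.containers:
            if cid == app_id:
                self.free_mb += mem
            else:
                kept.append((cid, mem))
        self.containers = kept

cluster = Cluster(total_memory_mb=8192)
cluster.request_container("app-1", 4096)          # granted
cluster.request_container("app-2", 2048)          # granted
print(cluster.request_container("app-3", 4096))   # False: only 2048 MB free
cluster.release("app-1")
print(cluster.request_container("app-3", 4096))   # True after the release
```

Real YARN schedulers (Capacity, Fair) layer queues, priorities, and locality preferences on top of this basic grant-and-release loop.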
For this segment of insideBIGDATA Data Science 101, we have a very compelling Google Tech Talk, “Building Brains to Understand the World’s Data,” presented by Jeff Hawkins, co-founder of Numenta, who also founded Palm and Handspring.
In the video presentation below, industry luminary John Chambers delivers his keynote, “Interfaces, Efficiency and Big Data,” at the recent useR! 2014 conference.
Coming to us from the recent Spark Summit 2014, here is a compelling presentation by Databricks CEO Ion Stoica that sets the stage for Spark’s continued advance in the big data ecosystem. The Databricks Cloud provides the full power of Spark, in the cloud, plus a powerful set of features for exploring and visualizing your data, as well as writing and deploying production data products.
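Spark's appeal comes from expressing analysis as chained transformations (map, filter) followed by actions (reduce, collect) that the engine parallelizes across a cluster. Here is a minimal pure-Python sketch of that pattern running on one machine; it is not the Spark API, and the sample data is invented for illustration.

```python
from functools import reduce

# Sketch of the transformation/action pattern Spark distributes across a
# cluster. This runs locally and is NOT the Spark API; the data is made up.
views = [("video-a", 3), ("video-b", 5), ("video-a", 2)]

# "Transformations": filter to one key, then map to the counts.
counts = [n for vid, n in views if vid == "video-a"]

# "Action": aggregate, as a Spark reduce would.
total = reduce(lambda a, b: a + b, counts)
print(total)  # 5
```

In Spark the same pipeline would be written against an RDD or DataFrame, with the transformations evaluated lazily and shipped to the nodes holding each data partition.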
Here is a topic that receives much debate these days, as diverse fields like statistics, computer science, and applied mathematics converge with newly named fields such as data science and big data. Can’t we all get along?
R is a widely used statistical programming language but its interactive use is typically limited to a single machine. To enable large scale data analysis from R, SparkR was announced earlier this year in a blog post. SparkR is an open source R package developed at U.C. Berkeley AMPLab that allows data scientists to analyze large data sets and interactively run jobs on them from the R shell.
In the video below, Evan Chan (Software Engineer at Ooyala), describes his experience using the Spark and Shark frameworks for running real-time queries on top of Cassandra data.
Project Adam is a new deep-learning system, modeled after the human brain, that achieves greater image-classification accuracy than other systems in the industry and is 50 times faster. Project Adam is an initiative by Microsoft researchers and engineers that aims to demonstrate that large-scale, commodity distributed systems can train huge deep neural networks effectively.
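The data-parallel idea behind systems like Project Adam is that many workers compute gradients on their own data shards while parameter servers fold those gradients into a shared model. Below is a deliberately simplified, synchronous sketch of that loop in pure Python; Adam's actual updates are asynchronous and far more sophisticated, and all names and data here are invented for illustration.

```python
# Simplified synchronous sketch of data-parallel training with a central
# "parameter server". NOT Microsoft's implementation; toy 1-D model y = w*x.

def gradient(weight, batch):
    """Gradient of mean squared error for the 1-D model y = w * x."""
    return sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)

def train_step(weight, shards, lr=0.05):
    """Each 'worker' handles one shard; the server averages their gradients."""
    grads = [gradient(weight, shard) for shard in shards]  # parallel in practice
    avg_grad = sum(grads) / len(grads)
    return weight - lr * avg_grad

# Data drawn from y = 2x, split across two workers.
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]
w = 0.0
for _ in range(200):
    w = train_step(w, shards)
print(round(w, 3))  # converges toward 2.0
```

Making this loop asynchronous, so workers push gradients and pull weights without waiting for one another, is what lets such systems scale to commodity clusters, at the cost of applying slightly stale gradients.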