Manifest Insights: A Single Pane of Glass for your Data

Manifest Insights is an exciting new startup from Portland. “We are a data consulting and visualization company. We help companies gather together data from all the different sources where it may be and bring it together in a both easy to use and powerful dashboard, where they can slice and dice and view the data.”

Data Science 101: SparkR – Interactive R Programs at Scale

R + RDD = R2D2

R is a widely used statistical programming language but its interactive use is typically limited to a single machine. To enable large scale data analysis from R, SparkR was announced earlier this year in a blog post. SparkR is an open source R package developed at U.C. Berkeley AMPLab that allows data scientists to analyze large data sets and interactively run jobs on them from the R shell.

MapR Partners with Tata Consultancy Services to Help Customers with Big Data

Tata Consultancy Services (TCS) (BSE: 532540, NSE: TCS), a leading IT services, consulting and business solutions organization, has announced a new partnership with MapR Technologies, Inc., provider of the highly ranked distribution for Apache™ Hadoop®, to help enterprise customers easily and rapidly capture critical big data insights.

Data Science 101: Real-time Analytics using Cassandra, Spark and Shark

In the video below, Evan Chan (Software Engineer at Ooyala) describes his experience using the Spark and Shark frameworks for running real-time queries on top of Cassandra data.

Project Adam: a New Deep-Learning System

Project Adam is a new deep-learning system modeled after the human brain that has greater image classification accuracy and is 50 times faster than other systems in the industry. Project Adam is an initiative by Microsoft researchers and engineers that aims to demonstrate that large-scale, commodity distributed systems can train huge deep neural networks effectively.

The Putnam Mathematical Competition’s Unsolved Problem

As a data scientist with my roots in the theoretical foundations of the field, I’m always looking for ways to challenge myself and pick up a new mathematical apparatus that could help me in my project work.

Interview: Dolphin Speeds Business with Data Volume Management for SAP HANA

Dr. Werner Hopf

“Dolphin helps companies manage data volume and optimize processes so they can balance the performance and processing capabilities of SAP systems against the cost of running those systems. We develop a data volume management strategy so our customers can keep business critical data in SAP HANA, to get the fast efficient processing they need, and move static or business complete data on to other storage where it is still accessible. With a data volume management strategy in place, our customers are better prepared to go live on HANA and improve their return on investment.”

Where There’s Spark There’s Fire: The State of Apache Spark in 2014

In this special guest feature, Matei Zaharia, CTO of Databricks and Creator of Apache Spark, explores open-source Apache Spark’s status in the Hadoop community.

Book Reviews: The Bootstrap Resampling Technique

In the spirit of the importance of bootstrap methods to contemporary machine learning, I’d like to review several prominent books on the subject. Some of the titles are relatively new, while others can be considered “classics.”

Industry Perspectives from the 2013 O’Reilly Strata + Hadoop World Conference

The O’Reilly Strata + Hadoop World Conference is one of the few conferences that can seriously deliver on the mission of providing a state-of-the-art perspective on the big data industry. Here is a selection of video presentations made by industry luminaries that can guide enterprise thought leaders.