New Market Dynamics Report: HPC Life Sciences

Scientific research in the life sciences is often akin to searching for needles in haystacks. Finding the one protein, chemical, or genome that behaves or responds in the way the scientist is looking for is the key to the discovery process. For decades, high performance computing (HPC) systems have accelerated this process, often by helping to identify and eliminate infeasible targets sooner.

Data Science 101: k-means Clustering

In this edition of insideBIGDATA’s Data Science 101 series, I’m going to offer up a short instructional video describing the use of the popular unsupervised learning algorithm, k-means clustering.

Visualization of the Week: Coachella by the Numbers

One of the annual extravaganzas in Southern California is the Coachella Valley Music and Arts Festival, which starts this weekend and continues into next weekend.

Rubicon.IO Uses Riak to Provide Real-Time Threat Analysis

An example of the Rubicon User Interface

Rubicon.IO is a start-up in the threat intelligence space that provides real-time analytic capabilities by scouring metadata from various sources: threat feeds, social media, SIEM data, and PCAPs. It uses an HPC engine that aggregates and humanizes geospatial, TECHINT, HUMINT, and OSINT data sources.

Data Science 101: The Data Analytics Handbook

“Data Analytics Handbook” is a new resource meant to inform young professionals about the field of data science. It was written by a group of students at UC Berkeley: Brian Liou, Tristan Tao, and Elizabeth Lin. Edition One of the book includes in-depth interviews with data scientists and data analysts.

Netflix Reveals All (well, at least a lot)

Last night I had the distinct pleasure of attending a Data Science Track event sponsored by the LA Machine Learning meetup group: Data Science @ Netflix.

MapR Adds Complete Apache Spark Stack to its Distribution for Hadoop

MapR Technologies, Inc., provider of a leading distribution for Apache Hadoop, today announced a strategic partnership with Databricks and the addition of the complete Apache Spark technology stack to the MapR Distribution.

Data Science 101: Combining the Power of R and Hadoop

Cloudera and Revolution Analytics allow you to derive new business insights from Big Data by providing a joint solution to store, process, and analyze all your data at scale.

NoSQL Database to Power Big Data App for Macmillan Education Australia

Sqrrl, the company that develops secure NoSQL database software for Big Data applications, has collaborated with Macmillan Education Australia, a leading educational publisher, to help power a next-generation education portal. Sqrrl’s NoSQL database, Sqrrl Enterprise, enables Macmillan to securely store massive amounts of student and teacher data and ensures that data is accessed only in authorized ways.

MongoDB 2.6 Released – Builds on Five Years of Innovation

MongoDB today announced the general availability of MongoDB 2.6, the newest release of the popular database. The release builds on five years of innovation and hundreds of thousands of deployments to simplify provisioning and operating MongoDB deployments.