Big Data Humor: Apollo Blues

A statistical tribute to our waning group of brave Apollo astronauts!

New Market Dynamics Report: HPC Life Sciences

Scientific research in the life sciences is often akin to searching for needles in haystacks. Finding the one protein, chemical, or genome that behaves or responds the way the scientist is looking for is the key to the discovery process. For decades, high performance computing (HPC) systems have accelerated this process, often by helping to identify and eliminate infeasible targets sooner.

Data Science 101: k-means Clustering

In this edition of insideBIGDATA’s Data Science 101 series, I’m going to offer up a short instructional video describing the use of the popular unsupervised learning algorithm, k-means clustering.

Visualization of the Week: Coachella by the Numbers

One of the annual extravaganzas in Southern California is the Coachella Valley Music and Arts Festival, which starts this weekend and continues into next weekend.

Rubicon.IO Uses Riak to Provide Real-Time Threat Analysis

An example of the Rubicon User Interface

Rubicon.IO is a start-up in the threat intelligence space that provides real-time analytic capabilities by scouring metadata from various sources: threat feeds, social media, SIEM data, and PCAPs. It uses an HPC engine that aggregates and humanizes geospatial, TECHINT, HUMINT, and OSINT data sources.

Data Science 101: The Data Analytics Handbook

“Data Analytics Handbook” is a new resource meant to inform young professionals about the field of data science. It was written by a group of students at UC Berkeley: Brian Liou, Tristan Tao, and Elizabeth Lin. Edition One of the book includes in-depth interviews with data scientists and data analysts.

MapR Adds Complete Apache Spark Stack to its Distribution for Hadoop

MapR Technologies, Inc., provider of a leading distribution for Apache Hadoop, today announced a strategic partnership with Databricks and the addition of the complete Apache Spark technology stack to the MapR Distribution.

Interview: Continuent Manages Cross-Database, High-Speed Replication

“The speed and flexibility of our core replicator solution and the companion Continuent Tungsten clustering solution offer advanced functionality in a simple and easily usable format. Tungsten Replicator supports high-speed replication between MySQL and Oracle databases in an open source product. Continuent Tungsten supports billions of transactions a day, with our largest single installation managing over 700 million transactions a day and over 225 terabytes of data. Key to all this is the ease of deployment and use, and the flexible nature of the solution, enabling cross-database replication, and advanced filtering not found in other products.”

Data Science 101: Combining the Power of R and Hadoop

Cloudera and Revolution Analytics allow you to derive new business insights from Big Data by providing a joint solution to store, process, and analyze all your data at scale.

NoSQL Database to Power Big Data App for Macmillan Education Australia

Sqrrl, the company that develops secure NoSQL database software for Big Data applications, has collaborated with Macmillan Education Australia, a leading educational publisher, to help power a next-generation education portal. Sqrrl’s NoSQL database, Sqrrl Enterprise, enables Macmillan to securely store massive amounts of student and teacher data and ensures that data is accessed only in authorized ways.