Data Science 101: An Interview with Hadley Wickham

RStudio’s Chief Scientist Hadley Wickham was interviewed by DataScience.LA’s Eduardo Arino de la Rubia during the useR!2014 conference at UCLA this past July.

Nervana Systems Secures $3.3M in Series A Financing

Nervana Systems, Inc., has announced $3.3M in Series A financing led by DFJ. Steve Jurvetson, DFJ Partner, who currently serves on the boards of Tesla Motors, SpaceX, D-Wave and Synthetic Genomics, will join Nervana’s Board of Directors.

Sibyl: A System for Large Scale Machine Learning at Google

Sibyl is an important research project underway at Google that implements machine learning primitives at scale and is widely used within Google. Large scale machine learning is playing an increasingly important role in improving the quality and monetization of Internet properties.

Data Science 101: SparkR – Interactive R Programs at Scale

R is a widely used statistical programming language, but its interactive use is typically limited to a single machine. To enable large scale data analysis from R, SparkR was announced earlier this year in a blog post. SparkR is an open source R package developed at the U.C. Berkeley AMPLab that allows data scientists to analyze large data sets and interactively run jobs on them from the R shell.

Project Adam: a New Deep-Learning System

Project Adam is a new deep-learning system modeled after the human brain that has greater image classification accuracy and is 50 times faster than other systems in the industry. Project Adam is an initiative by Microsoft researchers and engineers that aims to demonstrate that large-scale, commodity distributed systems can train huge deep neural networks effectively.

The Putnam Mathematical Competition’s Unsolved Problem

As a data scientist with my roots in the theoretical foundations of the field, I’m always looking for ways to challenge myself and pick up a new mathematical apparatus that could help me in my project work.

Book Reviews: The Bootstrap Resampling Technique

In the spirit of the importance of bootstrap methods to contemporary machine learning, I’d like to review several prominent books on the subject. Some of the titles are relatively new, while others can be considered “classics.”

Versium’s Predictive GivingScore Identifies High-Impact Donors

Versium, a data technology company that operates a LifeData® predictive analytics scoring service, today announced the launch of its Predictive GivingScore solution.

Weather Data Contest – Climate Crush

If you’re a weather visionary and/or data scientist looking for a noble way to spend your summer, consider the Climate Crush – a weather data visualization contest with a cash prize. Weather Analytics opens its data and invites participants to create a visualization – infographic, dynamic interface, widget, app, anything – using the weather data (as well as any other open data sources).

The useR!2014 Conference in Review

FIELD REPORT: Last week I attended the long-anticipated useR!2014 international conference at UCLA, my alma mater. The four-day event had something for everyone in attendance – all the brain cycles centered on the use of the R statistical environment. Since R is a primary tool for my work in data science and […]