Sibyl: A System for Large Scale Machine Learning at Google

Sibyl is an important research project at Google that implements machine learning primitives at scale and is widely used within the company. Large-scale machine learning is playing an increasingly important role in improving the quality and monetization of Internet properties. A small number of techniques, such as regression, have proven to be widely applicable across Internet properties and applications.

In the talk below, Tushar Chandra outlines Sibyl and the requirements that it places on Google’s computing infrastructure. Tushar Chandra is a Principal Engineer at Google Research and a co-lead for the Sibyl project. He received his Ph.D. in Computer Science from Cornell University in 1993 and then worked at IBM Research until he joined Google in 2004. He has worked on a number of distributed systems projects: Reliable Scalable Cluster Technology, Gryphon, and Oceano at IBM, and Bigtable and a Paxos-based platform for fault tolerance at Google. He was a joint winner of the 2010 Edsger W. Dijkstra Prize in Distributed Computing. Chandra’s other point of distinction is being one of the original developers of Google’s Bigtable, the most widely used distributed storage service for big data applications in the cloud.

Introduced by Dr. Doug Blough, Georgia Tech professor and DSN 2014 General Chair, this keynote was presented at the IEEE Dependable Systems and Networks (DSN) conference at the Georgia Tech Hotel & Conference Center in Atlanta, GA on Wednesday, June 25, 2014.

The slides for the presentation can be downloaded HERE. The seminal research paper referenced in the talk’s intro can be downloaded HERE: “Unreliable Failure Detectors for Reliable Distributed Systems,” by Chandra and Toueg.

