Discovering Gold with Big Data Analytics and Data-Intensive Computing

Entries filed under “Education”

Is All Science Becoming Data Science?

Over at Science Magazine, Vijaysree Venkatraman writes that data-driven discovery may soon become the norm in science, and that learning to code and getting comfortable with large datasets could become a necessity in many traditional scientific fields.

“All science is fast becoming what is called data science,” says Bill Howe of UW’s eScience Institute. Today, there are sensors in gene sequencers, telescopes, forest canopies, roads, bridges, buildings, and point-of-sale terminals. Every ant in a colony can be tagged. The challenge is to extract knowledge from this vast quantity of data and transform it into something of value. Lately, Lazowska says, he has been hearing this refrain from researchers in engineering, the sciences, the social sciences, law, medicine, and even the humanities: “I am drowning in data and need help analyzing and managing it.”

Read the Full Story.


Also posted in Analytics, Research

Big Data and the Humanities

Last week Columbia University held a daylong symposium titled “From Big Data to Big Ideas,” as part of the launch of its Institute for Data Science and Engineering. Despite the emphasis on technology that the Institute’s name implies, several of the participants brought up the need to include the humanities in this new world of Big Data that we as a society are embracing.

Steve Lohr of the New York Times was there. In his Sunday blog, “The Potential and the Risks of Data Science,” he says that during presentations by Columbia professors and computer scientists from various companies, including Google, Facebook, Microsoft, and Bloomberg, issues around the misuse of Big Data – such as privacy and surveillance – were mentioned only in passing.

Lohr reports, however, that concerns were expressed by a panelist from a company that has pushed the limits of Big Data collection and use: Google.

“My concern is that the technology is way ahead of society,” said Ben Fried, Google’s chief information officer. There is danger, he suggested, if only a technical elite understand Big Data and its implications, with the risk of a runaway technology or a public rejection. “I think it is a mistake if conversations about this technology leave out the humanities,” he said. Broader social concerns, he explained, should be a guide and will affect the spread and use of Big Data technology.

Evidently one of the Columbia professors is planning to make the humanities part of Big Data. Mark Hansen, director of the Institute’s New Media Center, is teaching his students from the University’s Graduate School of Journalism how to do some programming and understand the algorithms underlying Big Data. “Software algorithms, he (Hansen) said, are not impartial,” writes Lohr. “They are written by people, and can embody human values and biases.”

In their book, Big Data, Viktor Mayer-Schönberger and Kenneth Cukier devote several chapters to the perils of Big Data misuse and possible solutions. What they envision goes far beyond training journalists – whom Hansen calls “society’s explainers of last resort.”

The authors call for the creation of a new kind of professional, the “algorithmist.” These algorithmists would be experts in computer science, mathematics, and statistics and would act as reviewers of Big Data analyses and predictions.

And, given the tenor of the discussion at the Columbia symposium, it probably wouldn’t hurt if these new Big Data watchdogs had some training that included a smattering of the humanities – such as readings in the philosophy of the Enlightenment, the history of the Roman Empire, or what the Lake Poets of the 19th century were all about.

“Getting and spending, we lay waste our powers,” cautioned William Wordsworth. Take heed.


Also posted in Business of Big Data

Big Data Freeway Under Construction in San Diego

If you’ve ever driven the freeways of Southern California, you might wonder about the metaphor chosen to describe the new high-speed Big Data network announced this week by the University of California, San Diego.

In a project known as Prism@UCSD, the university is building a high-performance cyberinfrastructure to support bursts of Big Data between campus facilities housing diverse disciplines – such as science, engineering, medicine and the arts – without killing the main campus network.

With $500,000 in funding from the National Science Foundation (NSF), the UCSD division of the California Institute for Telecommunications and Information Technology (Calit2) is developing Prism specifically to support researchers in such data-intensive scientific areas as genomic sequencing, climate science, electron microscopy, oceanography and physics.

“We’ve identified a variety of big data users on this campus who need ten gigabit/s and faster bandwidth to deal with the avalanche of data coming from scientific instruments such as sequencers, microscopes and computing clusters,” said Philip Papadopoulos, principal investigator on the Prism@UCSD project, who splits his time between Calit2 and the university’s San Diego Supercomputer Center (SDSC). “We’re starting at 1 Terabit/s of connected capacity through our next-generation modular switch, which is at the center of the Prism network. It can carry 20 times the traffic of our current research network, and it’s 100 times the bandwidth of the main campus network.”

Adds Papadopoulos, “You can think of Prism as the HOV lane, whereas our very capable campus network represents the slower lanes on the freeway.” Let’s hope he’s talking about the freeway at three in the morning.

“Prism@UCSD is a response to the growing challenge of Big Data,” said Calit2 Director Larry Smarr. “The key innovation in Prism@UCSD is to provide end-to-end dedicated large bandwidth to the end-users on campus.”

And he too invokes the freeway metaphor: “The Prism Big Data network also creates a high-capacity ‘data freeway’ to campus, national or international networks,” adds Smarr.

A roadway that has an aggregate bandwidth equivalent to over one terabit per second could go a long way to clearing up Southern California’s traffic problems.
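
For a rough sense of what the quoted speeds mean in practice, here is a minimal back-of-the-envelope sketch. It assumes a hypothetical 1 TB dataset, decimal units, and an ideal link with no protocol overhead; none of those assumptions come from the article. The research-network and campus-network figures are only what the quoted "20 times" and "100 times" ratios imply, not numbers UCSD stated, and the 1 Terabit/s figure is the switch's connected capacity, so no single user should expect all of it.

```python
# Rough transfer-time arithmetic for the link speeds quoted above.
# Assumptions (not from the article): the 1 TB dataset size, decimal units,
# and an ideal link with no protocol overhead or contention.

BITS_PER_BYTE = 8


def transfer_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Return the ideal time in seconds to move size_bytes over a link."""
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    return size_bytes * BITS_PER_BYTE / link_bits_per_second


TEN_GIGABIT = 10e9   # "ten gigabit/s", the per-user need quoted above
ONE_TERABIT = 1e12   # "1 Terabit/s" of connected capacity at the Prism switch

# Implied by the quoted ratios only; the article does not state these values.
implied_research_network = ONE_TERABIT / 20    # 50 Gbit/s
implied_campus_network = ONE_TERABIT / 100     # 10 Gbit/s

DATASET_BYTES = 1e12  # hypothetical 1 TB instrument dataset

if __name__ == "__main__":
    links = [
        ("10 Gbit/s link", TEN_GIGABIT),
        ("implied research network", implied_research_network),
        ("Prism switch capacity", ONE_TERABIT),
    ]
    for label, speed in links:
        seconds = transfer_seconds(DATASET_BYTES, speed)
        print(f"{label}: {seconds:.0f} s")
```

Under these assumptions, the hypothetical 1 TB dataset takes about 800 seconds at ten gigabit/s and about 8 seconds at the full 1 Terabit/s, which is the gap the HOV-lane comparison is pointing at.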

Read the Full Story.


Also posted in Hardware, I/O, Network, Research

Video: Brave New Hadoop

In this video, Ph.D. candidate Jerome Mitchell of Indiana University details the benefits of Hadoop and offers a hands-on session illustrating its uses. You can also see Part 2, Part 3, Part 4, and Part 5 of this lecture.


Also posted in Hadoop, Video

Video: SFU in Canada Starting Big Data Master’s Program

The demand for tomorrow’s data scientists is already here, so how do we begin to fill it? In this video, Dr. Alexandra Fedorova of Simon Fraser University in Canada describes the university’s new Professional Master’s program on Big Data.


Also posted in Video


inside-bigdata.com is a production of insideHPC, LLC. © 2011-2013