
NCSA Taps 380 Petabyte High Performance Storage System

Blue Waters' 380 petabyte High Performance Storage System (HPSS)

Today NCSA announced that its 380-petabyte High Performance Storage System is now in full production service as part of the Blue Waters project. Described as the world’s largest automated near-line data repository for open science, the HPSS environment comprises multiple automated tape libraries, dozens of high-performance data movers, a large 40 Gigabit Ethernet network, hundreds of high-performance tape drives, and about 100,000 tape cartridges.

This “big data” capacity is available to scientists and engineers using the sustained petascale Blue Waters supercomputer. The storage system can be easily expanded and extended to accommodate the extreme data needs of other science, engineering, or industry projects.

“With the world’s largest HPSS now in production, Blue Waters truly is the most data-focused, data-intensive system available to the U.S. science and engineering community,” said Blue Waters deputy project director Bill Kramer.

The HPSS hierarchical file system software is designed to efficiently manage the access and storage of hundreds of petabytes of data at high data rates. HPSS manages the life cycle of data by moving inactive data to tape and retrieving it the next time it is referenced. The highly scalable HPSS is the result of two decades of collaboration among five Department of Energy laboratories and IBM, with significant contributions by universities and other laboratories worldwide.
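
The life-cycle behavior described above (inactive data moves to tape and is recalled when next referenced) can be illustrated with a toy two-tier store. This is only a sketch of the general idea, not HPSS code: the `TieredStore` class, its method names, and the `idle_limit` policy are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
class TieredStore:
    """Toy two-tier store (hypothetical, not the HPSS API)."""

    def __init__(self, idle_limit):
        self.idle_limit = idle_limit  # seconds of inactivity before migration
        self.disk = {}                # name -> data held on the fast tier
        self.tape = {}                # name -> data held on the tape tier
        self.last_access = {}         # name -> time of last read or write

    def write(self, name, data, now):
        """Store new data on disk and record the access time."""
        self.disk[name] = data
        self.tape.pop(name, None)     # drop any stale tape copy
        self.last_access[name] = now

    def migrate(self, now):
        """Move every file idle longer than idle_limit from disk to tape."""
        idle = [n for n in self.disk
                if now - self.last_access[n] > self.idle_limit]
        for name in idle:
            self.tape[name] = self.disk.pop(name)

    def read(self, name, now):
        """Return the data, recalling it from tape first if necessary."""
        if name not in self.disk:
            # Raises KeyError if the name is unknown to both tiers.
            self.disk[name] = self.tape.pop(name)
        self.last_access[name] = now
        return self.disk[name]


store = TieredStore(idle_limit=100)
store.write("run42.dat", b"results", now=0)
store.migrate(now=500)                    # file has been idle for 500 s
on_tape = "run42.dat" in store.tape       # it now lives on the tape tier
data = store.read("run42.dat", now=600)   # the reference triggers a recall
```

A production system would likely track far more state (multiple copies, partial files, queueing for tape drives); the sketch keeps only the migrate-then-recall cycle.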

NCSA joined forces with the HPSS Collaboration’s Department of Energy labs and IBM to develop an HPSS capability for Redundant Arrays of Independent Tapes (RAIT), a tape technology similar to RAID for disk. By generating parity blocks, RAIT protects stored data against single or dual points of failure while dramatically reducing the total cost of ownership and energy use. It also improves storage and retrieval performance, since data is written and read in parallel.
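
To show how parity blocks allow a lost tape to be rebuilt, here is a minimal sketch of single-parity striping using XOR, the same principle as RAID 5. It is not the RAIT implementation: the function names are hypothetical, the stripe width of four data blocks is an assumption, and surviving a dual failure as the article describes would need a second, independent parity code that this example omits.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_with_parity(data, n_data):
    """Split data into n_data equal blocks (zero padded) plus one parity block."""
    block_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(block_len * n_data, b"\0")
    blocks = [padded[i * block_len:(i + 1) * block_len] for i in range(n_data)]
    return blocks + [xor_blocks(blocks)]


def rebuild(stripe, lost):
    """Recompute the block at index `lost` from all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost]
    return xor_blocks(survivors)


payload = b"Blue Waters simulation output"
stripe = stripe_with_parity(payload, n_data=4)  # 4 data blocks + 1 parity block
recovered = rebuild(stripe, lost=2)             # pretend the third tape failed
```

Because each block in the stripe would sit on a different tape, the blocks can be written and read in parallel, which is where the performance benefit mentioned above comes from.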

