In this slidecast, Gary Tyreman from Univa discusses the new Univa Grid Engine for ARMv7 Release. As HPC and Big Data infrastructure design continues to converge, this platform could be a stepping stone to the future.
“Driven by the demand for new datacenter services to support mobile and cloud computing, ARM will continue to gain inroads into the datacenter server market because of the low-power and energy-efficient design of SoCs based on ARM’s technology,” said Karl Freund, VP Marketing at Calxeda. “As enterprises shift towards highly scalable solutions such as Calxeda, a key enabling technology is intelligent workload management – and we have partnered with Univa to provide our customers with a great solution.”
In this video, Amr Awadallah from Cloudera presents: Cloudera, Impala, and EDW Optimization.
Everyone knows that data volumes are growing exponentially. What’s not so clear is how to unlock the value it holds. The answer is Cloudera, the Platform for Big Data. With a single, integrated enterprise-class solution, Cloudera lets you efficiently query all of your data – structured and unstructured – and have a view beyond data sitting in relational databases. Equally important, Cloudera’s platform runs in real time, so you can work at the speed of thought as you build rapidly on deep insights, create competitive advantage and become truly data-driven.
In this video, Brent Gorda from Intel’s High Performance Data Division presents: The Future of Network-Based Storage. Gorda joined Intel in July 2012 as part of Intel’s acquisition of Whamcloud. Since then, Gorda’s team has continued to work on Lustre as well as conduct R&D for the Department of Energy’s FastForward Storage & I/O program.
In this video, John Fragalla from Xyratex presents: Architecting High Availability Lustre Storage with ClusterStor 6000.
ClusterStor 6000 is designed to support installations with linear performance scalability in less space, scaling from 6 gigabytes per second to installations providing 1 terabyte per second of file system throughput, as well as linear scaling of data storage capacity from terabytes up to tens of petabytes.
Over at IT World, Joab Jackson writes that Python just got a big data boost from DARPA with a $3 million award to software provider Continuum Analytics. The funding will help foster the development of Python’s data processing and visualization capabilities for big data jobs.
The money will go toward developing new techniques for data analysis and for visually portraying large, multi-dimensional data sets. The work aims to extend beyond the capabilities offered by the NumPy and SciPy Python libraries, which are widely used by programmers for mathematical and scientific calculations, respectively. More mathematically centered languages such as the R Statistical language might seem better suited for big-data number crunching, but Python offers an advantage of being easy to learn.
The work is part of DARPA’s XData research program, a four-year, $100 million effort to give the Defense Department and other U.S. government agencies tools to work with large amounts of sensor data and other forms of big data. Read the Full Story.
In this video from PyData NYC 2012, Stephen Diehl from Continuum Analytics presents on Blaze, a next-generation NumPy designed as a foundational set of abstractions on which to build out-of-core and distributed algorithms. Blaze generalizes many of the ideas found in popular PyData projects such as NumPy, Pandas, and Theano into one generalized data structure. Together with a powerful array-oriented virtual machine and run-time, Blaze will be capable of performing efficient linear algebra and indexing operations on top of a wide variety of data backends.
In this video, Brent Gorda from Intel’s High Performance Data Division provides an update on the Lustre File System development. Gorda joined Intel in July 2012 as part of Intel’s acquisition of Whamcloud. Since then, Gorda’s team has continued to work on Lustre as well as conduct R&D for the Department of Energy’s FastForward Storage & I/O program.
Gorda goes on to share that Intel has now upgraded their participation in the OpenSFS community to the Board (Promoter) level, joining Cray, DDN, LLNL, ORNL, and Xyratex.
In this slidecast, Floyd Christofferson from SGI describes how the combination of the company’s Infinite Storage platform and Scality Ring technology provides a new, unified scale-out storage system. The solution is designed to provide both extreme scale and high performance, allowing customers to manage massive stores of unstructured data.
“Scale-out object-based solutions are designed to address this particular set of problems by minimizing manual intervention for storage expansions, migrations, and recoveries from storage system failure,” said Ashish Nadkarni, research director, Storage Systems at IDC. “Such a dispersed, fault-tolerant architecture enables IT organizations to more efficiently absorb data growth in a manner that is predictable for the long term.”
In this slidecast, Eric Barton, Lead Architect for Intel’s High Performance Data Division, presents a progress update on the Fast Forward I/O & Storage program.
Back in July 2012, Whamcloud was awarded the Storage and I/O Research & Development subcontract for the Department of Energy’s FastForward program. Shortly afterward, the company was acquired by Intel. The two-year contract scope includes key R&D necessary for a new object storage paradigm for HPC exascale computing, and the developed technology will also address next-generation storage mechanisms required by the Big Data market.
The subcontract incorporates application I/O expertise from the HDF Group, system I/O and I/O aggregation expertise from EMC Corporation, object storage expertise from DDN, and scale testing facilities from Cray, teamed with file system, architecture, and project management skills from Whamcloud. All components developed in the project will be open sourced and benefit the entire Lustre community.
This is a fascinating presentation for those interested in how an Exascale system might handle data, and the prototype that comes out of it may well represent the roadmap to the future of supercomputing.
In this slidecast, CEO Bill Bain from ScaleOut Software presents: In-Memory Data Grids Enable Real-Time Analysis.
“ScaleOut Software is a pioneer and leader in data grid software. Since our first products shipped in January 2005, we have consistently developed leading-edge technologies that help our customers solve scalability and performance challenges and gain competitive advantages for their businesses.”
Over at Enterprise Storage Forum, Henry Newman looks at the future of file systems and examines whether REST will overtake POSIX as an interface of choice for all applications.
“We do not have a lot of POSIX file systems that scale today to tens of PB and billions of files. There are four file systems in production with a parallel namespace (Gluster, PanFS, Lustre, and GPFS) and a new entry called Ceph. Ceph, GPFS, Lustre, and PanFS support parallel I/O, which is I/O from multiple threads (these threads could be running on multiple nodes) to a single file, but Gluster does not. On the other side there are dozens of vendors developing REST- and SOAP-based object management interfaces. Vendors are trying to create systems that support billions of objects in a single namespace. Given that the vendors are not constrained by the POSIX atomicity requirements and support for parallel I/O, this is far easier than developing this support inside a POSIX file system.”
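For readers unfamiliar with the term, the parallel I/O Newman describes (several threads writing to one file) can be sketched on a single node with the POSIX `pwrite` call, which Python exposes as `os.pwrite`. This is only an illustration under stated assumptions: the block size, writer count, file name, and helper names (`write_block`, `parallel_write`, `demo`) are hypothetical, it runs on a local POSIX file system rather than Lustre or GPFS, and it says nothing about how those systems coordinate writers across nodes.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4096   # bytes per writer (illustrative value)
NUM_WRITERS = 4     # number of concurrent writer threads (illustrative value)


def write_block(fd: int, index: int) -> int:
    """Write one block at an explicit offset.

    os.pwrite takes the offset as an argument, so the threads do not
    race on a shared file position. A production writer would loop on
    short writes; this sketch assumes each small write completes fully.
    """
    payload = bytes([ord("A") + index]) * BLOCK_SIZE
    return os.pwrite(fd, payload, index * BLOCK_SIZE)


def parallel_write(path: str) -> int:
    """Have NUM_WRITERS threads fill disjoint regions of a single file."""
    fd = os.open(path, os.O_CREAT | os.O_RDWR | os.O_TRUNC, 0o600)
    try:
        with ThreadPoolExecutor(max_workers=NUM_WRITERS) as pool:
            written = list(pool.map(lambda i: write_block(fd, i), range(NUM_WRITERS)))
        return sum(written)
    finally:
        os.close(fd)


def demo() -> bytes:
    """Write the file in parallel, then read it back for inspection."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shared.dat")
        total = parallel_write(path)
        with open(path, "rb") as f:
            data = f.read()
    assert total == len(data)
    return data
```

Each thread owns a disjoint byte range, so no locking is needed in this sketch. Keeping overlapping writes consistent across many nodes is the hard part that the POSIX atomicity requirements impose on a parallel file system, and it is the constraint the object-store vendors in the quote avoid.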