In this video from the LAD’14 Lustre Administrators and Developers Conference in Reims, Rekha Singhal from Tata Consultancy Services presents: Performance Comparison of Intel Enterprise Edition Lustre and HDFS for MapReduce Applications.
IBM Platform Computing products can save an organizations money by reducing a variety of direct costs associated with grid and cluster computing. Your organization can slow the rate of infrastructure growth and reduce the costs of management, support, personnel and training—while also avoiding hidden or unexpected costs.
For a long time, the industry’s biggest technical challenge was squeezing as many compute cycles as possible out of silicon chips so they could get on with solving the really important, and often gigantic problems in science and engineering faster than was ever thought possible. Now, by clustering computers to work together on problems, scientists are free to consider even larger and more complex real-world problems to compute, and data to analyze.
“Fortissimo Foundation is a clustered, pervasive, global direct-remote I/O access system that linearly scales I/O bandwidth, memory, Flash and hard disk storage capacity and server performance to provide an “in-memory” scale-out solution that intelligently aggregates all resources of a data center cluster into a massive global name space, bridging all remote compute and storage resources to look and act as if they were local.”
As compute speed advanced towards its theoretical maximum, the HPC community quickly discovered that the speed of storage devices and the underlying the Network File System (NFS) developed decades ago had not kept pace. As CPUs got faster, storage became the main bottleneck in high data-volume environments.
“This is the first truly disruptive advancement in high-end server technology in decades, with radical technology changes and the full support of an open server ecosystem that will seamlessly lead our clients into this world of massive data volumes and complexity,” said Tom Rosamilia, Senior Vice President, IBM Systems and Technology Group. “There no longer is a one-size-fits-all approach to scale out a data center. With our membership in the OpenPOWER Foundation, IBM’s POWER8 processor will become a catalyst for emerging applications and an open innovation platform.”
Scientific research in the life sciences is often akin to searching for needles in haystacks. Finding the one protein, chemical, or genome that behaves or responds in the way the scientist is looking for is the key to the discovery process. For decades, high performance computing (HPC) systems have accelerated this process, often by helping to identify and eliminate in feasible targets sooner.
“As InfiniBand is getting used in scientific computing environments, there is a big demand to harness its benefits for enterprise environments for handling big data and analytics. This talk will focus on high-performance and scalable designs of Hadoop using native RDMA support of InfiniBand and RoCE. Designs for various components in Hadoop (such as HDFS, MapReduce, RPC, and HBASE) and their benefits based on the RDMA package for Apache Hadoop will be presented. RDMA-based design for scalable Memcached (used in Web 2.0) and the associated benefits will be presented.”
“Our architecture permits tens of thousands of SSDs to be connected together and accessed in a parallel and concurrent way using direct mapping of memory accesses from a local machine to the I/O bus and memory of a remote machine. This feature allows for data transmission between local and remote system memories without the use of operating system services. It also enables a unique linear scalability of SSDs bandwidth and IOPS and consequently allows computation and data access to scale together linearly. This totally eliminates the bottleneck in bandwidth or IOPS and provides optimal dimensions of performance, capacity, and computation with an unmatched flexibility at a fraction of the costs.”
“Our thought process was that Big Data + a better Workflow = Big Workflow. We coined it as an industry term to denote a faster, more accurate and more cost-effective big data analysis process. It is not a product or trademarked name of Adaptive’s, and we hope it becomes a common term in the industry that is synonymous with a more efficient big data analysis process.”