“The Hadoop framework has become the most popular open-source solution for Big Data processing. Traditionally, Hadoop communication calls are implemented over sockets and do not deliver best performance on modern clusters with high-performance interconnects. This talk will examine opportunities and challenges in optimizing performance of Hadoop with Remote DMA (RDMA) support, as available with InfiniBand, RoCE (RDMA over Converged Enhanced Ethernet) and other modern interconnects.”
IBM Platform Computing products can save an organization money by reducing a variety of direct costs associated with grid and cluster computing. Your organization can slow the rate of infrastructure growth and reduce the costs of management, support, personnel and training, while also avoiding hidden or unexpected costs.
For a long time, the industry’s biggest technical challenge was squeezing as many compute cycles as possible out of silicon chips so that researchers could solve the really important, and often gigantic, problems in science and engineering faster than was ever thought possible. Now, by clustering computers to work together on problems, scientists are free to consider even larger and more complex real-world problems to compute, and data to analyze.
“Fortissimo Foundation is a clustered, pervasive, global direct-remote I/O access system that linearly scales I/O bandwidth, memory, Flash and hard disk storage capacity and server performance to provide an ‘in-memory’ scale-out solution that intelligently aggregates all resources of a data center cluster into a massive global name space, bridging all remote compute and storage resources to look and act as if they were local.”
As compute speed advanced towards its theoretical maximum, the HPC community quickly discovered that the speed of storage devices and the underlying Network File System (NFS), developed decades ago, had not kept pace. As CPUs got faster, storage became the main bottleneck in high data-volume environments.
“This is the first truly disruptive advancement in high-end server technology in decades, with radical technology changes and the full support of an open server ecosystem that will seamlessly lead our clients into this world of massive data volumes and complexity,” said Tom Rosamilia, Senior Vice President, IBM Systems and Technology Group. “There no longer is a one-size-fits-all approach to scale out a data center. With our membership in the OpenPOWER Foundation, IBM’s POWER8 processor will become a catalyst for emerging applications and an open innovation platform.”
Scientific research in the life sciences is often akin to searching for needles in haystacks. Finding the one protein, chemical, or genome that behaves or responds in the way the scientist is looking for is the key to the discovery process. For decades, high performance computing (HPC) systems have accelerated this process, often by helping to identify and eliminate infeasible targets sooner.
“As InfiniBand is getting used in scientific computing environments, there is a big demand to harness its benefits for enterprise environments for handling big data and analytics. This talk will focus on high-performance and scalable designs of Hadoop using native RDMA support of InfiniBand and RoCE. Designs for various components in Hadoop (such as HDFS, MapReduce, RPC, and HBASE) and their benefits based on the RDMA package for Apache Hadoop will be presented. RDMA-based design for scalable Memcached (used in Web 2.0) and the associated benefits will be presented.”