Paper: A Map-Reduce-Like System for Emerging Parallel Architectures

Can MapReduce be used as an effective means of processing data-intensive HPC workloads? In his dissertation from Ohio State University, Wei Jiang writes that one first needs to overcome challenges in performance scaling, fault tolerance, and GPU acceleration support.

We performed a comparative study showing that the map-reduce processing style could cause significant overheads for a set of data mining applications. Based on the observation, we developed a map-reduce system with an alternate API (MATE) using a user-declared reduction object to be able to further improve the performance of map-reduce programs in multi-core environments. To address the limitation in MATE that the reduction object must fit in memory, we extended the MATE system to support the reduction object of arbitrary sizes in distributed environments and apply it to a set of graph mining applications, obtaining better performance than the original graph mining library based on map-reduce.
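To make the contrast in the abstract concrete, here is a minimal Python sketch of one k-means centroid update written both ways. It is an illustration only and does not use the actual MATE API: the function names (`nearest`, `mapreduce_style`, `reduction_object_style`, `merge`), the choice of k-means, and the 1-D data are all assumptions made for this example. The classic style materializes one intermediate (key, value) pair per input point before reducing; the reduction-object style updates a small, pre-declared accumulator in place.

```python
from collections import defaultdict


def nearest(p, centroids):
    """Index of the centroid closest to the 1-D point p (hypothetical helper)."""
    return min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))


def mapreduce_style(points, centroids):
    """Classic style: emit (key, value) pairs, group them by key, then reduce."""
    # Map: one intermediate pair per input point.
    pairs = [(nearest(p, centroids), p) for p in points]
    # Shuffle: group values by key; all pairs are held in memory here.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Reduce: average each group to get the new centroid.
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def reduction_object_style(points, centroids):
    """Reduction-object style: fold each point into a declared accumulator."""
    # The reduction object: one [sum, count] slot per centroid, declared up front.
    robj = [[0.0, 0] for _ in centroids]
    for p in points:
        k = nearest(p, centroids)
        robj[k][0] += p   # no intermediate pair is stored
        robj[k][1] += 1
    return robj


def merge(a, b):
    """Combine two reduction objects, e.g. one per worker (assumed scheme)."""
    return [[sa + sb, na + nb] for (sa, na), (sb, nb) in zip(a, b)]


def finalize(robj):
    """Turn [sum, count] slots into centroid means, skipping empty slots."""
    return {k: s / n for k, (s, n) in enumerate(robj) if n > 0}


if __name__ == "__main__":
    points = [1.0, 2.0, 9.0, 11.0]
    centroids = [0.0, 10.0]
    print(mapreduce_style(points, centroids))
    # Two workers each process half of the data, then their objects are merged.
    left = reduction_object_style(points[:2], centroids)
    right = reduction_object_style(points[2:], centroids)
    print(finalize(merge(left, right)))
```

In this sketch the accumulator's size depends on the number of centroids rather than the number of input points, which is one plausible reading of where the overhead savings come from. It also shows why a reduction object that must fit in memory becomes a constraint once the accumulator itself grows large.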

Download the paper (PDF).
