Paper Title
MapReduce Application Analytic Model and Performance Evaluation
Abstract
MapReduce is a highly scalable, fault-tolerant, data-parallel framework that transparently distributes data and parallelizes computation tasks across large-scale compute clusters. Its application in distributed systems is a rapidly emerging field. Although the framework can leverage clusters to improve computing performance, tuning it remains challenging. Most existing work on MapReduce performance relies on system monitoring and simulation and lacks analytical performance models. In this paper, we propose a MapReduce Application Analytic (M2A) model to better understand the impact of each component on overall program performance, and we verify it on a small cluster. The results indicate that our model can explain the performance of a MapReduce system and its relation to the configuration. Guided by the M2A model, a traditional Least Significant Bit (LSB) algorithm is transformed into MapReduce-LSB (MR-LSB), and its performance can be improved significantly by modifying core configuration parameters in Hadoop, without altering the framework itself. Experimental evaluation verified that MR-LSB reduces the total execution time for a large test image by a factor of 8.73 compared with the traditional non-MR LSB method.
Index Terms: MapReduce, Hadoop, Analytic Model, Performance Evaluation.