Providing Caches for Reduce Tasks in a MapReduce Cloud
Abstract: MapReduce clouds are central to the success of cloud computing today because they can process large datasets in parallel across the nodes of a cloud. However, a MapReduce cloud may waste considerable CPU resources by repeatedly processing similar intermediate data in its Reduce tasks, because specific intermediate data is always moved to specific Slave nodes. This paper proposes a cache mechanism for Reduce tasks that lets a MapReduce cloud avoid this unnecessary computation. In experiments, the cache mechanism yields a substantial performance improvement for a MapReduce cloud running CPU-intensive applications. These results justify extending a MapReduce cloud with the proposed cache mechanism.
2016 IEEE International Conference on Big Data Analysis
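
The abstract does not describe how the cache is built, so the following is only a minimal sketch of the general idea, assuming a per-node in-memory cache that memoizes Reduce outputs by key and by a fingerprint of the intermediate values. The names `ReduceCache`, `reduce`, and `_fingerprint` are hypothetical and are not taken from the paper or from any MapReduce framework.

```python
import hashlib
import pickle
from typing import Any, Callable, Dict, Hashable, List, Tuple


class ReduceCache:
    """Hypothetical per-node cache of Reduce outputs (illustrative only)."""

    def __init__(self) -> None:
        # Maps (reduce key, fingerprint of its values) to the reduce result.
        self._store: Dict[Tuple[Hashable, str], Any] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _fingerprint(values: List[Any]) -> str:
        # Assumption: values are picklable. The fingerprint is order-sensitive,
        # so the same values in a different order count as different input.
        return hashlib.sha256(pickle.dumps(values)).hexdigest()

    def reduce(
        self,
        key: Hashable,
        values: List[Any],
        reducer: Callable[[Hashable, List[Any]], Any],
    ) -> Any:
        cache_key = (key, self._fingerprint(values))
        if cache_key in self._store:
            # Same key and same intermediate data seen before: skip the work.
            self.hits += 1
            return self._store[cache_key]
        # Unseen input: run the (possibly CPU-intensive) reducer and remember it.
        self.misses += 1
        result = reducer(key, values)
        self._store[cache_key] = result
        return result
```

In this sketch, a second call with the same key and the same list of values returns the stored result without invoking the reducer again. This is the kind of saving the abstract attributes to caching on nodes that repeatedly receive the same intermediate data. A real design would also need eviction and invalidation policies, which are omitted here.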