System Design - Reading Note of MapReduce Paper

Subscribe Send me a message home page tags

#system design  #MapReduce 

This post is a reading note of the MapReduce paper and other online materials.

Related Readings


There is a hidden component here. The MapReduce system can work because it's built on a global file system. The GFS is very important because it can help to achieve data locality which in turn saves network bandwidth. Distributed file systems usually have multiple replicas for files and this increases the probability that workers already have the needed input file stored on the local disk.

This is highlighted in the Apache MapReduce document:

Typically the compute nodes and the storage nodes are the same, that is, the MapReduce framework and the Hadoop Distributed File System (see HDFS Architecture Guide) are running on the same set of nodes. This configuration allows the framework to effectively schedule tasks on the nodes where data is already present, resulting in very high aggregate bandwidth across the cluster.


The workflow consists of the following steps

  1. Split inputs
  2. Create and assign map tasks and reduce tasks.
  3. Execute map tasks and the outputs are saved in intermediate files on the local disk of map worker machines. Note that the outputs are partitioned.
  4. Partitioned outputs are read by the reduce workers. Each reduce worker handles a specific partition determined by the partitioning function. The process of fetching the intermediate data is called shuffle.
  5. Reduce workers sort the received records by key.
  6. Reduce workers apply the reduce function and write the results to an output file.


The above diagram is a little bit "misleading": the input and output files seem to be outside the cluster. In reality they can be stored on the machines in the cluster.)

----- END -----

If you have questions about this post, you could find me on Discord.
Send me a message Subscribe to blog updates

Want some fun stuff?