Sunday 4 January 2015

Map and reduce phase internals - Passing Key and Value pairs from mapper to reducer

Suppose the mappers generated the following output:

(coder,1)
(fox,1)
(in,1)
(boots,1)
(coder,1)
(hadoop,1)

How many keys will be passed to the reducer's reduce() method?

Ans: five

Reason: The input to the reducer's reduce() method is a key and the list of values for that key. The mapper output is grouped by key before it is sent to the reducer. In this case, grouping by key gives:
coder,[1,1]
fox,[1]
in,[1]
boots,[1]
hadoop,[1]

Here we have five unique keys, so five keys will be sent to the reducer, and reduce() is called once for each of them.
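The grouping step can be imitated in a few lines of plain Python. This is only a sketch to illustrate the counting, not Hadoop code: the function name group_by_key and the variable mapper_output are made up for this example, and the sketch keeps keys in first-seen order, whereas Hadoop sorts the keys before handing them to the reducer.

```python
from collections import defaultdict

# Mapper output from the example above, as (key, value) pairs.
mapper_output = [
    ("coder", 1),
    ("fox", 1),
    ("in", 1),
    ("boots", 1),
    ("coder", 1),
    ("hadoop", 1),
]


def group_by_key(pairs):
    """Collect all values that share a key (hypothetical helper, not a Hadoop API)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


grouped = group_by_key(mapper_output)

# Each (key, values) entry stands for one call to reduce().
for key, values in grouped.items():
    print(key, values)

# Number of unique keys, i.e. number of reduce() calls.
print(len(grouped))
```

Running it prints the five grouped entries, with coder carrying the list [1, 1], followed by the count 5.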
