java - Can I output key/value pairs such that one key is specific to one node in Hadoop MapReduce?


Suppose I have code for matrix multiplication. I want the mapper output of each node to be taken by one reducer, i.e. the key is the same for one complete mapper output file but different from the key of any other node. E.g. if one mapper file outputs key/value pairs with key 1, another mapper outputs key/value pairs with key 2, and so on. I understand that maybe this cannot be done for this example, but what if I output the key as the particular datanode ID or something? As in, the key in the output is the particular ID given to a datanode. Is there a way to do this?

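Basically, I want the output of one mapper to go to one reducer somehow, which can be achieved by giving all of it one key. Since I also want the parallelism of the mapper output being distributed through the cluster, I want the key for each mapper to be unique. How do I assign one key to the data if the input is not organized like that?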

(Please point out if any additional information is needed. Thank you for the help.)

If you want to ensure that the output of one mapper ends up in the same reduce instance, you can use the map task ID as the output key:

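    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MyMapper extends Mapper<LongWritable, Text, IntWritable, Text> {
        // ID of this map task, used as the key for every record it emits.
        private IntWritable mapId;

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            mapId = new IntWritable(context.getTaskAttemptID().getTaskID().getId());
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // ...

            context.write(mapId, value);
        }
    }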
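Note that a shared key only guarantees that one mapper's records land in the same reducer; it does not by itself give each mapper a reducer of its own. As a rough sketch, assuming the job uses Hadoop's default HashPartitioner (which computes `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`) and that the hash code of an IntWritable is its wrapped int value, the following standalone Java reproduces that arithmetic without any Hadoop dependency. The class and method names (`PartitionSketch`, `partitionFor`, `assign`) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class PartitionSketch {
    // Mirrors the default HashPartitioner formula for an int key
    // (assumption: the key's hash code equals the map task ID).
    static int partitionFor(int mapId, int numReduceTasks) {
        return (mapId & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Groups map task IDs 0..numMapTasks-1 by the reducer they would reach.
    static Map<Integer, List<Integer>> assign(int numMapTasks, int numReduceTasks) {
        Map<Integer, List<Integer>> byReducer = new TreeMap<>();
        for (int mapId = 0; mapId < numMapTasks; mapId++) {
            int reducer = partitionFor(mapId, numReduceTasks);
            byReducer.computeIfAbsent(reducer, r -> new ArrayList<>()).add(mapId);
        }
        return byReducer;
    }

    public static void main(String[] args) {
        // 6 map tasks, 4 reducers: reducers 0 and 1 each receive two mappers.
        System.out.println(assign(6, 4)); // {0=[0, 4], 1=[1, 5], 2=[2], 3=[3]}
        // 4 map tasks, 4 reducers: one mapper per reducer.
        System.out.println(assign(4, 4)); // {0=[0], 1=[1], 2=[2], 3=[3]}
    }
}
```

Under those assumptions, configuring at least as many reduce tasks as there are map tasks would give each mapper's output a separate reducer; with fewer reducers, several mappers share one, although each mapper's records still stay together.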
