You have a large dataset of key-value pairs, where the keys are strings and the values are integers. For each unique key, you want to identify the largest integer. When writing a MapReduce program to accomplish this, can you take advantage of a combiner?
Answer
No, a combiner would not be useful in this case.
Yes.
Yes, but the number of unique keys must be known in advance.
Yes, as long as all the keys fit into memory on each node.
Yes, as long as all the integer values that share the same key fit into memory on each node.
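A combiner is safe whenever the reduce operation is commutative and associative, and taking a maximum is both. The plain-Java sketch below illustrates that property without using the Hadoop API: it simulates two input splits and checks that applying the same max step locally per split, then again on the merged partial results, gives the same answer as reducing everything at once. The class name MaxCombinerSketch and the methods maxPerKey, withCombiner, and withoutCombiner are hypothetical names chosen for this illustration, and the sample data is made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MaxCombinerSketch {

    // Logic shared by the simulated combiner and reducer:
    // keep only the largest value seen for each key.
    static Map<String, Integer> maxPerKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> best = new HashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            best.merge(pair.getKey(), pair.getValue(), Math::max);
        }
        return best;
    }

    // Simulates a job with a combiner: each split is reduced locally,
    // and only the per-split maxima are passed on to the final reduce.
    static Map<String, Integer> withCombiner(List<List<Map.Entry<String, Integer>>> splits) {
        List<Map.Entry<String, Integer>> shuffled = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> split : splits) {
            shuffled.addAll(maxPerKey(split).entrySet());
        }
        return maxPerKey(shuffled);
    }

    // Simulates a job without a combiner: every pair reaches the reducer.
    static Map<String, Integer> withoutCombiner(List<List<Map.Entry<String, Integer>>> splits) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> split : splits) {
            all.addAll(split);
        }
        return maxPerKey(all);
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for two input splits.
        List<Map.Entry<String, Integer>> split1 =
                List.of(Map.entry("a", 3), Map.entry("b", 7), Map.entry("a", 9));
        List<Map.Entry<String, Integer>> split2 =
                List.of(Map.entry("a", 4), Map.entry("c", 1));

        List<List<Map.Entry<String, Integer>>> splits = List.of(split1, split2);
        System.out.println(withCombiner(splits).equals(withoutCombiner(splits)));
    }
}
```

Because the running maximum needs only one stored integer per key being processed, the sketch never has to hold all values for a key at once; in a real Hadoop job the same reducer class could typically be registered as the combiner as well.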
Question 2
Question
Given the following code from a MapReduce application:
Job job = new Job(getConf(), "MyMapReduceJob");
Path in = new Path("source1", "source2");
Path out = new Path("dest1");
FileInputFormat.setInputPaths(job, in);
FileOutputFormat.setOutputPath(job, out);
Which one of the following statements is true?
Answer
The input for this job will be the contents of the source1/source2 folder.
The input for this job will be the contents of both the source1 and source2 files.
The input of this job cannot be determined by the code above.
The output of this job will be the source1/source2 folder.