Are keys in Java MapReduce passed by value or by reference?


I wrote a MapReduce job that counts how often each key occurs and then sorts the keys by that count.

Given input such as

1A99
1A34
1A99
1A99
1A34
1A12

the final output should look like

1A99 3
1A34 2
1A12 1

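My map phase emits <key, 1> pairs of type <Text, IntWritable>.

My reducer has three stages: a setup stage, which initializes an ArrayList to hold my <Text, IntWritable> pairs; the reduce stage, where I sum the IntWritable values to get the count and add the result to the list; and a cleanup stage, where I sort the ArrayList by count.

The elements of the ArrayList are instances of a class I wrote, myObject, which holds a Text and an IntWritable as a tuple. I noticed something odd when I created the elements like this: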

new myObject(key, count)

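Here key is the key passed into the reducer, and count is the IntWritable I build by summing the values (an Iterable<IntWritable>).

In the end, every element in the list has the same key; only the counts differ.

But if I do this instead: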

new myObject(new Text(key), count)

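Copying the key this way works.

I can't find any information on whether the key passed from the mapper to the reducer is passed by reference, but that seems to be the only explanation for this behavior.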


1 answer

# Answer 1
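Without seeing the actual code, it is hard to pin down the exact problem. However, it looks like you do not need stages 1 and 3 of your reduce phase. The reducer receives a key (Text) and a list of values (Iterable<IntWritable>). This is the result of the intermediate shuffle phase that runs after the map phase. In the reduce step, do whatever you need with the Iterable<IntWritable> (in your case, sum the values). At that point, processing for that key is complete. Emit the result from the reducer with context.write(key, result_of_operation).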
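On the by-reference question itself: Java always passes object references by value, and the Hadoop Reducer documentation notes that the framework reuses the key and value objects it hands to reduce. That would explain the symptom: every stored element points at the same Text instance, which ends up holding the last key, while each count is a freshly created object. The sketch below only simulates that behavior in plain Java. `MutableKey` and `KeyReuseDemo` are hypothetical stand-ins, not Hadoop classes.

```java
import java.util.ArrayList;
import java.util.List;

final class KeyReuseDemo {

    // Hypothetical stand-in for a mutable Writable such as Text (an assumption, not Hadoop's class).
    static final class MutableKey {
        private String value;

        MutableKey(String value) { this.value = value; }

        // Copy constructor, playing the role of new Text(key).
        MutableKey(MutableKey other) { this.value = other.value; }

        void set(String value) { this.value = value; }

        @Override
        public String toString() { return value; }
    }

    // Stores the reference that is handed in, as new myObject(key, count) does.
    static List<String> keepReferences(String[] records) {
        MutableKey reused = new MutableKey("");
        List<MutableKey> kept = new ArrayList<>();
        for (String record : records) {
            reused.set(record);   // the simulated framework overwrites one shared object
            kept.add(reused);     // every list slot points at that same object
        }
        return render(kept);
    }

    // Stores a copy of the key, as new myObject(new Text(key), count) does.
    static List<String> keepCopies(String[] records) {
        MutableKey reused = new MutableKey("");
        List<MutableKey> kept = new ArrayList<>();
        for (String record : records) {
            reused.set(record);
            kept.add(new MutableKey(reused));   // snapshot of the current contents
        }
        return render(kept);
    }

    private static List<String> render(List<MutableKey> keys) {
        List<String> out = new ArrayList<>();
        for (MutableKey key : keys) {
            out.add(key.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] records = {"1A12", "1A34", "1A99"};
        System.out.println(keepReferences(records)); // [1A99, 1A99, 1A99]
        System.out.println(keepCopies(records));     // [1A12, 1A34, 1A99]
    }
}
```

If this matches what is happening in the job, copying the key before storing it, as the question already does with new Text(key), is the usual fix.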

Here is how the dataset is processed:
```
Raw data
1A99
1A34
1A99
1A99
1A34
1A12

Some number of mappers will process this, say 3:
input to mapper 1
1A99
1A34
1A99

input to mapper 2
1A99

input to mapper 3
1A34
1A12

output of mapper 1
1A99, 1
1A34, 1
1A99, 1

output of mapper 2
1A99, 1

output of mapper 3
1A34, 1
1A12, 1

intermediate shuffle phase collects all values for keys
1A99, (1,1,1)
1A34, (1, 1)
1A12, (1)

Now, say we force one reducer (though there may be more than one)

reducer 1 input and output
1A99, (1,1,1) -> 1A99, 3
1A34, (1, 1) -> 1A34, 2
1A12, (1) -> 1A12, 1
```
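The trace above can be mimicked in plain Java to check the expected counts. This is only a single-process sketch, not a Hadoop job: `CountAndSortDemo` and `countAndSort` are hypothetical names, the grouping map stands in for the shuffle, and the final sort by count stands in for the ordering step the question asks about, assuming a single reducer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CountAndSortDemo {

    static List<String> countAndSort(List<String> records) {
        // Map + shuffle: emit (key, 1) for each record and group the ones by key.
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (String key : records) {
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(1);
        }

        // Reduce: sum the values collected for each key.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int sum = 0;
            for (int one : entry.getValue()) {
                sum += one;
            }
            counts.put(entry.getKey(), sum);
        }

        // Order the (key, count) pairs by count, highest first.
        List<Map.Entry<String, Integer>> ordered = new ArrayList<>(counts.entrySet());
        ordered.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : ordered) {
            lines.add(entry.getKey() + " " + entry.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> raw = List.of("1A99", "1A34", "1A99", "1A99", "1A34", "1A12");
        for (String line : countAndSort(raw)) {
            System.out.println(line);
        }
    }
}
```

For the sample input this prints 1A99 3, 1A34 2, 1A12 1, one pair per line, which matches the output the question is after.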
References that may help:

https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html#Example:_WordCount_v1.0
