Java: summing integers with Hadoop MapReduce


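I am having trouble computing the sum of the integers between two values using Hadoop MapReduce.

For example, I want to compute the sum of the integers in [1, 15000]. But as far as I know, MapReduce works on data that has something in common (a tag).

I managed to understand the pattern with this data: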

doctor  23
doodle  34
doctor  2
doodle  5

These are the occurrence counts of each word in a given text.

MapReduce links together the values for a given word, like this:

doctor [(23 2)]
doodle [(34 5)]

and then computes the sum of those values.
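The grouping-then-summing pattern described above can be sketched in plain Java, without Hadoop. This is only an illustration of the idea: the class name `GroupAndSumSketch` and the method `sumByKey` are made up for this example, and it assumes Java 9 or later for `List.of` and `Map.entry`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupAndSumSketch {
    // Simulates the shuffle and reduce steps: values that share a key are
    // brought together and added up.
    static Map<String, Integer> sumByKey(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            // merge() stores the value for a new key, or adds it to the running total.
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // The (word, count) pairs from the sample data above.
        List<Map.Entry<String, Integer>> pairs = List.of(
                Map.entry("doctor", 23),
                Map.entry("doodle", 34),
                Map.entry("doctor", 2),
                Map.entry("doodle", 5));
        System.out.println(sumByKey(pairs)); // {doctor=25, doodle=39}
    }
}
```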

But for a plain sum, there is never anything in common like the strings in the example above. Given this data set:

DS1: 1 2 3 4 5 ..... 15000

Is it possible to compute the sum of all the integers in the list using the MapReduce architecture?

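import java.io.IOException;
import java.util.Arrays;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Mapper;

public class SumMapper extends Mapper<LongWritable, Text, NullWritable, IntWritable> {
    // One partial sum per input line, under a single null key.
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        int sum = Arrays.stream(value.toString().trim().split("\\s+"))
                .mapToInt(Integer::parseInt)
                .sum();
        context.write(NullWritable.get(), new IntWritable(sum));
    }
}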

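import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class LocalMapReduceRunner {
    public static void main(String[] args) throws Exception {
        // Remove any previous local output directory and wait until it is gone.
        Runtime.getRuntime().exec(new String[] {"rm", "-rf", args[1]}).waitFor();
        Job job = Job.getInstance(new Configuration());
        job.setJobName("MR_runner");
        job.setJarByClass(LocalMapReduceRunner.class);
        job.setMapperClass(SumMapper.class);
        job.setMapOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(IntWritable.class);
        // args[0] is the input path, args[1] the output path.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}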
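Note that the mapper above writes one partial sum per input line, and the driver does not register a reducer, so an input spread over several lines would produce several partial sums rather than one total. Because every partial sum is written under the same key, a single reduce call could add them up. The plain-Java sketch below simulates that idea without Hadoop; the class `SingleKeySumSketch`, its methods, and the three-line split of the input are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SingleKeySumSketch {
    // "Map" step: turn one line of whitespace-separated integers into a partial sum.
    static long mapLine(String line) {
        return Arrays.stream(line.trim().split("\\s+"))
                .mapToLong(Long::parseLong)
                .sum();
    }

    // "Reduce" step: all partial sums share one key, so one call adds them together.
    static long reduce(List<Long> partialSums) {
        return partialSums.stream().mapToLong(Long::longValue).sum();
    }

    // Helper that builds a line such as "1 2 3 ... 5000" (hypothetical input layout).
    static String line(int from, int to) {
        return IntStream.rangeClosed(from, to)
                .mapToObj(Integer::toString)
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        // The data set 1..15000, assumed here to be split across three lines.
        List<String> lines = List.of(line(1, 5000), line(5001, 10000), line(10001, 15000));
        List<Long> partialSums = lines.stream()
                .map(SingleKeySumSketch::mapLine)
                .collect(Collectors.toList());
        System.out.println(reduce(partialSums)); // 112507500
    }
}
```

In an actual Hadoop job, the reduce step would presumably be a `Reducer` class registered with `job.setReducerClass(...)`, and the same class could also serve as a combiner via `job.setCombinerClass(...)` since addition is associative. The sketch uses `long` to leave headroom for larger ranges; the mapper above uses `int`, which is enough for 1 to 15000.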


# Answer 1