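<p>To write a sequence file, the keys and values must be given as Hadoop API (Writable) types.</p>
<p>A string is written as <code>Text</code><br/>
An int is written as <code>IntWritable</code></p>
<p>In Python:</p>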
<pre><code># `sc` is an existing SparkContext; `path` is the output location.
data = [(1, ""), (1, "a"), (2, "bcdf")]
out_format = "org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat"
sc.parallelize(data).saveAsNewAPIHadoopFile(path, out_format, "org.apache.hadoop.io.IntWritable", "org.apache.hadoop.io.Text")
</code></pre>
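<p>If the key and value types vary between jobs, the class names passed to <code>saveAsNewAPIHadoopFile</code> can be derived from the data instead of hard-coded. The following is only a minimal sketch: <code>writable_class_for</code> and <code>key_value_classes</code> are hypothetical helpers, not part of PySpark, and the extra pairings beyond string and int (bool, float, large ints) are assumptions based on the standard <code>org.apache.hadoop.io</code> Writable classes.</p>

```python
# Hypothetical helpers (not part of PySpark): pick a Hadoop Writable class
# name for a Python value, e.g. str -> Text and int -> IntWritable.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def writable_class_for(value):
    """Return the fully qualified Writable class name assumed for `value`."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "org.apache.hadoop.io.BooleanWritable"
    if isinstance(value, int):
        # Assumption: ints outside the 32-bit range need LongWritable.
        if INT32_MIN <= value <= INT32_MAX:
            return "org.apache.hadoop.io.IntWritable"
        return "org.apache.hadoop.io.LongWritable"
    if isinstance(value, float):
        return "org.apache.hadoop.io.DoubleWritable"
    if isinstance(value, str):
        return "org.apache.hadoop.io.Text"
    raise TypeError("no Writable mapping for %r" % type(value))


def key_value_classes(records):
    """Return (key_class, value_class) for a list of (key, value) pairs.

    Raises ValueError if the list is empty or if the keys (or the values)
    do not all map to the same Writable class.
    """
    key_classes = {writable_class_for(k) for k, _ in records}
    value_classes = {writable_class_for(v) for _, v in records}
    if len(key_classes) != 1 or len(value_classes) != 1:
        raise ValueError("keys and values must each map to one Writable class")
    return key_classes.pop(), value_classes.pop()


data = [(1, ""), (1, "a"), (2, "bcdf")]
key_class, value_class = key_value_classes(data)
print(key_class, value_class)
```

<p>The two returned names can then be passed as the key class and value class arguments of <code>saveAsNewAPIHadoopFile</code>. Checking that all records agree on one class up front avoids a failure partway through the write.</p>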