I want to read some logs, but I can't. So far I have tried:
hadoop fs -text <file>
but the only output I get is: INFO compress.CodecPool: Got brand-new decompressor [.lz4]
(same for .snappy)
val rawRdd = spark.sparkContext.sequenceFile[BytesWritable, String](<file>)
but it returns: <file> is not a SequenceFile
val rawRdd = spark.read.textFile(<file>)
In this case I get: java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
Downloading the file to the local filesystem and then trying to decompress it with lz4 -d <file> to view the contents.
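The mixed failures above suggest the file's actual container format may not match what each tool expects (for example, Hadoop's Lz4Codec output is not the LZ4 frame format that the `lz4` CLI reads). A hedged first diagnostic step, sketched in Python under the assumption that a local copy of the file is available, is to look at the file's magic bytes; the magic values below come from the respective format specifications:

```python
# Sketch: identify a file's real container format from its leading magic bytes.
MAGICS = {
    b"SEQ": "Hadoop SequenceFile",
    b"\x04\x22\x4d\x18": "LZ4 frame (what the `lz4` CLI expects)",
    b"\xff\x06\x00\x00sNaPpY": "Snappy framing format",
    b"Obj\x01": "Avro container",
    b"PAR1": "Parquet",
}

def identify(path: str) -> str:
    with open(path, "rb") as f:  # binary mode: raw bytes, no text decoding
        head = f.read(16)
    for magic, name in MAGICS.items():
        if head.startswith(magic):
            return name
    # Hadoop's codec streams (.snappy/.lz4 written by SnappyCodec/Lz4Codec)
    # carry no magic number at all -- the format is implied by the extension.
    return "unknown (possibly a raw Hadoop codec stream)"
```

If `identify` reports "unknown", the file may be a raw Hadoop block-compressed stream, which neither the `lz4` CLI nor plain (non-Hadoop) snappy tools can read directly.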
I also followed this SO post:
import snappy

with open(snappy_file, "rb") as input_file:  # binary mode; "r" decodes bytes as text
    data = input_file.read()

decompressor = snappy.hadoop_snappy.StreamDecompressor()
uncompressed = decompressor.decompress(data)
but when I print(uncompressed), all I get is b''.
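One possible cause of the empty b'' is how the file is read: in Python 3, open(..., "r") decodes the bytes as text, which can corrupt or truncate compressed data before the decompressor ever sees it. A minimal sketch of the binary-mode round trip, with zlib standing in for snappy purely so the sketch runs without python-snappy installed:

```python
import tempfile
import zlib

# zlib stands in for snappy here; the point is the "rb" read mode, not the codec.
payload = zlib.compress(b"hello logs\n")
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

with open(path, "rb") as input_file:  # "rb", not "r": raw bytes, no text decoding
    data = input_file.read()

print(zlib.decompress(data))  # b'hello logs\n'
```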