pyspark + es-hadoop serialization failure: org.apache.hadoop.io.ShortWritable
I am reading Elasticsearch data with pyspark and es-hadoop.
ES 7.4.0
Spark 2.3.1
Here is the code. First, create an index whose price field is mapped as short, and index one document:
PUT test
{
"mappings": {
"properties": {
"price": {
"type": "short"
}
}
}
}
PUT test/_doc/1
{
"price": 1
}
pyspark --driver-class-path ~/jars/elasticsearch-hadoop-7.4.0.jar --jars ~/jars/elasticsearch-hadoop-7.4.0.jar
conf = {
    "es.resource": "test",
    "es.nodes.wan.only": "true",
    "es.nodes": "http://localhost:9200",
    "es.port": "9200",
    "es.net.http.auth.user": "",
    "es.net.http.auth.pass": "",
}
rdd = sc.newAPIHadoopRDD(
    inputFormatClass="org.elasticsearch.hadoop.mr.EsInputFormat",
    keyClass="org.apache.hadoop.io.NullWritable",
    valueClass="org.elasticsearch.hadoop.mr.LinkedMapWritable",
    conf=conf)
ERROR:
Task 0.0 in stage 1.0 (TID 1) had a not serializable result: org.apache.hadoop.io.ShortWritable
Serialization stack:
- object not serializable (class: org.apache.hadoop.io.ShortWritable, value: 1)
- writeObject data (class: java.util.HashMap)
- object (class java.util.HashMap, {price=1})
- field (class: scala.Tuple2, name: _2, type: class java.lang.Object)
- object (class scala.Tuple2, (1,{price=1}))
- element of array (index: 0)
- array (class [Lscala.Tuple2;, size 1); not retrying
Traceback (most recent call last):
When I change the mapping type from short to long, I get the ES data back correctly. Why can't the short type be serialized?
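As far as I can tell, PySpark's newAPIHadoopRDD converts the common Hadoop Writable types (IntWritable, LongWritable, Text, and so on) into plain Java objects before shipping results back to the driver, but ShortWritable is not among the converted types, so it falls back to Java serialization, and Writable classes do not implement java.io.Serializable. Assuming slightly wider integers are acceptable for this field, one workaround is to map it as integer (or long, as already tried) when creating the index:

```json
PUT test
{
  "mappings": {
    "properties": {
      "price": {
        "type": "integer"
      }
    }
  }
}
```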
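For comparison, a sketch of reading the same index through es-hadoop's Spark SQL data source instead of the MapReduce InputFormat; this path maps ES field types directly to Spark SQL types, so no Hadoop Writable objects cross the serialization boundary. It assumes the pyspark shell's built-in `spark` session and the same local ES node as above:

```python
# Sketch: read the "test" index as a DataFrame via the es-hadoop
# Spark SQL data source (same jar as on the pyspark command line).
# ES numeric types are converted to Spark SQL types directly, so the
# short-mapped field never appears as an org.apache.hadoop.io.ShortWritable.
df = (spark.read
      .format("org.elasticsearch.spark.sql")
      .option("es.nodes", "localhost")
      .option("es.port", "9200")
      .option("es.nodes.wan.only", "true")
      .load("test"))
df.show()
```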