Unable to convert UTM to lat/long using a PySpark DataFrame

Published 2024-10-01 00:31:57


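I'm new to PySpark and have run into the following problem:

What I'm trying to do: I need to convert coordinates in UTM zone 10 to latitude and longitude. I'm trying to do this on a DataFrame, and the code below is what I have so far. It is adapted from another post, Converting latitude and longitude to UTM coordinates in pyspark.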

import utm
from pyspark.sql import SparkSession, functions
from pyspark.sql.types import DoubleType

spark = SparkSession.builder.appName('reddit average df').getOrCreate()

# One sample point: easting (X) and northing (Y) in UTM zone 10
a = spark.createDataFrame([{"X": 488769.792012, "Y": 5457280.44999}])
a.show()

+-------------+-------------+
|            X|            Y|
+-------------+-------------+
|488769.792012|5457280.44999|
+-------------+-------------+

# UDF that returns the latitude (element 0 of the tuple from utm.to_latlon)
utm_udf_x = functions.udf(lambda x, y: utm.to_latlon(x, y, 10, 'U')[0], DoubleType())
c = a.withColumn('Latitude', utm_udf_x(functions.col('X'), functions.col('Y')))
c.show()

However, when I do this I get the following error (the first few lines are pasted here):

19/11/13 11:15:33 ERROR Executor: Exception in task 2.0 in stage 3.0 (TID 7)
net.razorvine.pickle.PickleException: expected zero arguments for construction of ClassDict (for numpy.dtype)
    at net.razorvine.pickle.objects.ClassDictConstructor.construct(ClassDictConstructor.java:23)
    at net.razorvine.pickle.Unpickler.load_reduce(Unpickler.java:707)
    at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:175)
    at net.razorvine.pickle.Unpickler.load(Unpickler.java:99)
    at net.razorvine.pickle.Unpickler.loads(Unpickler.java:112)
    at org.apache.spark.sql.execution.python.BatchEvalPythonExec$$anonfun$evaluate$1.apply(BatchEvalPythonExec.scala:90)
    at org.apache.spark.sql.execution.python.BatchEvalPythonExec$$anonfun$evaluate$1.apply(BatchEvalPythonExec.scala:89)

I tried changing the data types, assuming that might be the problem, but as far as I can tell the types are the same. I would appreciate any help.


1 Answer
User
#1 · Posted on 2024-10-01 00:31:57

utm.to_latlon returns a NumPy object that cannot be implicitly converted to PySpark's DoubleType:

type(utm.to_latlon(488769.792012 , 5457280.44999, 10, 'U')[0])
#Output
#numpy.float64
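
A check like the one above can be wrapped in a small helper and run locally before the function is registered as a UDF. This is only a sketch: `is_plain_float` and `check_udf_result` are hypothetical names, not part of utm or PySpark. It compares with `type(...) is float` because numpy.float64 is a subclass of Python's float, so an `isinstance` check would not tell the two apart.

```python
def is_plain_float(value):
    """True only for the built-in float, not for subclasses such as numpy.float64."""
    return type(value) is float


def check_udf_result(func, *sample_args):
    """Call func locally on sample arguments and describe what it returns.

    Returns a tuple: (result, name of the result's type, is it a plain float?).
    """
    result = func(*sample_args)
    return result, type(result).__name__, is_plain_float(result)
```

For example, `check_udf_result(lambda x, y: utm.to_latlon(x, y, 10, 'U')[0], 488769.792012, 5457280.44999)` reports the type that the lambda from the question would hand back to Spark. The exact type may depend on the installed utm version.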

Just call .item() to get a plain Python float, which can be converted to PySpark's DoubleType:

utm_udf_x = functions.udf(lambda x, y: utm.to_latlon(x, y, 10, 'U')[0].item(), DoubleType())
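
If both latitude and longitude are needed, the same idea can be extended to a UDF that returns a struct. The following is a rough sketch under a few assumptions, not the original answer's code: `to_builtin`, `utm_to_latlon_pair` and `add_latlon_columns` are hypothetical names, the zone 10 / band 'U' defaults are taken from the question, and the helper tolerates utm versions that already return built-in floats.

```python
def to_builtin(value):
    """Return a plain Python scalar; NumPy scalars expose .item() for this."""
    item = getattr(value, "item", None)
    return item() if callable(item) else value


def utm_to_latlon_pair(easting, northing, zone=10, band="U"):
    """Convert one UTM coordinate to (latitude, longitude) as built-in floats."""
    import utm  # imported lazily, so it is only required when this is called

    lat, lon = utm.to_latlon(easting, northing, zone, band)
    return float(to_builtin(lat)), float(to_builtin(lon))


def add_latlon_columns(df, x_col="X", y_col="Y"):
    """Add 'Latitude' and 'Longitude' columns to a DataFrame with UTM columns."""
    from pyspark.sql import functions as F
    from pyspark.sql.types import DoubleType, StructField, StructType

    # The UDF returns a tuple, which Spark maps onto this struct schema.
    schema = StructType([
        StructField("Latitude", DoubleType()),
        StructField("Longitude", DoubleType()),
    ])
    latlon_udf = F.udf(utm_to_latlon_pair, schema)

    return (
        df.withColumn("_latlon", latlon_udf(F.col(x_col), F.col(y_col)))
          .withColumn("Latitude", F.col("_latlon.Latitude"))
          .withColumn("Longitude", F.col("_latlon.Longitude"))
          .drop("_latlon")
    )
```

With the DataFrame from the question, this would be used as `add_latlon_columns(a).show()`.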
