Passing Spark DataFrame columns to a geohash function in PySpark fails with "Cannot convert column into bool":

pgh.encode(geoCordsSchema.lat, geoCordsSchema.long, precision = 7)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/pygeohash/geohash.py", line 96, in encode
    if longitude > mid:
  File "/usr/local/spark/python/pyspark/sql/column.py", line 427, in __nonzero__
    raise ValueError("Cannot convert column into bool: please use '&' for 'and', '|' for 'or', "
ValueError: Cannot convert column into bool: please use '&' for 'and', '|' for 'or', '~' for 'not' when building DataFrame boolean expressions.

1 Answer

User

#1 · Posted on 2024-06-28 11:14:11

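You cannot pass a Column directly to an ordinary Python function to transform it. Wrap the function in a UDF (user-defined function) instead: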

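import pygeohash as pgh
from pyspark.sql import functions as F

# Wrap pgh.encode in a UDF so it receives each row's values rather than Column objects.
udf1 = F.udf(lambda x, y: pgh.encode(x, y, precision=7))
geoCordsSchema.select('lat', 'long', udf1('lat', 'long').alias('encodedVal')).show()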
+---+----+----------+
|lat|long|encodedVal|
+---+----+----------+
| 45|  25|   sxczbzu|
| 75|  22|   umrdst7|
| 85|  20|   urn5x1g|
| 89|  26|   uxf6r9u|
+---+----+----------+
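
As for why the original call fails: the traceback shows pygeohash running `if longitude > mid:` on a Column. Comparing a Column builds a new expression rather than returning True or False, and Python then asks that expression for a truth value, which raises. The sketch below reproduces the mechanism without Spark. `FakeColumn`, `pick_half`, and `try_pick_half` are hypothetical names made up for illustration; they are not part of PySpark or pygeohash, and the real Column class is more involved. (Python 3 calls `__bool__`; the Python 2.7 traceback above shows the equivalent `__nonzero__`.)

```python
class FakeColumn:
    """Hypothetical stand-in for pyspark.sql.Column; not the real class."""

    def __init__(self, expr):
        self.expr = expr

    def __gt__(self, other):
        # Comparing builds a new expression instead of returning True/False.
        return FakeColumn(f"({self.expr} > {other})")

    def __bool__(self):
        # Python calls this when the object is used in an `if` test.
        raise ValueError("Cannot convert column into bool")


def pick_half(longitude, mid=0.0):
    # Same shape as the `if longitude > mid:` line in the traceback.
    if longitude > mid:
        return "upper"
    return "lower"


def try_pick_half(value):
    # Report the error as a string so both cases can be printed side by side.
    try:
        return pick_half(value)
    except ValueError as exc:
        return f"error: {exc}"


print(try_pick_half(25.0))                # a plain float works
print(try_pick_half(FakeColumn("long")))  # a column-like object fails
```

A UDF sidesteps this because Spark calls the wrapped function once per row with ordinary Python values, so the `if` inside the encoder sees numbers, not expressions.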
