Union of two Spark DataFrames

Posted 2024-10-02 08:26:28


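I am trying to union two Spark DataFrames in Python, where one of them is sometimes empty. I added a check so that if one DataFrame is empty, the other one is returned as the result. The small snippet below, for example, raises an error: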

>>> from pyspark.sql.types import *
>>> fulldataframe = [StructField("FIELDNAME_1",StringType(), True),StructField("FIELDNAME_2", StringType(), True),StructField("FIELDNAME_3", StringType(), True)]
>>> schema = StructType([])
>>>
>>> dataframeempty = sqlContext.createDataFrame(sc.emptyRDD(), schema)
>>> resultunion = sqlContext.createDataFrame(sc.emptyRDD(), schema)
>>> if (fulldataframe.isEmpty()):
...     resultunion = dataframeempty
... elif (dataframeempty.isEmpty()):
...     resultunion = fulldataframe
... else:
...     resultunion=fulldataframe.union(dataframeempty)
...


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'list' object has no attribute 'isEmpty'
>>>

Can anyone tell me what went wrong?


1 Answer

User

#1 · Posted 2024-10-02 08:26:28

Testing for emptiness with a count can take a long time. In Scala:

dataframe.rdd.isEmpty()
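As for the traceback itself: in the question's snippet, `fulldataframe` is a plain Python list of `StructField` objects, not a DataFrame, which is why Python reports that a `list` has no `isEmpty`. The list was presumably meant to be the schema, while `schema` was left as an empty `StructType([])`. Below is a minimal PySpark sketch of the same idea under that assumption. The helper names `is_empty`, `union_skipping_empty`, and `demo` are made up for illustration, and `demo` assumes a local Spark installation with a `SparkSession` in place of the older `sqlContext`/`sc` pair.

```python
def is_empty(df):
    """Return True if the DataFrame has no rows."""
    # rdd.isEmpty() only checks whether a first row exists,
    # so it avoids counting every row.
    return df.rdd.isEmpty()


def union_skipping_empty(left, right):
    """Union two DataFrames; if one is empty, return the other unchanged."""
    if is_empty(left):
        return right
    if is_empty(right):
        return left
    return left.union(right)


def demo():
    """Usage sketch; requires a working PySpark installation."""
    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructField, StructType, StringType

    spark = (
        SparkSession.builder.master("local[1]").appName("union-demo").getOrCreate()
    )

    # Assumption: the StructField list from the question is the intended schema.
    schema = StructType([
        StructField("FIELDNAME_1", StringType(), True),
        StructField("FIELDNAME_2", StringType(), True),
        StructField("FIELDNAME_3", StringType(), True),
    ])

    full_df = spark.createDataFrame([("a", "b", "c")], schema)
    empty_df = spark.createDataFrame(spark.sparkContext.emptyRDD(), schema)

    result = union_skipping_empty(full_df, empty_df)
    result.show()
    spark.stop()
```

Calling `demo()` should show the single row of `full_df`, since the empty side is skipped. Note that `union` matches columns by position, so both DataFrames need the same column layout for the non-empty case to work.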
