Error on collect() after a join in Spark

Posted 2024-06-01 20:47:46


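I'm using Spark 1.4.1. I have two DataFrames that I want to join on the userid field. The join itself appears to work, since I can print the schema and count the rows (216):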

usersearch_jnd = rr.join(uu, rr.searcher_id == uu.userid, 'inner')
print(usersearch_jnd)
print(usersearch_jnd.count())
usersearch_jnd.printSchema()

DataFrame[min_age: int, max_age: int, inter_in: int, zoom: int, searcher_ethnicity: array<bigint>, searcher_id: int, searcher_sex: int, offset: int, searchee_id: int, searchee_loc: struct<latitude:double,longitude:double>, userid: int, search_users_interested_in: int, search_level: int, search_max_age: int, search_min_age: int, search_ethnicity_multi: string, search_looking_for_sex: int, interested_in: int, search_distance: float, ethnicity: int]
216
root
 |-- min_age: integer (nullable = true)
 |-- max_age: integer (nullable = true)
 |-- inter_in: integer (nullable = true)
 |-- zoom: integer (nullable = true)
 |-- searcher_ethnicity: array (nullable = true)
 |    |-- element: long (containsNull = true)
 |-- searcher_id: integer (nullable = true)
 |-- searcher_sex: integer (nullable = true)
 |-- offset: integer (nullable = true)
 |-- searchee_id: integer (nullable = true)
 |-- searchee_loc: struct (nullable = true)
 |    |-- latitude: double (nullable = true)
 |    |-- longitude: double (nullable = true)
 |-- userid: integer (nullable = true)
 |-- search_users_interested_in: integer (nullable = true)
 |-- search_level: integer (nullable = true)
 |-- search_max_age: integer (nullable = true)
 |-- search_min_age: integer (nullable = true)
 |-- search_ethnicity_multi: string (nullable = true)
 |-- search_looking_for_sex: integer (nullable = true)
 |-- interested_in: integer (nullable = true)
 |-- search_distance: float (nullable = true)
 |-- ethnicity: integer (nullable = true)

However, when I do something as simple as collect() or head(), I get an error:

[error traceback not preserved in the source]

Any ideas on what I could check?
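
One thing that might help narrow this down, offered only as a sketch: count() succeeds but collect() and head() fail, so the problem may lie in bringing particular column values back to the driver. The helper below tries to collect a few rows from each column separately and reports which ones raise. find_failing_columns and sample_rows are hypothetical names, not part of Spark; the only DataFrame members it relies on are columns, select(), limit() and collect(). It also assumes the column names are unique, which matches the schema printed above.

```python
def find_failing_columns(df, sample_rows=5):
    """Try to collect a few rows of each column on its own.

    Hypothetical debugging helper, not a Spark API. Returns a dict that maps
    column name -> error message for every column whose collect() raised.
    """
    failures = {}
    for name in df.columns:
        try:
            # Select one column at a time so a failure can be attributed to it.
            df.select(name).limit(sample_rows).collect()
        except Exception as exc:  # deliberately broad: the error type is unknown here
            failures[name] = str(exc)
    return failures


# Usage against the joined DataFrame from the question (needs a live Spark session):
# print(find_failing_columns(usersearch_jnd))
```

An empty result would suggest that no single column is at fault, in which case the same check could be run on rr and uu separately before the join.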

