pyspark AttributeError: 'NoneType' object has no attribute 'groupby'

Posted 2024-10-06 12:32:09


I'm trying to group by multiple columns, sort the groups by count, and then get the top record of each group:

df.groupby("_c21","y2_co","y2_r","y2_z","y2_org").count()\
    .show(n=10)

I also tried grouping by a single column that has no null values:

df.groupby("_c21").count()\
    .show(n=10)

AttributeError: 'NoneType' object has no attribute 'groupby'

Sample rows

+--------------------+--------------------+--------------------+-----+----+-----+--------------------+
|                _c17|                _c21|                   m|y2_co|y2_r| y2_z|              y2_org|
+--------------------+--------------------+--------------------+-----+----+-----+--------------------+
|proc=;app=;cl=442...|tHO$SZPbABVo3A1X8...|[proc -> , app ->...|   BR|  PB|58397|Voax Provedor de ...|
|proc=;app=;cl=444...|tHO$SZPbABVo3A1X8...|[proc -> , app ->...|   BR|  PB|58397|Voax Provedor de ...|
|proc=;app=;cl=145...|Zu6zZxiekXnHfpNER...|[proc -> , app ->...|   MX| NLE|66490|           Totalplay|
|proc=;app=;cl=145...|Zu6zZxiekXnHfpNER...|[proc -> , app ->...|   MX| NLE|66490|           Totalplay|
|proc=;app=;cl=147...|Zu6zZxiekXnHfpNER...|[proc -> , app ->...|   MX| NLE|66490|           Totalplay|
+--------------------+--------------------+--------------------+-----+----+-----+--------------------+

1 Answer
User
#1 · Posted 2024-10-06 12:32:09

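I had a .show(n=5) at the end of the previous statement. Since .show() only prints the rows and returns None, df ended up being None. After I commented out the .show(n=5), it worked: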

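from pyspark.sql import functions as F

# keys: the map keys to pull out as columns, defined earlier (not shown)
df = df.withColumn('m', F.expr("str_to_map(_c17,';','=')")) \
       .select("*", *[F.col('m')[k].alias(k) for k in keys])
# .show(n=5)  # returns None; chaining it onto the assignment made df None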
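
To make the mechanism easier to see without a Spark session, here is a minimal sketch in plain Python. TinyFrame and select_first are hypothetical stand-ins invented for this illustration and are not part of PySpark; the only thing carried over from PySpark is that show() prints rows and returns None, as DataFrame.show does.

```python
class TinyFrame:
    """Hypothetical stand-in for a DataFrame; not part of PySpark."""

    def __init__(self, rows):
        self.rows = list(rows)

    def select_first(self, n):
        # Transformation-style method: returns a new frame, so it can be chained.
        return TinyFrame(self.rows[:n])

    def show(self, n=5):
        # Action-style method: prints rows and implicitly returns None,
        # mirroring the return value of DataFrame.show in PySpark.
        for row in self.rows[:n]:
            print(row)


rows = [("BR", "PB"), ("MX", "NLE"), ("MX", "NLE")]

# Buggy pattern: the value assigned is whatever show() returns, i.e. None.
broken = TinyFrame(rows).select_first(2).show(n=5)

# Fixed pattern: assign the frame first, then display it in a separate statement.
fixed = TinyFrame(rows).select_first(2)
fixed.show(n=5)
```

With the buggy pattern, any later call such as broken.groupby(...) would raise the same kind of AttributeError as in the question, because broken is None. Keeping .show() on its own line avoids this.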
