<p>You can do a <code>groupby</code> and use Spark's <code>collect_list</code> function:</p>
<pre><code>import pyspark.sql.functions as F

df = spark.createDataFrame([(1, 1), (2, 0), (3, 1), (4, 1)], ['som', 'ano'])

# Group by 'ano', collect the matching 'som' values into a list,
# then pull out that list for the group where ano = 1
pyLst = (df.groupby('ano')
           .agg(F.collect_list('som').alias('pyLst'))
           .where('ano = 1')
           .collect()[0]['pyLst'])
</code></pre>