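<p>A PySpark solution: apply <code>row_number</code> over a window that is partitioned by <code>name</code> and ordered by frequency in descending order, then keep only the rows whose row number is 1.</p>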
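<pre><code>from pyspark.sql import functions as F, Window

# Number the rows within each name, highest frequency first
w = Window.partitionBy('name').orderBy(F.desc('frequency'))

# Keep only the top-ranked row per name, then drop the helper column
df2 = df.withColumn('rn', F.row_number().over(w)).filter('rn = 1').drop('rn')
df2.show()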
+-----+------------+------+---------+
| name|    Location|Rating|Frequency|
+-----+------------+------+---------+
|Ahmad|       Kebab|1 star|       10|
|  Abu|    Mcdonald|3 star|        3|
|  Lee|       Fries|1 star|        3|
|  Ali|Baskin Robin|4 star|        3|
+-----+------------+------+---------+
</code></pre>
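<p>To make the window logic concrete, here is a minimal plain-Python sketch of what "row number 1 per partition" selects. It does not use Spark. The sample rows and the helper name <code>top_row_per_name</code> are made up for illustration and are not the asker's data.</p>

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (name, Location, Rating, Frequency)
rows = [
    ("Ali", "Baskin Robin", "4 star", 3),
    ("Ali", "Kebab", "2 star", 1),
    ("Lee", "Fries", "1 star", 3),
    ("Lee", "Mcdonald", "5 star", 2),
    ("Abu", "Mcdonald", "3 star", 3),
]

def top_row_per_name(rows):
    """Mimic row_number() over (partition by name, order by Frequency desc) = 1."""
    # Sort by the partition key so groupby sees each name as one contiguous run.
    by_name = sorted(rows, key=itemgetter(0))
    result = []
    for _name, group in groupby(by_name, key=itemgetter(0)):
        # Order the partition by Frequency, descending; the first row has rn = 1.
        ranked = sorted(group, key=itemgetter(3), reverse=True)
        result.append(ranked[0])
    return result

top = top_row_per_name(rows)
for row in top:
    print(row)
```

<p>One difference worth noting: Python's sort is stable, so ties here fall back to input order. In Spark, rows with equal frequency within a partition are ranked in no guaranteed order, so add a second column to <code>orderBy</code> if you need a deterministic pick.</p>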