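<p>In addition to the answers already given here, the following approaches are also convenient if you know the name of the aggregated column, and they do not require importing anything from <code>pyspark.sql.functions</code>:</p>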
<p><strong>1</strong></p>
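<pre><code># groupBy().max('diff') names its output column "max(diff)"; backticks make
# selectExpr read that as a column name rather than a call to max() on diff.
grouped_df = joined_df.groupBy(temp1.datestamp) \
                      .max('diff') \
                      .selectExpr('`max(diff)` AS maxDiff')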
</code></pre>
<p>For details on <code>.selectExpr()</code>, see the <a href="https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.DataFrame.selectExpr" rel="nofollow noreferrer">docs</a>.</p>
<p><strong>2</strong></p>
<pre><code>grouped_df = joined_df.groupBy(temp1.datestamp) \
                      .max('diff') \
                      .withColumnRenamed('max(diff)', 'maxDiff')
</code></pre>
<p>For details on <code>.withColumnRenamed()</code>, see the <a href="https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.DataFrame.withColumnRenamed" rel="nofollow noreferrer">docs</a>.</p>
<p>This answer goes into more detail: <a href="https://stackoverflow.com/a/34077809">https://stackoverflow.com/a/34077809</a></p>