<p>First, use <code>stack</code> to unpivot the <code>avgm</code> columns into rows; then use <code>pivot</code> to turn the <code>srsbtp</code> row values into columns.</p>
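<pre class="lang-py prettyprint-override"><code>from pyspark.sql import functions as F

df.createOrReplaceTempView('table')
# Build the stack() arguments: a string label followed by the column itself, for avgm1..avgm12
col_list = ', '.join(f"'avgm{i}', avgm{i}" for i in range(1, 13))
## col_list is a string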
## "'avgm1', avgm1, 'avgm2', avgm2, 'avgm3', avgm3, 'avgm4', avgm4, 'avgm5', avgm5, 'avgm6', avgm6, 'avgm7', avgm7, 'avgm8', avgm8, 'avgm9', avgm9, 'avgm10', avgm10, 'avgm11', avgm11, 'avgm12', avgm12"
result = spark.sql(f"select srab, srsbtp, stack(12, {col_list}) as (month, value) from table") \
.groupBy('srab', 'month') \
.pivot('srsbtp') \
.agg(F.sum('value')) \
.orderBy('month')
result.show()
+----+------+----------------+----+
|srab| month|               C|   D|
+----+------+----------------+----+
|2389| avgm1|            null|null|
|2389|avgm10|54674.1935483871|null|
|2389|avgm11|        156820.0|null|
|2389|avgm12|            null|null|
|2389| avgm2|            null|null|
|2389| avgm3|            null|null|
|2389| avgm4|            null|null|
|2389| avgm5|            null|null|
|2389| avgm6|            null|null|
|2389| avgm7|            null|null|
|2389| avgm8|            null|null|
|2389| avgm9|            null|null|
+----+------+----------------+----+
</code></pre>
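<p>To make the two steps concrete, here is a minimal plain-Python sketch of the same stack, pivot and sum logic. The rows and values are hypothetical, made up for illustration, and only three month columns are used. It also shows one way to deal with the ordering seen above: sorting on the <code>month</code> string puts <code>avgm10</code> before <code>avgm2</code>, so the sketch sorts on the numeric suffix instead. The helper name <code>month_number</code> is an assumption, not part of the original code.</p>

```python
import re
from collections import defaultdict

# Hypothetical wide rows, one per (srab, srsbtp); the values are made up.
rows = [
    {"srab": 1, "srsbtp": "C", "avgm1": 10.0, "avgm2": None, "avgm10": 30.0},
    {"srab": 1, "srsbtp": "D", "avgm1": 5.0, "avgm2": 7.0, "avgm10": None},
]
month_cols = ["avgm1", "avgm2", "avgm10"]

# Step 1 (stack): one output row per (input row, month column).
stacked = [
    {"srab": r["srab"], "srsbtp": r["srsbtp"], "month": m, "value": r[m]}
    for r in rows
    for m in month_cols
]

# Step 2 (groupBy + pivot + sum): one row per (srab, month),
# one column per distinct srsbtp value. As with SQL sum, nulls are
# skipped and a cell with no non-null value stays None.
pivoted = defaultdict(dict)
for s in stacked:
    cell = pivoted[(s["srab"], s["month"])]
    cell.setdefault(s["srsbtp"], None)
    if s["value"] is not None:
        cell[s["srsbtp"]] = (cell[s["srsbtp"]] or 0.0) + s["value"]


def month_number(label):
    """Return the trailing integer of a label such as 'avgm10'."""
    return int(re.search(r"\d+$", label).group())


# Sort by the numeric month rather than by the label string.
ordered = sorted(pivoted.items(), key=lambda kv: (kv[0][0], month_number(kv[0][1])))
for (srab, month), cells in ordered:
    print(srab, month, cells.get("C"), cells.get("D"))
```

<p>In Spark itself, a similar ordering could probably be obtained by sorting on the extracted number, for example <code>.orderBy(F.regexp_extract('month', r'\d+', 0).cast('int'))</code> in place of <code>.orderBy('month')</code>. This is an untested suggestion, not something from the original answer.</p>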