<p><code>withColumn</code> is already distributed, so it will be hard to find a faster approach than the one you already have. You can try defining a <code>udf</code> function, as follows:</p>
<pre><code>from pyspark.sql import functions as f
from pyspark.sql import types as t

# list_of_column_names must be defined before this point (see the note below)
def containsUdf(listColumn):
    # Build one 1/0 flag per expected name
    row = {}
    for column in list_of_column_names:
        if column in listColumn:
            row.update({column: 1})
        else:
            row.update({column: 0})
    return row

# Each flag is declared as a nullable StringType field of the returned struct
callContainsUdf = f.udf(containsUdf, t.StructType([t.StructField(x, t.StringType(), True) for x in list_of_column_names]))

# Expand the struct into one top-level column per name
df.withColumn('struct', callContainsUdf(df['list_column']))\
    .select(f.col('list_column'), f.col('struct.*'))\
    .show(truncate=False)
</code></pre>
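<p>This should give you the original <code>list_column</code> followed by one column per name in <code>list_of_column_names</code>, holding <code>1</code> where the list contains that name and <code>0</code> otherwise.</p>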
<p><strong>Note: <code>list_of_column_names = ["Foo","Bar","Baz"]</code></strong></p>
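<p>To check the per-row logic without starting Spark, here is a minimal plain-Python sketch of what the UDF body computes for a single row. The helper name <code>contains_flags</code> and the handling of a null list are assumptions made for illustration; they are not part of the answer's code.</p>

```python
# Plain-Python sketch of the per-row logic inside the UDF (no Spark needed).
list_of_column_names = ["Foo", "Bar", "Baz"]  # value taken from the note above


def contains_flags(list_column):
    """Return {name: 1 or 0} for every expected column name.

    contains_flags is a hypothetical helper name used only for this sketch.
    """
    # Assumption: a null (None) list is treated as empty; the UDF above
    # would instead raise on None.
    present = set(list_column or [])
    return {name: int(name in present) for name in list_of_column_names}


print(contains_flags(["Foo", "Baz"]))  # {'Foo': 1, 'Bar': 0, 'Baz': 1}
print(contains_flags([]))              # {'Foo': 0, 'Bar': 0, 'Baz': 0}
```

<p>If this behaves as you expect, the same function could be passed to <code>f.udf</code> with the struct schema shown earlier; converting the list to a set first only matters when the lists are long.</p>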