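<p>UDFs are not an ideal solution, especially in Python, mainly because the data has to be sent back and forth between the JVM and Python. Use them only when necessary, and in that case prefer <a href="https://databricks.com/blog/2017/10/30/introducing-vectorized-udfs-for-pyspark.html" rel="nofollow noreferrer">Pandas UDFs</a>, which perform better.</p>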
<p>In your case, though, you can use the built-in <a href="https://spark.apache.org/docs/3.0.1/api/python/pyspark.sql.html#pyspark.sql.functions.when" rel="nofollow noreferrer"><code>when</code> function</a> like this:</p>
<pre class="lang-py prettyprint-override"><code>>>> from pyspark.sql.functions import when, col
>>> df = spark.createDataFrame([("Bachelors", 13.0),
...         ("Masters", 14.0), ("Preschool", 1.0)],
...     schema=["bildungsstand", "bildungslevel"])
>>> df2 = df.withColumn("academics_category",
...     when((col("bildungsstand") == "Bachelors") | (col("bildungsstand") == "Masters"),
...          "academic degree").otherwise("no academic degree"))
>>> df2.show()
+-------------+-------------+------------------+
|bildungsstand|bildungslevel|academics_category|
+-------------+-------------+------------------+
|    Bachelors|         13.0|   academic degree|
|      Masters|         14.0|   academic degree|
|    Preschool|          1.0|no academic degree|
+-------------+-------------+------------------+
</code></pre>
<p>Note that you need to use <code>|</code> as the <code>or</code> operator, <code>&</code> as the <code>and</code> operator, and <code>~</code> as the <code>not</code> operator.</p>
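<p>This is also why each comparison in the example above is wrapped in its own parentheses: in Python, <code>|</code> binds more tightly than <code>==</code>. The following is a minimal sketch that uses only the standard-library <code>ast</code> module (no Spark needed) to show how Python groups the two forms. The names <code>a</code> and <code>b</code> are placeholders chosen for illustration, not columns from the question.</p>
<pre class="lang-py prettyprint-override"><code>import ast

# Without parentheses, `|` is applied before `==`, so Python reads the
# expression as the chained comparison  a == (1 | b) == 2.
unparenthesized = ast.parse("a == 1 | b == 2", mode="eval").body
print(type(unparenthesized).__name__)                 # Compare
print(type(unparenthesized.comparators[0]).__name__)  # BinOp, i.e. (1 | b)

# With parentheses, both comparisons are built first and then combined
# with `|`, which is the shape a column condition needs.
parenthesized = ast.parse("(a == 1) | (b == 2)", mode="eval").body
print(type(parenthesized).__name__)                   # BinOp
print(type(parenthesized.op).__name__)                # BitOr
</code></pre>
<p>With column expressions, the unparenthesized form will usually fail with an error rather than give the condition you intended, so it is safest to always parenthesize each comparison.</p>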
<p>SparkByExamples has a <a href="https://sparkbyexamples.com/pyspark/pyspark-when-otherwise/" rel="nofollow noreferrer">good description of this function</a>.</p>
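<p>I would also recommend the book Learning Spark, 2ed, which gives a very good introduction to Spark and its functionality.</p>
<p>If you do have a fixed list of values, however, it is easier to use the <a href="https://spark.apache.org/docs/3.0.1/api/python/pyspark.sql.html#pyspark.sql.Column.isin" rel="nofollow noreferrer"><code>isin</code> function</a> to check whether a value is in the given list:</p>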
<pre class="lang-py prettyprint-override"><code>>>> from pyspark.sql.functions import when, col
>>> df = spark.createDataFrame([("Bachelors", 13.0), ("Masters", 14.0), ("Preschool", 1.0)],
...     schema=["bildungsstand", "bildungslevel"])
>>> df2 = df.withColumn("academics_category",
...     when(col("bildungsstand").isin(["Bachelors", "Masters"]),
...          "academic degree").otherwise("no academic degree"))
>>> df2.show()
+-------------+-------------+------------------+
|bildungsstand|bildungslevel|academics_category|
+-------------+-------------+------------------+
|    Bachelors|         13.0|   academic degree|
|      Masters|         14.0|   academic degree|
|    Preschool|          1.0|no academic degree|
+-------------+-------------+------------------+
</code></pre>