PySpark: using functions with when and otherwise

import pyspark.sql.functions as F


def getValueByCountry(country):
    # Possibly some more complex calculations based on country
    if country == "Spain":
        return 1
    else:
        return 2


def getValue(currency):
    # Possibly some more complex calculations based on currency
    if currency == "EUR":
        return 3
    else:
        return 4


currency_column = "Currency"
df = df.withColumn(
    "Value",
    F.when(
        F.col(currency_column).contains("None"),
        getValueByCountry(F.col("Country"))
    ).otherwise(getValue(F.col(currency_column))),
)

1 answer

User
#1 · Posted 2024-09-21 05:25:46

You can try the following without using a UDF:
currency_column = "Currency"
df = df.withColumn(
    "Value",
    F.when(
        F.col(currency_column).contains("None"),
        F.when(F.col("Country") == "Spain", 1).otherwise(2),
    ).otherwise(F.when(F.col("Country") == "Russia", 4).otherwise(3)),
)
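Although the above may work for the example provided, you probably have more values to handle. In that case, consider combining conditions with logical operators: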
currency_column = "Currency"
# Each comparison is parenthesized because & binds more tightly than == in Python.
df = df.withColumn(
    "Value",
    F.when(F.col(currency_column).contains("None") & (F.col("Country") == "Spain"), 1)
    .when(F.col(currency_column).contains("None") & (F.col("Country") == "UK"), 2)
    .when((F.col(currency_column) == "USD") & (F.col("Country") == "Russia"), 4)
    .when((F.col(currency_column) == "EUR") & (F.col("Country") == "Netherland"), 3)
    .otherwise(999),
)
The .otherwise(999) branch covers conditions you have not accounted for.
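A note on combining conditions with &: in Python, & binds more tightly than ==, so a condition written as A & B == "Spain" is read as (A & B) == "Spain". Each comparison joined with & therefore needs its own parentheses. The sketch below uses only the standard-library ast module to show how the two spellings are parsed; the names is_none and country are placeholders for illustration, not PySpark objects.

```python
import ast


def top_level(expr: str) -> str:
    """Return the name of the outermost operation Python sees in expr."""
    node = ast.parse(expr, mode="eval").body
    return type(node).__name__


without_parens = top_level('is_none & country == "Spain"')
with_parens = top_level('is_none & (country == "Spain")')

print(without_parens)  # Compare: parsed as (is_none & country) == "Spain"
print(with_parens)     # BinOp: parsed as is_none & (country == "Spain")
```

The same rule applies to | (or) and ~ (not) when building DataFrame boolean expressions.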
More details on Spark functions: here
