How do I set a column in PySpark to the sum of another column's value and a constant?

Posted 2024-09-27 17:50:44

I need to create a column named Sea freight days + Buffer. When Mode equals AIR, its value should be final_df6['No of days take if sea freight'] + destuff_buffer.

destuff_buffer = 4
final_df6 = final_df6.withColumn('Sea freight days + Buffer',
    when(col("Mode")=='AIR',final_df6['No of days take if sea freight']+destuff_buffer).otherwise(np.nan)
)

But I get the following error:

Traceback (most recent call last):
  File "/opt/amazon/bin/runscript.py", line 67, in <module>
    runpy.run_path(script, run_name='__main__')
  File "/usr/lib64/python3.7/runpy.py", line 261, in run_path
    code, fname = _get_code_from_file(run_name, path_name)
  File "/usr/lib64/python3.7/runpy.py", line 236, in _get_code_from_file
    code = compile(f.read(), fname, 'exec')
  File "/tmp/ACOEtest", line 165
    destuff_buffer = 4
    ^
SyntaxError: invalid syntax

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/amazon/bin/runscript.py", line 100, in <module>
    while "runpy.py" in new_stack.tb_frame.f_code.co_filename:
AttributeError: 'NoneType' object has no attribute 'tb_frame'
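
The traceback above fails inside `compile(...)`, which means the script is rejected while it is being parsed, before any Spark code runs. Since `destuff_buffer = 4` is valid Python on its own, one plausible cause (an assumption, because the lines preceding line 165 are not shown) is an unclosed bracket or quote on an earlier line. Here is a minimal sketch of that effect using only the built-in `compile`; `broken_source`, `some_expr`, and `check_syntax` are hypothetical names used for illustration:

```python
# Hypothetical script text: the call on the first line is missing its closing ")".
broken_source = (
    "final_df6 = final_df6.withColumn('x', some_expr\n"
    "destuff_buffer = 4\n"
)


def check_syntax(source, filename="<script>"):
    """Return None if the source parses, otherwise the SyntaxError raised."""
    try:
        # compile() only parses; names such as final_df6 need not exist.
        compile(source, filename, "exec")
        return None
    except SyntaxError as exc:
        return exc


err = check_syntax(broken_source)
print(type(err).__name__, "reported at line", err.lineno)

# The assignment by itself parses without complaint.
print(check_syntax("destuff_buffer = 4\n"))
```

On Python 3.7, the version shown in the traceback, an error like this is typically reported on the line after the unfinished one, so the lines just before line 165 are worth checking first. Newer interpreters may point at the opening bracket instead.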

1 answer
User
#1 · Posted 2024-09-27 17:50:44
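from pyspark.sql.functions import when

c = 40  # constant to add
# Assumes an existing SparkSession named spark
df = spark.createDataFrame(spark.sparkContext.parallelize([('AIR',1),('NONAIR',5)]),['mode','d'])
# Add c to d only where mode is AIR; all other rows get null
df = df.withColumn('mycol', when(df.mode=='AIR', df.d+c).otherwise(None))
df.show()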

+------+---+-----+
|  mode|  d|mycol|
+------+---+-----+
|   AIR|  1|   41|
|NONAIR|  5| null|
+------+---+-----+
