I have a Spark dataframe as shown below, and I am trying to add a new date column from a variable, but I am getting an error.
jsonDF.printSchema()
root
|-- Data: struct (nullable = true)
| |-- Record: struct (nullable = true)
| | |-- FName: string (nullable = true)
| | |-- LName: long (nullable = true)
| | |-- Address: struct (nullable = true)
| | | |-- Applicant: array (nullable = true)
| | | | |-- element: struct (containsNull = true)
| | | | | |-- Id: long (nullable = true)
| | | | | |-- Type: string (nullable = true)
| | | | | |-- Option: long (nullable = true)
| | | |-- Location: string (nullable = true)
| | | |-- Town: long (nullable = true)
| | |-- IsActive: boolean (nullable = true)
|-- Id: string (nullable = true)
I tried both approaches -
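The exact snippet is not preserved in this post; judging from the error messages below, the two attempts were presumably something along these lines (my_date is the column name that appears in the error output):

from pyspark.sql.functions import to_date

# Attempt 1: add the date as a new column
jsonDF = jsonDF.withColumn("my_date", to_date("2019-07-15", "yyyy-MM-dd"))

# Attempt 2: select the parsed date directly
jsonDF.select(to_date("2019-07-15", "yyyy-MM-dd")).show()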
but I get an error:
An error occurred while calling o50.withColumn.
: org.apache.spark.sql.AnalysisException: cannot resolve '`2019-07-15`' given input columns: [Data, Id];;
'Project [Data#8, Id#9, to_date('2019-07-15, Some(yyyy-MM-dd)) AS my_date#213]
+- Relation[Data#8, Id#11] json
An error occurred while calling o50.select.
: org.apache.spark.sql.AnalysisException: cannot resolve '`2019-07-15`' given input columns: [Data, Id];;
'Project [to_date('2019-07-15, Some(yyyy-MM-dd)) AS to_date(`2019-07-15`, 'yyyy-MM-dd'#210]
Please help.
According to the official documentation, to_date takes a column as its argument, so Spark tries to resolve a column named 2019-07-15. You have to convert the value to a column first and then apply the function; see the sketch below.
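A minimal sketch of that approach, reusing the jsonDF DataFrame and the my_date column name from the question:

from pyspark.sql.functions import lit, to_date

# Wrap the string in a literal column with lit(), then parse it with to_date()
jsonDF = jsonDF.withColumn("my_date", to_date(lit("2019-07-15"), "yyyy-MM-dd"))
jsonDF.printSchema()  # my_date is now a DateType column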
Another option is to build the value with Python's datetime module directly, as in the sketch below.
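A sketch of the datetime variant, assuming a reasonably recent PySpark version (which maps a datetime.date literal to Spark's DateType):

import datetime

from pyspark.sql.functions import lit

# Build the date in Python and pass it to Spark as a literal column
my_date = datetime.date(2019, 7, 15)
jsonDF = jsonDF.withColumn("my_date", lit(my_date))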