How to cast a MongoDB ObjectId to a PySpark data type

Posted 2024-10-02 18:27:22


According to this ticket, it's well known that PySpark doesn't have any direct implementation of a data type for the BSON ObjectId (please correct me if I'm wrong, since I am mostly self-taught).

But after observing the schema in PySpark, I thought the following might work for such an object type:

from pyspark.sql.types import ArrayType, StringType, StructField, StructType

StructField('userId', ArrayType(StructType([StructField("oid", StringType())])))

Adding this schema when writing to MongoDB did the trick.
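For reference, here is a minimal sketch of the write side, assuming the 2.x connector (com.mongodb.spark.sql.DefaultSource, which matches the stack trace below) and placeholder URI, database, and collection names, not the exact production code:

    from pyspark.sql import SparkSession
    from pyspark.sql.types import ArrayType, StringType, StructField, StructType

    spark = SparkSession.builder.appName("mongo-objectid-sketch").getOrCreate()

    # Same shape as the field above: the ObjectId is modeled as an array of
    # structs with a single string field "oid".
    schema = StructType([
        StructField("userId", ArrayType(StructType([StructField("oid", StringType())])))
    ])

    # One row holding one ObjectId-like hex string (placeholder value)
    df = spark.createDataFrame([([("6e8868d4b1d369a754863961",)],)], schema)

    (df.write
        .format("com.mongodb.spark.sql.DefaultSource")  # Mongo Spark Connector 2.x
        .mode("append")
        .option("uri", "mongodb://localhost:27017/mydb.mycoll")  # placeholder URI
        .save())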

But, as we all know:

When using filters with DataFrames or the Python API, the underlying Mongo Connector code constructs an aggregation pipeline to filter the data in MongoDB before sending it to Spark.

So, if a key's value is of the BSON ObjectId type, is there any way to filter on that key?

Because when I read the data back using the schema approach above, it gives me the following error:
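A minimal sketch of the read and filter that reproduces this, reusing the schema and placeholder URI from the write sketch above:

    # Read back with the explicit schema, then filter on the nested "oid"
    # field; per the docs quoted above, the connector pushes such filters
    # down to MongoDB as an aggregation pipeline.
    df = (spark.read
        .format("com.mongodb.spark.sql.DefaultSource")
        .option("uri", "mongodb://localhost:27017/mydb.mycoll")  # placeholder URI
        .schema(schema)
        .load())

    df.filter(df["userId"][0]["oid"] == "6e8868d4b1d369a754863961").show()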

 org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 107.0 failed 4 times, most recent failure: Lost task 0.3 in stage 107.0 (TID 3633, ip-10-0-1-254.ec2.internal, executor 3): com.mongodb.spark.exceptions.MongoTypeConversionException:
 Cannot cast OBJECT_ID into a ArrayType(StructType(StructField(oid,StringType,true)),true) (value: BsonObjectId{value=6e8868d4b1d369a754863961})
        at com.mongodb.spark.sql.MapFunctions$.com$mongodb$spark$sql$MapFunctions$$convertToDataType(MapFunctions.scala:214)
        at com.mongodb.spark.sql.MapFunctions$$anonfun$3.apply(MapFunctions.scala:37)
        at com.mongodb.spark.sql.MapFunctions$$anonfun$3.apply(MapFunctions.scala:35)
  
