How do I specify floating-point precision in Apache Spark?

Posted 2024-10-02 18:23:51

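Is there a way to specify the precision of floating-point numbers in Spark, preferably just before the RDD is written to a file, so that no precision is lost during the computation?

Minimal working example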

    from pyspark.sql import HiveContext

    # sc is an existing SparkContext
    sqlCtxt = HiveContext(sc)

    fulldata = sqlCtxt.jsonFile(DATA_FILE)
    fulldata.registerTempTable("fulldata")

    newcpulists = sqlCtxt.sql('SELECT xxx FROM fulldata')


    def reduceSumPerc(x, y):
        # some reduce function (body omitted in the question)
        pass

    def mapfunc(x):
        # some map function (body omitted in the question)
        pass

    reducedresult = newcpulists.map(mapfunc).reduceByKey(reduceSumPerc)

    # I want to reduce the precision just at this line, before writing to file.
    reducedresult.coalesce(1, True).saveAsTextFile(RESULT_PATH)

1 answer

Answer 1 · Posted 2024-10-02 18:23:51

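Operations like this are outside Spark's scope. Because saveAsTextFile simply converts each record to text (it calls unicode on non-unicode data), you only need to format the output strings yourself with the standard Python formatting tools, for example: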

    rdd = sc.parallelize([("foo", 0.123123132), ("bar", 0.00000001)])
    rdd.map(lambda x: "{0}, {1:0.2f}".format(*x)).saveAsTextFile(...)
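
The same formatting step can be pulled out into a named function, which makes the number of decimals configurable and lets you check the output without a Spark cluster. This is only a minimal sketch: `format_record` and its `digits` parameter are hypothetical names introduced here, and the records are assumed to be (key, float) pairs as in the example above.

```python
def format_record(record, digits=2):
    """Render a (key, value) pair as 'key, value' with a fixed number of decimals.

    Hypothetical helper, not part of Spark. Only the text representation is
    rounded; the float used in earlier computations keeps its full precision.
    """
    key, value = record
    # Nested replacement field: {2} supplies the precision for field {1}.
    return "{0}, {1:.{2}f}".format(key, value, digits)


# Plain-Python check of the formatting, using the sample data from the answer.
records = [("foo", 0.123123132), ("bar", 0.00000001)]
lines = [format_record(r) for r in records]
print(lines)
```

In a Spark job the function would be passed to `map` right before the write, for example `reducedresult.map(format_record)` followed by `saveAsTextFile`, so every transformation before that point still works on the unrounded values.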
