Error when writing a PySpark map to a txt file

Posted 2024-10-01 11:28:56

I am doing a block multiplication of the contents of two files. At the end I try to write the result to a text file, and I get the following errors:

Py4JJavaError: An error occurred while calling o426.saveAsTextFile

and

ValueError: could not convert string to float: 
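
The empty value after the colon suggests that float() received an empty string, which happens when a line is blank or contains two spaces in a row, for example:

    "".split(" ")           # ['']
    "0  0 10.0".split(" ")  # ['0', '', '0', '10.0']
    float("")               # ValueError: could not convert string to float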

The program:

    import numpy as np
    from pyspark import SparkContext, SparkConf
    sc = SparkContext("local", "Simple App")
    mat = sc.textFile("mat1.txt")
    mat2 = sc.textFile("mat2.txt")

    # parse each line "row col value" into a list of three floats
    matFilter = mat.map(lambda x: [float(i) for i in x.split(" ")])
    matFilter2 = mat2.map(lambda x: [float(i) for i in x.split(" ")])

    # group mat1 values by row index and mat2 values by column index
    matgroupp = matFilter.map(lambda x: (x[0], [x[2]])).reduceByKey(lambda p,q: p+q)
    matgroup2 = matFilter2.map(lambda x: (x[1], [x[2]])).reduceByKey(lambda p,q: p+q)

    # pair every row block of mat1 with every column block of mat2
    matInter = matgroupp.cartesian(matgroup2)

    # dot product of each (row block, column block) pair, keyed by (row, col)
    matmul = matInter.map(lambda x: ((x[0][0], x[1][0]), np.dot(x[0][1], x[1][1]))).sortByKey(True)
    matmul.saveAsTextFile("results/res.txt")

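If the failure is caused by blank lines or extra whitespace in the input files, a more defensive version of the two parsing lines would skip empty lines and split on any run of whitespace; this is only a sketch under that assumption, and the rest of the pipeline stays the same:

    # sketch: drop blank lines, then split on any whitespace (split() with no
    # argument never yields empty tokens), so float() only sees real numbers
    matFilter = mat.filter(lambda x: x.strip() != "") \
                   .map(lambda x: [float(i) for i in x.split()])
    matFilter2 = mat2.filter(lambda x: x.strip() != "") \
                     .map(lambda x: [float(i) for i in x.split()])
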
Contents of mat1.txt:

    0 0 10.0
    1 0 10.0

Contents of mat2.txt:

    0 0 20.0
    0 1 10.0

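For reference, this is the result the pipeline is expected to produce for these two inputs (ignoring output formatting), checked with plain NumPy outside of Spark:

    import numpy as np

    # mat1 grouped by row index, mat2 grouped by column index
    rows = {0.0: [10.0], 1.0: [10.0]}
    cols = {0.0: [20.0], 1.0: [10.0]}

    expected = {(r, c): float(np.dot(rv, cv))
                for r, rv in rows.items()
                for c, cv in cols.items()}
    print(expected)
    # {(0.0, 0.0): 200.0, (0.0, 1.0): 100.0, (1.0, 0.0): 200.0, (1.0, 1.0): 100.0}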