Spark Python MLlib random forest out of memory

Posted 2024-09-28 18:14:17


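I am running Spark 1.2.1 to train a random forest. I set up one master node and one worker node on AWS EC2, with a total of 96 GB of memory allocated to Spark. I have tried different parallelism values (32, 64, 6400) and keep getting the same error. According to the Spark UI, my RDD is 277 KB and 100% cached in memory, so it should be small. My Spark configuration is as follows: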

spark.executor.memory   100000m
spark.driver.memory 90000m
spark.driver.maxResultSize 0
spark.storage.memoryFraction 0.6
spark.default.parallelism 6400
spark.eventLog.enabled true
spark.executor.extraLibraryPath /root/ephemeral-hdfs/lib/native/
spark.executor.extraClassPath   /root/ephemeral-hdfs/conf

# for spark version < 1.4.0
spark.tachyonStore.url tachyon://10.0.29.29:19998
# for spark version >= 1.4.0
spark.externalBlockStore.url tachyon://10.0.29.29:19998
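
The two Tachyon URL settings above depend on the Spark version. As a minimal sketch, the choice between them could be expressed as below. The helper name `tachyon_url_key` is hypothetical and not part of Spark, the 1.4.0 cutoff is taken from the comments in the configuration, and the version string is assumed to be purely numeric (for example "1.2.1").

```python
def tachyon_url_key(spark_version):
    """Return the config key that holds the Tachyon URL for a Spark version.

    Hypothetical helper: the 1.4.0 cutoff comes from the comments in the
    configuration above, and only numeric versions like "1.2.1" are handled.
    """
    # Compare numerically so that "1.10.0" sorts after "1.4.0".
    parts = [int(p) for p in spark_version.split(".")[:3]]
    # Pad short versions such as "1.4" to three components.
    parts += [0] * (3 - len(parts))
    if tuple(parts) < (1, 4, 0):
        return "spark.tachyonStore.url"
    return "spark.externalBlockStore.url"


# The question uses Spark 1.2.1, so the older key applies.
print(tachyon_url_key("1.2.1"))
```

Setting both keys, as the configuration above does, avoids the version check at the cost of one unused entry.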

The error is as follows:

^{pr2}$

My data is an RDD of LabeledPoint, and my training code is fairly straightforward:

from pyspark.mllib.tree import RandomForest

# Split the LabeledPoint RDD into 70% training and 30% test data
(trainingData, testData) = data.randomSplit([0.7, 0.3])
model = RandomForest.trainClassifier(trainingData, numClasses=2, categoricalFeaturesInfo={},
                                     numTrees=3, featureSubsetStrategy="auto",
                                     impurity='gini', maxDepth=4, maxBins=32)


1 answer

User

#1 · Posted 2024-09-28 18:14:17

It turned out that my Tachyon server IP was incorrect. Configuring the correct Tachyon server IP solved my problem.
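
A wrong Tachyon address like this can be caught before a job is submitted. The following is a minimal sketch under stated assumptions, not part of Spark or Tachyon: the helper names `parse_tachyon_url` and `is_reachable` are hypothetical, and a plain TCP connection only shows that something is listening on that host and port, not that it is a healthy Tachyon master.

```python
import socket
from urllib.parse import urlparse


def parse_tachyon_url(url):
    """Split a URL such as tachyon://host:port into (host, port)."""
    parsed = urlparse(url)
    if parsed.scheme != "tachyon" or parsed.hostname is None or parsed.port is None:
        raise ValueError("expected tachyon://host:port, got %r" % url)
    return parsed.hostname, parsed.port


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# URL copied from the configuration in the question.
host, port = parse_tachyon_url("tachyon://10.0.29.29:19998")
print(host, port)
```

Calling `is_reachable(host, port)` from the driver machine before creating the SparkContext would flag a mistyped IP early, rather than leaving it to surface as a failure during training.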
