Storing data in Accumulo with PySpark

Posted 2024-06-26 00:06:45



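I am trying to store data in Accumulo using PySpark (Python + Spark). Right now I write to Accumulo with the pyaccumulo library, shipping the pyaccumulo egg file to the SparkContext through the pyFiles parameter. I would like to know whether there is a better way. I have seen examples of Cassandra and HBase output formats and wonder whether something similar can be done for Accumulo. Cassandra and HBase appear to use the saveAsNewAPIHadoopDataset(conf, keyConv, valueConv) function, passing a config dict, a key converter, and a value converter. Does anyone know the corresponding values to pass to saveAsNewAPIHadoopDataset() for Accumulo?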
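For context, the pyaccumulo approach described above might look roughly like the sketch below. It is not the asker's actual code: the host, port, credentials, table name, and the (row, cf, cq, value) record layout are all placeholders chosen for illustration.

```python
def write_partition(rows, host="proxy-host", port=42424,
                    user="root", password="secret", table="mytable"):
    """Write one partition of (row, cf, cq, value) tuples via the Accumulo proxy.

    All connection defaults here are placeholders, not values from the question.
    """
    # Import inside the function so it resolves on the executor, where the
    # egg shipped through pyFiles is on the Python path.
    from pyaccumulo import Accumulo, Mutation

    conn = Accumulo(host=host, port=port, user=user, password=password)
    writer = conn.create_batch_writer(table)
    for row, cf, cq, value in rows:
        mutation = Mutation(row)
        mutation.put(cf=cf, cq=cq, val=value)
        writer.add_mutation(mutation)
    writer.close()  # flush buffered mutations
    conn.close()
```

On the driver this would be wired up with something like SparkContext(pyFiles=[path_to_pyaccumulo_egg]) followed by rdd.foreachPartition(write_partition), where path_to_pyaccumulo_egg is a hypothetical variable holding the egg's location. Opening one connection per partition rather than per record keeps the number of proxy connections manageable.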


1 answer

User

#1 · Posted 2024-06-26 00:06:45

This is a guess, since I don't know exactly how it works, but you would need something like these properties:

AccumuloOutputFormat.ConnectorInfo.principal
AccumuloOutputFormat.ConnectorInfo.token
AccumuloOutputFormat.InstanceOpts.zooKeepers
AccumuloOutputFormat.InstanceOpts.name

To get the complete list of properties, I would run an ordinary MapReduce example (http://accumulo.apache.org/1.7/examples/mapred.html) and inspect the configuration values it sets.
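As a rough illustration of where those properties would go, here is a minimal sketch that builds the config dict for saveAsNewAPIHadoopDataset(). The four AccumuloOutputFormat.* keys are taken from the guess above and are unverified; the output-format key and class name follow the pattern used in Spark's HBase example and are assumptions for Accumulo 1.x; the helper name accumulo_output_conf is made up for this example.

```python
def accumulo_output_conf(principal, token, zookeepers, instance_name):
    """Build a config dict from the property names guessed in this answer.

    The exact key spelling and the expected encoding of the token value are
    unverified; compare against a real MapReduce job's configuration.
    """
    return {
        # Assumption: Hadoop's standard output-format key plus the
        # Accumulo 1.x MapReduce output format class.
        "mapreduce.job.outputformat.class":
            "org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat",
        "AccumuloOutputFormat.ConnectorInfo.principal": principal,
        "AccumuloOutputFormat.ConnectorInfo.token": token,
        "AccumuloOutputFormat.InstanceOpts.zooKeepers": zookeepers,
        "AccumuloOutputFormat.InstanceOpts.name": instance_name,
    }


conf = accumulo_output_conf("root", "secret", "zk1:2181,zk2:2181", "myinstance")
for key in sorted(conf):
    print(key, "=", conf[key])
```

The resulting dict would be passed as the conf argument of rdd.saveAsNewAPIHadoopDataset(conf, keyConverter, valueConverter). The two converters would most likely have to be JVM classes that turn Python values into the key and value types AccumuloOutputFormat expects, and those still need to be written or found separately.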
