HBase and integration testing

1 answer

User

#1 · Posted 2024-05-20 17:32:43

Here is a simple way to read HBase data from Python using the HappyBase API and the HBase Thrift server.

First, start the Thrift server on the HBase server:

/YOUR_HBASE_BIN_DIR/hbase-daemon.sh start thrift

Then, from Python:

import happybase

HOST = 'Hbase server host name here'
TABLE_NAME = 'MyTable'
ROW_PREFIX = 'MyPrefix'
COL_TXT = 'CI:BO'.encode('utf-8')   # column family CI, column name BO (text)
COL_LONG = 'CI:BT'.encode('utf-8')  # column family CI, column name BT (long)

conn = happybase.Connection(HOST)  # uses the default Thrift port 9090; pass the port as the second argument if yours differs
myTable = conn.table(TABLE_NAME)

# Omit row_prefix to scan the full table
for rowID, row in myTable.scan(row_prefix=ROW_PREFIX.encode('utf-8')):
    colValTxt = row[COL_TXT].decode('utf-8')
    # A Java long is stored as 8 bytes, big-endian, signed
    colValLong = int.from_bytes(row[COL_LONG], byteorder='big', signed=True)
    print('Row ID: {}\tColumn Value: {}'.format(rowID, colValTxt))
print('All Done')
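
If the long column was written from Java with Bytes.toBytes(long) (an assumption about how your table was populated), the value is 8 bytes of big-endian two's complement. A minimal sketch of helpers that decode and encode such values is below; the names decode_hbase_long and encode_hbase_long are made up for illustration and are not part of HappyBase.

```python
import struct


def decode_hbase_long(raw: bytes) -> int:
    """Decode 8 bytes of big-endian, signed data into a Python int."""
    if len(raw) != 8:
        raise ValueError('expected 8 bytes, got {}'.format(len(raw)))
    # '>q' means big-endian, signed 64-bit integer
    return struct.unpack('>q', raw)[0]


def encode_hbase_long(value: int) -> bytes:
    """Encode a Python int as 8 bytes of big-endian, signed data."""
    return struct.pack('>q', value)
```

The length check is there so that a column holding something else (for example a number stored as text) fails loudly instead of silently producing a wrong value.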

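As discussed in the comments, this will not work if you try to pass the connection to Spark workers, because the HBase connection above is not serializable. Code like this can therefore only run in the main (driver) program. If you find a way around that, please share!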
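
One pattern that might get around this, offered as an untested sketch rather than a confirmed fix, is to open the connection inside the function that runs on each partition, so that no connection object ever has to be serialized. The function scan_partition and its connect parameter are hypothetical names introduced here. It assumes each input element is a row prefix given as bytes, and that happybase is installed on the workers and the Thrift server is reachable from them.

```python
def scan_partition(prefixes, host, table_name, connect=None):
    """Yield (row_id, row) pairs for every row prefix in one partition.

    The connection is created here, on whichever process runs this
    function, instead of being captured from the calling program.
    """
    if connect is None:
        import happybase  # imported lazily so it is resolved where the function runs
        connect = happybase.Connection
    conn = connect(host)
    try:
        table = conn.table(table_name)
        for prefix in prefixes:
            for row_id, row in table.scan(row_prefix=prefix):
                yield row_id, row
    finally:
        # Runs once the generator is exhausted or closed
        conn.close()
```

With an RDD of prefixes this could be called as rdd.mapPartitions(lambda part: scan_partition(part, HOST, TABLE_NAME)). The connect parameter also lets you pass in a fake connection class, so the scanning logic can be tested without a running HBase.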
