String padding issue with PyTables and pandas

1 Answer

User

#1 · Posted on 2024-09-30 18:24:59

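I can't help with your C code. Padded strings do work in PyTables, though. I can read data written by a C application that creates an array of structs with mixed types (including padded strings). (Note: there was a problem when copying NumPy structured arrays that contain padding. It was fixed in version 3.5.0. For details, see PyTables GitHub Pull 720.)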
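To make "padding" concrete, here is a minimal NumPy-only sketch. It assumes a hypothetical C struct with a 10-byte char field, a 32-bit int, and a double; the field names `name`, `id`, and `val` are made up for illustration and do not come from the question. It shows the two kinds of padding that can be involved: alignment padding between struct fields, and NUL padding inside a fixed-width string.

```python
import numpy as np

# Hypothetical layout, loosely mirroring:
#   struct rec { char name[10]; int32_t id; double val; };
fields = [('name', 'S10'), ('id', 'i4'), ('val', 'f8')]

packed = np.dtype(fields)               # no padding between fields
aligned = np.dtype(fields, align=True)  # C-compiler-style alignment padding

print(packed.itemsize)    # 22 bytes: 10 + 4 + 8
print(aligned.itemsize)   # 24 bytes: 2 padding bytes follow 'name'
id_offset = aligned.fields['id'][1]
print(id_offset)          # 'id' starts at byte 12 instead of 10

# Fixed-width strings are NUL-padded on the right in memory...
s = np.array([b'test'], dtype='S10')
raw = s.tobytes()
print(raw)
# ...but NumPy strips the trailing NULs when an element is read back.
value = s[0]
print(value)
```

If the C writer and the reader disagree on either kind of padding, strings can come back shifted or truncated, so comparing `itemsize` and field offsets on both sides is a reasonable first check.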

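Below is an example that shows correct string handling for a file created by PyTables. Maybe it will help you investigate your problem. Checking the attributes of your dataset would be a good place to start.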

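import tables as tb
import numpy as np

# Fixed-width byte strings: 10 bytes per element, NUL-padded on the right.
# np.zeros (instead of np.empty) leaves unused slots empty, not uninitialized.
arr = np.zeros(10, dtype='S10')
arr[0] = 'test'
arr[1] = 'one'
arr[2] = 'two'
arr[3] = 'three'

with tb.open_file('SO_63184571.h5', 'w') as h5f:
    ds = h5f.create_array('/', 'testdata', obj=arr)
    print(ds.atom)  # the atom reports the string type and item size

    for i in range(4):
        print(ds[i])                  # raw bytes
        print(ds[i].decode('utf-8'))  # decoded str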
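As a rough sketch of what "checking the dataset's attributes" might look like, the snippet below writes a small string array, reopens the file read-only, and looks at the atom, item size, and NumPy dtype. The file name `inspect_demo.h5` and the temporary directory are assumptions made so the example is self-contained; point `path` at your own file and node instead.

```python
import os
import tempfile

import numpy as np
import tables as tb

# Hypothetical file location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'inspect_demo.h5')

arr = np.zeros(3, dtype='S10')
arr[:] = [b'test', b'one', b'two']

with tb.open_file(path, 'w') as h5f:
    h5f.create_array('/', 'testdata', obj=arr)

# Reopen read-only and inspect how the strings are stored.
with tb.open_file(path, 'r') as h5f:
    ds = h5f.root.testdata
    itemsize = ds.atom.itemsize   # bytes reserved per string
    np_dtype = ds.dtype           # NumPy dtype PyTables maps the data to
    shape = ds.shape
    values = [v.decode('utf-8') for v in ds[:]]
    print(ds.atom, itemsize, np_dtype, shape)
    print(values)
```

If the item size reported for a file written by the C application differs from the width the C struct uses, that mismatch is a likely place to look.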

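I added the following example to demonstrate a compound dataset with ints and fixed-length strings. In PyTables this is called a Table (an Array always holds homogeneous values). There are several ways to do this. I show the two methods I prefer:

Create a record array and pass it with the description= or obj= parameter. This is useful when you already have all of the data and it fits in memory.
Create a record array dtype and pass it with the description= parameter, then add the data with the .append() method. This is useful when the data does not all fit in memory, or when you need to add data to an existing table.

The code is as follows: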

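import tables as tb
import numpy as np

# Compound dtype: a platform integer plus a 10-byte fixed-width string.
recarr_dtype = np.dtype(
    {'names':   ['ints', 'strs'],
     'formats': [int, 'S10']})
a = np.arange(5)
b = np.array(['a', 'b', 'c', 'd', 'e'])
recarr = np.rec.fromarrays((a, b), dtype=recarr_dtype)

# Note: mode 'w' overwrites the file written by the previous example.
with tb.open_file('SO_63184571.h5', 'w') as h5f:
    # Method 1: create the table directly from the populated record array.
    ds1 = h5f.create_table('/', 'compound_data1', description=recarr)

    for i in range(5):
        print(ds1[i]['ints'], ds1[i]['strs'].decode('utf-8'))

    # Method 2: create an empty table from the dtype, then append the data.
    ds2 = h5f.create_table('/', 'compound_data2', description=recarr_dtype)
    ds2.append(recarr)
    ds2.flush()  # push buffered rows to the file before reading them back

    for i in range(5):
        print(ds2[i]['ints'], ds2[i]['strs'].decode('utf-8'))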
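The second method pays off when the data arrives in pieces. The following is a minimal sketch of that pattern, not a tuned implementation: the chunk size of 2, the node name `chunked`, the explicit `i8` integer type, and the temporary file are all assumptions chosen for illustration.

```python
import os
import tempfile

import numpy as np
import tables as tb

rec_dtype = np.dtype([('ints', 'i8'), ('strs', 'S10')])

# Hypothetical file location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'append_demo.h5')

with tb.open_file(path, 'w') as h5f:
    # Create an empty table from the dtype alone.
    table = h5f.create_table('/', 'chunked', description=rec_dtype)

    # Build and append the data two rows at a time.
    for start in range(0, 6, 2):
        chunk = np.zeros(2, dtype=rec_dtype)
        chunk['ints'] = np.arange(start, start + 2)
        chunk['strs'] = [f'row{i}'.encode('utf-8')
                         for i in range(start, start + 2)]
        table.append(chunk)

    table.flush()
    nrows = table.nrows
    rows = table.read()  # whole table as a NumPy structured array

print(nrows)
print([s.decode('utf-8') for s in rows['strs']])
```

Only one chunk is held in memory at a time, which is the point of using .append() instead of building the full record array first.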
