Scipy sparse matrix in HDF5.
Detailed description of the h5sparse Python project.
Please visit the GitHub repository for more information.
h5sparse
Installation
pip install h5sparse
Testing
For a single environment:
python setup.py test
For all environments:
tox
Examples
Create dataset
In [1]: import scipy.sparse as ss
   ...: import h5sparse
   ...: import numpy as np
   ...:

In [2]: sparse_matrix = ss.csr_matrix([[0, 1, 0],
   ...:                                [0, 0, 1],
   ...:                                [0, 0, 0],
   ...:                                [1, 1, 0]],
   ...:                               dtype=np.float64)

In [3]: # create dataset from scipy sparse matrix
   ...: with h5sparse.File("test.h5") as h5f:
   ...:     h5f.create_dataset('sparse/matrix', data=sparse_matrix)

In [4]: # you can also create dataset from another dataset
   ...: with h5sparse.File("test.h5") as h5f:
   ...:     h5f.create_dataset('sparse/matrix2', data=h5f['sparse/matrix'])

In [5]: # you can also create dataset using the formats that original h5py accepts
   ...: with h5sparse.File("test.h5") as h5f:
   ...:     h5f.create_dataset('sparse/matrix3', data=[1, 2, 3])
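For readers unfamiliar with the Compressed Sparse Row (CSR) format used in the session above, the following is a minimal pure-Python sketch of the three arrays (`data`, `indices`, `indptr`) that describe such a matrix. The helper name `dense_to_csr` is hypothetical and is not part of h5sparse or scipy. How h5sparse lays these arrays out inside the HDF5 file is not described here, so treat this only as an illustration of the format itself.

```python
def dense_to_csr(rows):
    """Convert a list of dense rows into CSR arrays (data, indices, indptr).

    Hypothetical helper for illustration; not part of h5sparse or scipy.
    """
    data, indices, indptr = [], [], [0]
    for row in rows:
        for col, value in enumerate(row):
            if value != 0:
                data.append(float(value))  # non-zero value
                indices.append(col)        # its column index
        # indptr[i + 1] marks where row i ends inside data/indices
        indptr.append(len(data))
    return data, indices, indptr


# Same 4x3 matrix as in the example session
dense = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0],
         [1, 1, 0]]
data, indices, indptr = dense_to_csr(dense)
print(data)     # [1.0, 1.0, 1.0, 1.0]
print(indices)  # [1, 2, 0, 1]
print(indptr)   # [0, 1, 2, 2, 4]
```

The empty third row shows up as two equal consecutive entries in `indptr` (2 and 2), which is how CSR encodes a row with no stored elements.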
Read dataset
In [6]: h5f = h5sparse.File("test.h5")

In [7]: h5f['sparse/matrix'][1:3]
Out[7]:
<2x3 sparse matrix of type '<class 'numpy.float64'>'
        with 1 stored elements in Compressed Sparse Row format>

In [8]: h5f['sparse/matrix'][1:3].toarray()
Out[8]:
array([[0., 0., 1.],
       [0., 0., 0.]])

In [9]: h5f['sparse']['matrix'][1:3].toarray()
Out[9]:
array([[0., 0., 1.],
       [0., 0., 0.]])

In [10]: h5f['sparse']['matrix'][2:].toarray()
Out[10]:
array([[0., 0., 0.],
       [1., 1., 0.]])

In [11]: h5f['sparse']['matrix'][:2].toarray()
Out[11]:
array([[0., 1., 0.],
       [0., 0., 1.]])

In [12]: h5f['sparse']['matrix'][-2:].toarray()
Out[12]:
array([[0., 0., 0.],
       [1., 1., 0.]])

In [13]: h5f['sparse']['matrix'][:-2].toarray()
Out[13]:
array([[0., 1., 0.],
       [0., 0., 1.]])

In [14]: h5f['sparse']['matrix'][()].toarray()
Out[14]:
array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.],
       [1., 1., 0.]])

In [15]: import h5py

In [16]: h5py_h5f = h5py.File("test.h5")

In [17]: h5sparse.Group(h5py_h5f.id)['sparse/matrix'][()]
Out[17]:
<4x3 sparse matrix of type '<class 'numpy.float64'>'
        with 4 stored elements in Compressed Sparse Row format>

In [18]: h5sparse.Group(h5py_h5f['sparse'].id)['matrix'][()]
Out[18]:
<4x3 sparse matrix of type '<class 'numpy.float64'>'
        with 4 stored elements in Compressed Sparse Row format>

In [19]: h5sparse.Dataset(h5py_h5f['sparse/matrix'])[()]
Out[19]:
<4x3 sparse matrix of type '<class 'numpy.float64'>'
        with 4 stored elements in Compressed Sparse Row format>
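The row slices above (for example `[1:3]`) are cheap in CSR because the row pointer array tells you exactly which stretch of `data` and `indices` belongs to the requested rows. The sketch below illustrates that idea in pure Python. The helpers `slice_csr_rows` and `csr_to_dense` are hypothetical names, and this is only a conceptual illustration under the assumption that a library reads a contiguous range; it is not h5sparse's actual implementation.

```python
def slice_csr_rows(data, indices, indptr, start, stop):
    """Return CSR arrays for rows start..stop-1, reading only their entries.

    Hypothetical helper for illustration; not h5sparse's implementation.
    """
    lo, hi = indptr[start], indptr[stop]
    # Re-base the row pointers so the slice starts at offset 0
    new_indptr = [p - lo for p in indptr[start:stop + 1]]
    return data[lo:hi], indices[lo:hi], new_indptr


def csr_to_dense(data, indices, indptr, n_cols):
    """Expand CSR arrays back into a list of dense rows."""
    rows = []
    for r in range(len(indptr) - 1):
        row = [0.0] * n_cols
        for k in range(indptr[r], indptr[r + 1]):
            row[indices[k]] = data[k]
        rows.append(row)
    return rows


# CSR arrays of the 4x3 example matrix
data = [1.0, 1.0, 1.0, 1.0]
indices = [1, 2, 0, 1]
indptr = [0, 1, 2, 2, 4]

sub = slice_csr_rows(data, indices, indptr, 1, 3)
print(sub)                      # ([1.0], [2], [0, 1, 1])
print(csr_to_dense(*sub, 3))    # [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
```

The slice holds a single stored element, which agrees with the "1 stored elements" reported for `h5f['sparse/matrix'][1:3]` in the session.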
Append dataset
In [20]: to_append = ss.csr_matrix([[0, 1, 1],
    ...:                            [1, 0, 0]],
    ...:                           dtype=np.float64)

In [21]: h5f.create_dataset('matrix', data=sparse_matrix, chunks=(100000,),
    ...:                    maxshape=(None,))

In [22]: h5f['matrix'].append(to_append)

In [23]: h5f['matrix'][()]
Out[23]:
<6x3 sparse matrix of type '<class 'numpy.float64'>'
        with 7 stored elements in Compressed Sparse Row format>

In [24]: h5f['matrix'][()].toarray()
Out[24]:
array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.],
       [1., 1., 0.],
       [0., 1., 1.],
       [1., 0., 0.]])
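Appending rows to a CSR matrix amounts to extending three one-dimensional arrays, which is presumably why the dataset above is created with one-dimensional `chunks` and `maxshape=(None,)` (in h5py, a `None` entry in `maxshape` makes that axis resizable). The pure-Python sketch below shows the bookkeeping. `append_csr_rows` is a hypothetical helper and this is an assumption about the general technique, not a copy of what `append` does inside h5sparse.

```python
def append_csr_rows(base, extra):
    """Append the rows of one CSR matrix below another (same column count).

    Both arguments are (data, indices, indptr) tuples.
    Hypothetical helper for illustration; not h5sparse's implementation.
    """
    data, indices, indptr = base
    e_data, e_indices, e_indptr = extra
    # New row pointers continue from the end of the existing entries
    offset = indptr[-1]
    new_indptr = indptr + [offset + p for p in e_indptr[1:]]
    return data + e_data, indices + e_indices, new_indptr


# CSR arrays of the original 4x3 matrix
base = ([1.0, 1.0, 1.0, 1.0], [1, 2, 0, 1], [0, 1, 2, 2, 4])
# CSR arrays of [[0, 1, 1], [1, 0, 0]]
extra = ([1.0, 1.0, 1.0], [1, 2, 0], [0, 2, 3])

data, indices, indptr = append_csr_rows(base, extra)
print(len(indptr) - 1)  # 6 rows
print(len(data))        # 7 stored elements
print(indptr)           # [0, 1, 2, 2, 4, 6, 7]
```

The result has 6 rows and 7 stored elements, the same shape and count reported by `h5f['matrix'][()]` after the append in the session.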