将numpy float64稀疏矩阵转换为pandas数据帧

2024-09-30 14:28:38 发布

您现在位置:Python中文网/ 问答频道 /正文

我有一个n x nnumpyfloat64sparse matrixdata,其中n = 44),其中的行和列是图节点,值是边权重:

>>> data
<44x44 sparse matrix of type '<class 'numpy.float64'>'
    with 668 stored elements in Compressed Sparse Row format>

>>> type(data)
<class 'scipy.sparse.csr.csr_matrix'>

>>> print(data)
  (0, 7)    0.11793236293516568
  (0, 9)    0.10992000939300195
  (0, 21)   0.7422196678913772
  (0, 23)   0.0630039712667936
  (0, 24)   0.027037442463504143
  (0, 27)   0.16908845414214152
  (0, 28)   0.6109227233402952
  (0, 32)   0.0514765253537568
  (0, 33)   0.016341754080557713
  (1, 6)    0.015070325434709386
  (1, 10)   9.346673769086203e-05
  (1, 11)   0.2471018034781923
  (1, 14)   0.0020684269551621776
  (1, 18)   0.015258704502643251
  (1, 20)   0.021798149289490358
  (1, 22)   0.0087026831764125
  (1, 24)   0.1454235884185166
  (1, 25)   0.022060777594183015
  (1, 29)   0.9117391202819067
  (1, 30)   0.018557883854566116
  (1, 31)   0.001876070225734826
  (1, 32)   0.025841354399637764
  (1, 33)   0.014766488228364438
  (1, 39)   0.002791226433410351
  (1, 43)   1.0
  : :
  (41, 7)   0.8922099840113696
  (41, 10)  0.015776226631920767
  (41, 12)  1.0
  (41, 15)  0.1839408706622038
  (41, 18)  0.5151025641025642
  (41, 20)  0.4599130036630037
  (41, 22)  0.29378473237788827
  (41, 33)  0.47474890700697153
  (41, 39)  1.0
  (42, 2)   1.0
  (42, 10)  0.023305789342610222
  (42, 11)  0.011349136164776494
  (42, 12)  1.0
  (42, 17)  0.886081346522542
  (42, 18)  1.0
  (42, 30)  1.0
  (42, 40)  1.0
  (43, 1)   1.0
  (43, 6)   1.0
  (43, 11)  0.039948959300013256
  (43, 13)  1.0
  (43, 14)  0.02669811947637717
  (43, 29)  1.0
  (43, 30)  1.0
  (43, 36)  0.3381986531986532

我想将它转换成pandasdata frame,以便将其写入一个文件,列为:node1, node2, edge_weight,因此它将给出:

^{pr2}$

你知道怎么做吗?在

请注意:

^{3}$

给出:

                                                    0
0     (0, 7)\t0.11793236293516568\n  (0, 9)\t0.109...
1     (0, 6)\t0.015070325434709386\n  (0, 10)\t9.3...

以及

>>> pandas.DataFrame(print(data))

给出:

  (0, 7)    0.11793236293516568
  (0, 9)    0.10992000939300195

所以我想pandas.DataFrame(print(data))很接近我要找的东西。在


Tags: ofnumpydataframepandasdata节点typematrix
2条回答

这个ipython会话展示了一种方法。这两个步骤是:将稀疏矩阵转换为COO格式,然后使用COO矩阵的.row.col和{}属性创建Pandas数据帧。在

In [50]: data                                                                                                    
Out[50]: 
<15x15 sparse matrix of type '<class 'numpy.float64'>'
    with 11 stored elements in Compressed Sparse Row format>

In [51]: print(data)                                                                                             
  (1, 12)   0.8581958095588134
  (6, 12)   0.03828052946099181
  (6, 14)   0.7908634838351427
  (7, 1)    0.7995008873930302
  (7, 11)   0.48477191537121145
  (7, 13)   0.6226526443518743
  (9, 4)    0.37242576669669103
  (11, 1)   0.9604278557580955
  (11, 5)   0.13285436036287313
  (12, 11)  0.5631419223609928
  (13, 8)   0.16481624650723847

In [52]: import pandas as pd                                                                                     

In [53]: c = data.tocoo()                                                                                        

In [54]: df = pd.DataFrame({node1: c.row, node2: c.col, edge_weight: c.data})                                   

In [55]: df                                                                                                      
Out[55]: 
    node1  node2  edge_weight
0       1     12     0.858196
1       6     12     0.038281
2       6     14     0.790863
3       7      1     0.799501
4       7     11     0.484772
5       7     13     0.622653
6       9      4     0.372426
7      11      1     0.960428
8      11      5     0.132854
9      12     11     0.563142
10     13      8     0.164816

你能试试toarray

pd.DataFrame(A.toarray())

相关问题 更多 >