How can I efficiently build an affinity matrix from transaction rows?

Posted 2024-09-28 01:25:03


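Given transactions between nodes stored in a (possibly large, ~2+ GB) JSON file, with roughly a million nodes and ~1000 transactions, each transaction containing 10-1000 nodes, for example: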

{"transactions":
 [
  {"transaction 1": ["node1","node2","node7"], "weight":0.41},
  {"transaction 2": ["node4","node2","node1","node3","node10","node7","node9"], "weight":0.67},
  {"transaction 3": ["node3","node10","node11","node2","node1"], "weight":0.33},...
  ]
}

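What is the most elegant and efficient Pythonic way to convert this into a node affinity matrix, where the affinity of two nodes is the sum of the weighted transactions between them?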


For example:

affinity[node1, node7] = [0.41 (transaction1) + 0.67 (transaction2)] / 2 = affinity[node7, node1]

Note: the affinity matrix is symmetric, so computing only the lower triangle is enough.

The values below are not representative; they only illustrate the structure!

      | node1 | node2 | node3 | node4 | …
node1 |   1   |  .4   |  .1   |  .9   | …
node2 |  .4   |   1   |  .6   |  .3   | …
node3 |  .1   |  .6   |   1   |  .7   | …
node4 |  .9   |  .3   |  .7   |   1   | …
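
Because the matrix is symmetric, one entry per unordered pair of nodes is enough. The following is a minimal sketch of that idea, not a definitive implementation. It assumes the transactions are already in memory as (nodes, weight) pairs and that the affinity is averaged over the shared transactions, as in the node1/node7 example above. The names `lower_triangle_affinity` and `sample` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def lower_triangle_affinity(transactions):
    """Return {(a, b): mean weight} with a > b, one entry per unordered pair."""
    totals = defaultdict(float)  # sum of the weights of shared transactions
    counts = defaultdict(int)    # number of shared transactions
    for nodes, weight in transactions:
        # sorted(set(...)) drops duplicates and fixes an order, so each
        # unordered pair is always stored under the same key
        for b, a in combinations(sorted(set(nodes)), 2):
            totals[(a, b)] += weight
            counts[(a, b)] += 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


sample = [
    (["node1", "node2", "node7"], 0.41),
    (["node4", "node2", "node1", "node3", "node10", "node7", "node9"], 0.67),
    (["node3", "node10", "node11", "node2", "node1"], 0.33),
]
affinity = lower_triangle_affinity(sample)
print(affinity[("node7", "node1")])  # (0.41 + 0.67) / 2
```

Only pairs that actually occur together are stored, so memory grows with the number of distinct co-occurring pairs rather than with the square of the node count, which matters with around a million nodes.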



1 answer
User
#1 · Posted 2024-09-28 01:25:03

First, I would clean up the data and represent each node by an integer, then start from a list of dictionaries like this:

data=[{'transaction': [1, 2, 7], 'weight': 0.41},
      {'transaction': [4, 2, 1, 3, 10, 7, 9], 'weight': 0.67},
      {'transaction': [3, 10, 11, 2, 1], 'weight': 0.33}]
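
The clean-up step could look roughly like the sketch below. It assumes the layout from the question, where each entry has a "weight" key plus exactly one other key holding the node list. The names `normalize`, `normalized` and `node_ids` are hypothetical, and ids are assigned from 0 in order of first appearance, which differs from the 1-based numbering used in `data` above. For a multi-gigabyte file an incremental parser would probably be preferable to `json.loads`; that part is not shown.

```python
import json

raw = '''
{"transactions": [
  {"transaction 1": ["node1", "node2", "node7"], "weight": 0.41},
  {"transaction 3": ["node3", "node10", "node11", "node2", "node1"], "weight": 0.33}
]}
'''


def normalize(doc):
    """Turn the question's layout into [{'transaction': [ids], 'weight': w}, ...]."""
    ids = {}  # node name -> integer id, assigned in order of first appearance
    rows = []
    for entry in doc["transactions"]:
        # the only key other than "weight" holds the list of node names
        (names,) = (v for k, v in entry.items() if k != "weight")
        rows.append({
            "transaction": [ids.setdefault(name, len(ids)) for name in names],
            "weight": entry["weight"],
        })
    return rows, ids


normalized, node_ids = normalize(json.loads(raw))
print(normalized)
print(node_ids)
```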

I am not sure this is Pythonic enough, but it should be self-explanatory:

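
One possible way to build `A` from `data` is sketched below, under stated assumptions rather than as a definitive implementation. It takes the affinity to be the mean weight of the transactions two nodes share (as in the question's example), fills the diagonal with the mean weight of each node's own transactions, and sizes the matrix from the largest node id, so node 11 gets a row and column too. The helper names `total` and `count` are hypothetical.

```python
from itertools import combinations_with_replacement

data = [{'transaction': [1, 2, 7], 'weight': 0.41},
        {'transaction': [4, 2, 1, 3, 10, 7, 9], 'weight': 0.67},
        {'transaction': [3, 10, 11, 2, 1], 'weight': 0.33}]

# nodes are numbered from 1, so node k lives at index k - 1
n = max(node for t in data for node in t['transaction'])
total = [[0.0] * n for _ in range(n)]  # summed weights per pair
count = [[0] * n for _ in range(n)]    # shared transactions per pair

for t in data:
    nodes = sorted(set(t['transaction']))
    # with_replacement also yields (i, i), which fills the diagonal
    for i, j in combinations_with_replacement(nodes, 2):
        total[i - 1][j - 1] += t['weight']
        count[i - 1][j - 1] += 1
        if i != j:  # mirror the entry into the other triangle
            total[j - 1][i - 1] += t['weight']
            count[j - 1][i - 1] += 1

# mean weight of shared transactions, 0.0 where two nodes never co-occur
A = [[total[i][j] / count[i][j] if count[i][j] else 0.0
      for j in range(n)] for i in range(n)]
```

This dense list-of-lists form is only practical for small node counts; it is meant to make the arithmetic visible, not to scale to a million nodes.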

Inspecting the affinity matrix:

import numpy as np
print(np.array(A))
    [[ 0.47  0.47  0.5   0.67  0.    0.    0.54  0.    0.67  0.5 ]
     [ 0.47  0.47  0.5   0.67  0.    0.    0.54  0.    0.67  0.5 ]
     [ 0.5   0.5   0.5   0.67  0.    0.    0.67  0.    0.67  0.5 ]
     [ 0.67  0.67  0.67  0.67  0.    0.    0.67  0.    0.67  0.67]
     [ 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
     [ 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
     [ 0.54  0.54  0.67  0.67  0.    0.    0.54  0.    0.67  0.67]
     [ 0.    0.    0.    0.    0.    0.    0.    0.    0.    0.  ]
     [ 0.67  0.67  0.67  0.67  0.    0.    0.67  0.    0.67  0.67]
     [ 0.5   0.5   0.5   0.67  0.    0.    0.67  0.    0.67  0.5 ]]
