Number of columns in a biadjacency matrix in Python

Published 2024-09-26 18:06:04


I want to create a biadjacency matrix from a DataFrame. I wrote the following code:

import pandas as pd
from sklearn.model_selection import train_test_split
import networkx as nx
from operator import itemgetter
from networkx.algorithms.bipartite import biadjacency_matrix


# create the biadjacency matrix, creating a graph with data's edges
def to_adjacency_matrix(data):
    g = nx.DiGraph()
    g.add_edges_from(data)
    partition_0 = set(map(itemgetter(0), data))
    partition_1 = set(map(itemgetter(1), data))
    return biadjacency_matrix(g, partition_0).toarray(), partition_0, partition_1


df = pd.read_csv("final.csv", sep=" ", header=1, names=["Tags", "Users"])

# train dataset 90%, test dataset 10%
train, test = train_test_split(df, test_size=0.1)

# change columns order to have users first
train = train[['Users', 'Tags']]
test = test[['Users', 'Tags']]

data = list(zip(*[train[c].values.tolist() for c in train]))

bi_adjacency, user_node, tag_node = to_adjacency_matrix(data)
print(bi_adjacency.shape)
print(len(user_node))
print(len(tag_node))

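Although I have 41576 unique users and 10000 unique tags, the result is an array of shape (41576, 6879). I don't understand why the second dimension differs from the expected shape (41576, 10000). Can anyone help me?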

Thanks for any help.


Some sample rows are attached (the first element is the tag, the second is the user): (99203 214006) (17572 4935) (45786 31658) (870 34) (37429 11146) (11215 819) (20630 119669) (12635 9361) (4059 1500) (18527 1134) (8947 324122) (6485 91299) (134203 34) (30798 67598) (75004 14112) (138671 17180) (65591 1711) (59503 17702) (11073 38071) (9454 3567) (22172 1318) (33324 17162) (127867 778) (78174 8210) (9298 68) (147376 16097) (11537 502821) (146005 7271) (1188 4080) (8741 75363) (39363 2293) (75308 13957) (2663 431) (93042 7599) (54620 33807) (112129 38447) (21496 601) (122439 874) (4038 7576) (30091 1238) (24843 250) (3502 415) (41202 2929) (18628 28876) (58895 33987) (153769 530) (23866 34810) (61654 894) (35460 3665) (7473 22429) (6623 733) (149858 9710) (59320 6388) (4623 1026) (23382 1923) (6621 5325) (7401 60307) (101399 1641) (104702 56767) (20873 1369) (147376 2719) (2183 258) (46308 3360) (53485 12) (50744 430) (80858 39637) (123323 100590) (7467 155888) (35876 2615) (47908 129774) (76826 245) (15210 5797)
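
One possible explanation, offered as an assumption rather than a confirmed diagnosis: users and tags are both plain integers, so an ID that occurs in both columns becomes a single node in the graph. If `biadjacency_matrix` is called without `column_order`, my understanding is that the columns default to the nodes that are not in `row_order`, so every tag whose ID is also a user ID would be missing from the columns. The sketch below uses only built-in sets to check for such collisions and to make the node labels unique. The helper names `partition_overlap` and `label_pairs` and the toy data are hypothetical and do not come from `final.csv`.

```python
# Sketch: check whether user IDs and tag IDs collide, and make node labels unique.
# Assumption: `pairs` is a list of (user_id, tag_id) tuples, like `data` in the question.

def partition_overlap(pairs):
    """Return (n_users, n_tags, n_shared) for a list of (user, tag) pairs."""
    users = {u for u, _ in pairs}
    tags = {t for _, t in pairs}
    shared = users & tags  # IDs used both as a user and as a tag
    return len(users), len(tags), len(shared)


def label_pairs(pairs):
    """Prefix every ID with its partition so the two node sets cannot collide."""
    return [(("user", u), ("tag", t)) for u, t in pairs]


# Toy data (hypothetical): the ID 34 appears both as a user and as a tag.
pairs = [(34, 870), (34, 134203), (1500, 4059), (819, 34)]

n_users, n_tags, n_shared = partition_overlap(pairs)
print(n_users, n_tags, n_shared)

# Number of columns left if columns are taken as "all nodes minus the row nodes".
print(n_tags - n_shared)

# After labelling, the two partitions no longer share any node.
print(partition_overlap(label_pairs(pairs)))
```

If this hypothesis holds for the data in the question, `partition_overlap(data)` would report 10000 - 6879 = 3121 shared IDs. Building the graph from `label_pairs(data)` and passing both partitions explicitly through the `row_order` and `column_order` parameters of `biadjacency_matrix` should then give one column per unique tag.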

