Finding communities in a highly connected network using NetworkX and Gephi

Posted 2024-10-01 13:41:29


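I am trying to find communities among the handles followed by a given set of users. I collected every handle followed by the users in the set and cropped the list to the most relevant handles (that is, I removed handles with only a few followers within the set, since those are essentially noise). In this case there are 69 given users and 435 handles that I am trying to cluster.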

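I am using NetworkX to build a graph of this network. Each followed handle is a node, and every pairwise combination of handles followed by the same given user is an undirected edge. For example, if user1 follows handle1, handle2 and handle3, then handle1, handle2 and handle3 are nodes, and handle1-handle2, handle1-handle3 and handle2-handle3 are edges. I end up with 435 nodes and 81182 edges.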
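To make the construction concrete, here is a minimal sketch of the user1 example using only the standard library. The variable names (`followed`, `max_edges`, `density`) are illustrative assumptions and do not appear in my real code; the density figure is simply derived from the node and edge counts stated above.

```python
from itertools import combinations

# handles followed by one given user (the user1 example from the text)
followed = ["handle1", "handle2", "handle3"]

# every unordered pair of co-followed handles becomes an undirected edge
edges = list(combinations(followed, 2))
print(edges)

# density of the full graph: actual edges divided by possible edges
node_count, edge_count = 435, 81182
max_edges = node_count * (node_count - 1) // 2
density = edge_count / max_edges
print(max_edges, round(density, 2))
```

With 435 nodes there are at most 94395 possible edges, so 81182 edges means roughly 86% of all possible pairs are connected.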

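I then export this graph to Gephi for analysis, but it seems too interconnected to extract anything interesting. Running Gephi's modularity analysis only yields two huge, largely unhelpful communities. I have tried various ways of weighting the edges and nodes, but nothing meaningful comes out. Perhaps I need to decide that some of these edges are irrelevant, but I am not sure how to do that. When I look at the edges of an individual node, the highest-weighted edges really do point to the most closely related handles, yet this is not reflected in the modularity analysis.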

My code is below. Can anyone offer guidance on how I might find communities here?

from itertools import combinations

import networkx as nx

# build a network of followedUsers
followedUsersGraph = nx.Graph()

# followedUsersSorted is a Pandas series of handles followed with userid
# as the id and the number of users from the set following as the value
for i, (handle, followerCount) in enumerate(followedUsersSorted.items()):
    # store the handle and its follower count as node attributes
    followedUsersGraph.add_node(i, user=str(handle), weight=int(followerCount))

# followedUsersMatrix is a Pandas DataFrame acting as a binary matrix
# with rows of given users and columns of followed handles
# convert the column labels to node ids
followedUsersMatrix.columns = range(len(followedUsersMatrix.columns))

# convert the matrix into a list of (edge, weight) tuples
edgeTuples = []
for _, vector in followedUsersMatrix.iterrows():
    # each user from the set can only provide a total weight of 1
    # the more handles they follow the less weight they contribute to the edge
    edges = list(combinations(vector[vector != 0].index, 2))
    if not edges:
        # a user following fewer than two handles contributes no edges
        continue
    weight = 1.0 / len(edges)
    edgeTuples.extend((edge, weight) for edge in edges)

# add the edges to the graph incrementing the weight for repeated edges
for edge, weight in edgeTuples:
    if followedUsersGraph.has_edge(*edge):
        followedUsersGraph[edge[0]][edge[1]]['weight'] += weight
    else:
        followedUsersGraph.add_edge(*edge, weight=weight)

nx.write_gexf(followedUsersGraph, 'fug.gexf')
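
On the idea of discarding irrelevant edges: one thing that might be worth trying is to keep only the few heaviest edges per node before running a modularity-based method. The sketch below is only an assumption about how that could look, not a tested solution for my data. `prune_weak_edges` and `keep_per_node` are hypothetical names, the toy graph is made-up stand-in data for followedUsersGraph, and `greedy_modularity_communities` is NetworkX's own modularity routine, swapped in here for Gephi's modularity tool.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities


def prune_weak_edges(graph, keep_per_node=3):
    """Copy `graph`, keeping only each node's `keep_per_node` heaviest edges.

    Hypothetical helper for illustration; not part of NetworkX.
    """
    pruned = nx.Graph()
    pruned.add_nodes_from(graph.nodes(data=True))
    for node in graph.nodes:
        # sort this node's incident edges by weight, heaviest first
        incident = sorted(
            graph.edges(node, data="weight", default=0.0),
            key=lambda edge: edge[2],
            reverse=True,
        )
        for u, v, w in incident[:keep_per_node]:
            pruned.add_edge(u, v, weight=w)
    return pruned


# toy stand-in for followedUsersGraph: a complete graph on 8 nodes with
# heavy edges inside {0..3} and {4..7} and light edges between the groups
toy = nx.complete_graph(8)
for u, v in toy.edges:
    same_group = (u < 4) == (v < 4)
    toy[u][v]["weight"] = 1.0 if same_group else 0.1

pruned = prune_weak_edges(toy, keep_per_node=3)
communities = greedy_modularity_communities(pruned, weight="weight")
print(pruned.number_of_edges(), [sorted(c) for c in communities])
```

Note that an edge survives if either of its endpoints ranks it among its heaviest, so a node can end up with more than `keep_per_node` edges. The right value of `keep_per_node` for the real graph is an open choice, and the pruned graph could still be exported with nx.write_gexf for inspection in Gephi.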
