Pandas: How do I group by combinations of column elements to show co-occurrence based on the values in another column?

Batch_ID  Product_ID
1         A
1         B
1         C
2         B
2         B
2         C
2         C
3         B
3         B
3         C
4         C
4         D
5         D
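To make the answers below reproducible, the sample table can be loaded into a DataFrame. This is a minimal sketch; the variable name `df` is an assumption chosen to match the name used in the answers.

```python
import pandas as pd

# Sample data from the question: one row per (batch, product) observation.
# Duplicate rows (e.g. batch 2 with product B twice) are kept as given.
df = pd.DataFrame({
    "Batch_ID": [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 5],
    "Product_ID": list("ABCBBCCBBCCDD"),
})

print(df)
```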

2 Answers

User

#1 · Edited 2024-10-01 00:30:05

Using the NetworkX API:

In [225]: G = nx.from_pandas_edgelist(df, 'Batch_ID', 'Product_ID')

In [226]: from networkx.algorithms import bipartite

In [227]: W = bipartite.weighted_projected_graph(G, df['Product_ID'].unique())

In [228]: W.edges(data=True)
Out[228]: EdgeDataView([('A', 'C', {'weight': 1}), ('A', 'B', {'weight': 1}), ('B', 'C', {'weight': 3}), ('C', 'D', {'weight': 1})])

In [229]: nx.to_pandas_edgelist(W)
Out[229]:
  source target  weight
0      A      C       1
1      A      B       1
2      B      C       3
3      C      D       1

Note: for NetworkX version 1.x, use from_pandas_dataframe() and {} instead of {} and {}
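To see what the projection weight means here, the same numbers can be reproduced without NetworkX: with default settings, the weight between two products is the number of batches they share. The sketch below uses only the standard library; the names `rows`, `batches_of`, and `weights` are illustrative and not part of the answer.

```python
from collections import defaultdict
from itertools import combinations

# The (Batch_ID, Product_ID) rows from the question.
rows = [
    (1, "A"), (1, "B"), (1, "C"),
    (2, "B"), (2, "B"), (2, "C"), (2, "C"),
    (3, "B"), (3, "B"), (3, "C"),
    (4, "C"), (4, "D"),
    (5, "D"),
]

# Map each product to the set of batches it appears in.
# Using a set means repeated rows within a batch count only once.
batches_of = defaultdict(set)
for batch_id, product_id in rows:
    batches_of[product_id].add(batch_id)

# Weight of a product pair = number of batches containing both products.
# Pairs that never share a batch are skipped, so no zero-weight edges appear.
weights = {}
for u, v in combinations(sorted(batches_of), 2):
    shared = batches_of[u] & batches_of[v]
    if shared:
        weights[(u, v)] = len(shared)

print(weights)
```

This agrees with the edge list above: B and C share three batches, and every other connected pair shares exactly one.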

User

#2 · Edited 2024-10-01 00:30:05

Here is my take:

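from itertools import combinations

import pandas as pd

def combine(batch):
    """Combine all products within one batch into pairs"""
    # Sort the unique products so each pair always comes out in the same
    # order; a bare set could yield (B, C) in one batch and (C, B) in another.
    return pd.Series(list(combinations(sorted(set(batch)), 2)))

edges = df.groupby('Batch_ID')['Product_ID'].apply(combine).value_counts()
edges
#(B, C)    3
#(A, B)    1
#(A, C)    1
#(C, D)    1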

As I understand it, edges with zero occurrences are not needed.

If needed, the index can be split further into source and target:

^{pr2}$
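One way such a split might look, offered as a sketch rather than as the answerer's original snippet: convert the tuple index into a two-level MultiIndex and flatten it into columns. The `edges` Series below is a hypothetical stand-in with the same shape as the result above, and the column name `Weight` is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the `edges` Series: pair tuples as the index,
# co-occurrence counts as the values.
pairs = [("B", "C"), ("A", "B"), ("A", "C"), ("C", "D")]
edges = pd.Series([3, 1, 1, 1], index=pd.Index(pairs, tupleize_cols=False))

# Turn the tuples into a named two-level MultiIndex ...
split = edges.copy()
split.index = pd.MultiIndex.from_tuples(split.index, names=["Source", "Target"])

# ... then move both levels into ordinary columns next to the counts.
result = split.rename("Weight").reset_index()

print(result)
```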

Alternatively:

c = ['Source', 'Target']
L = edges.index.values.tolist()
edges = pd.DataFrame(L, columns=c).join(edges.reset_index(drop=True))
