Merging rows that match a condition into a new DataFrame

       time  node txrx  src  dest  txid  hops
0  34355146     2   TX    2     1     1   NaN
1  34373907     1   RX    2     1     1   1.0
2  44284813     2   TX    2     1     2   NaN
3  44302557     1   RX    2     1     2   1.0
4  44596500     3   TX    3     1     2   NaN
5  44630682     1   RX    3     1     2   2.0
6  50058251     2   TX    2     1     3   NaN
7  50075994     1   RX    2     1     3   1.0
8  51338658     3   TX    3     1     3   NaN
9  51382629     1   RX    3     1     3   2.0

    tx_time   rx_time  src  dest  txid  hops
0  34355146  34373907    2     1     1     1
1  44284813  44302557    2     1     2     1
2  44596500  44630682    3     1     2     2
3  50058251  50075994    2     1     3     1
4  51338658  51382629    3     1     3     2
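The answers below operate on a DataFrame called `df`. As a convenience for reproducing them, here is a minimal sketch that rebuilds the first table, assuming it is the input and that `hops` is a float column with `NaN` on the TX rows (the dtypes are an assumption, not stated in the question):

```python
import numpy as np
import pandas as pd

# Sample input rebuilt from the first table; TX rows carry no hop count
df = pd.DataFrame({
    "time": [34355146, 34373907, 44284813, 44302557, 44596500,
             44630682, 50058251, 50075994, 51338658, 51382629],
    "node": [2, 1, 2, 1, 3, 1, 2, 1, 3, 1],
    "txrx": ["TX", "RX"] * 5,
    "src":  [2, 2, 2, 2, 3, 3, 2, 2, 3, 3],
    "dest": [1] * 10,
    "txid": [1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "hops": [np.nan, 1.0, np.nan, 1.0, np.nan, 2.0, np.nan, 1.0, np.nan, 2.0],
})

print(df)
```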

3 Answers

User

#1 · edited 2024-09-28 19:20:11

A defaultdict approach
This may be faster for the OP's use case.
If speed matters, benchmark it on your own data. YMMV.

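from collections import defaultdict

import pandas as pd

# One inner dict per output column, each keyed by (src, dest, txid)
d = defaultdict(lambda: defaultdict(dict))
cols = 'tx_time  rx_time  src  dest  txid  hops'.split()

for t in df.itertuples():
    i = (t.src, t.dest, t.txid)
    # 'TX' rows fill tx_time, 'RX' rows fill rx_time
    d[t.txrx.lower() + '_time'][i] = t.time
    if pd.notnull(t.hops):
        d['hops'][i] = int(t.hops)

# The tuple keys become a MultiIndex; name its levels, then order the columns
pd.DataFrame(d).rename_axis(['src', 'dest', 'txid']) \
  .reset_index().reindex(columns=cols)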

    tx_time   rx_time  src  dest  txid  hops
0  34355146  34373907    2     1     1     1
1  44284813  44302557    2     1     2     1
2  50058251  50075994    2     1     3     1
3  44596500  44630682    3     1     2     2
4  51338658  51382629    3     1     3     2
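Since the answer suggests checking speed, here is a rough timing sketch using the standard-library `timeit`. The helper names `with_defaultdict` and `with_pivot` are made up for illustration, the sample frame is rebuilt so the block runs on its own, and the second function uses the `pivot_table` approach from another answer in this thread. Timings on ten rows say little about large inputs, so treat this only as a template to run on real data.

```python
import timeit
from collections import defaultdict

import numpy as np
import pandas as pd

# Sample frame matching the question's data (rebuilt so the sketch runs alone)
df = pd.DataFrame({
    "time": [34355146, 34373907, 44284813, 44302557, 44596500,
             44630682, 50058251, 50075994, 51338658, 51382629],
    "node": [2, 1, 2, 1, 3, 1, 2, 1, 3, 1],
    "txrx": ["TX", "RX"] * 5,
    "src":  [2, 2, 2, 2, 3, 3, 2, 2, 3, 3],
    "dest": [1] * 10,
    "txid": [1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "hops": [np.nan, 1.0, np.nan, 1.0, np.nan, 2.0, np.nan, 1.0, np.nan, 2.0],
})


def with_defaultdict(frame):
    """Row-by-row grouping into dicts keyed by (src, dest, txid)."""
    d = defaultdict(dict)
    for t in frame.itertuples():
        key = (t.src, t.dest, t.txid)
        d[t.txrx.lower() + "_time"][key] = t.time
        if pd.notnull(t.hops):
            d["hops"][key] = int(t.hops)
    return pd.DataFrame(d).rename_axis(["src", "dest", "txid"]).reset_index()


def with_pivot(frame):
    """Vectorised reshape: back-fill hops, then pivot TX/RX into columns."""
    return (frame.bfill()
                 .pivot_table(index=["src", "dest", "txid", "hops"],
                              columns="txrx", values="time")
                 .reset_index())


# Time each candidate over repeated calls and report the mean per call
runs = 50
for fn in (with_defaultdict, with_pivot):
    seconds = timeit.timeit(lambda: fn(df), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e3:.3f} ms per call")
```

To get a meaningful comparison, replace the sample `df` with a frame of realistic size before timing.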

User

#2 · edited 2024-09-28 19:20:11

Using concat, although I think @Wen's pivot solution is more efficient

df_tx = df[::2].reset_index().drop(['index', 'txrx', 'node'], axis = 1).rename(columns = {'time': 'tx_time'})
df_rx = df[1::2].reset_index().drop(['index', 'txrx', 'node'], axis = 1).rename(columns = {'time': 'rx_time'})

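# Transposing lets drop_duplicates remove the repeated key columns;
# dropna(axis=1) then drops the NaN-filled hops column from the TX half
pd.concat([df_tx, df_rx], axis=1).T.drop_duplicates().T.dropna(axis=1)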

You get

    tx_time     src dest    txid    rx_time     hops
0   34355146.0  2.0 1.0     1.0     34373907.0  1.0
1   44284813.0  2.0 1.0     2.0     44302557.0  1.0
2   44596500.0  3.0 1.0     2.0     44630682.0  2.0
3   50058251.0  2.0 1.0     3.0     50075994.0  1.0
4   51338658.0  3.0 1.0     3.0     51382629.0  2.0
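The transpose in the concat approach turns every column into floats, and slicing with `[::2]` and `[1::2]` relies on TX and RX rows strictly alternating. As a hedged alternative (a merge-based variant, not the method this answer uses), the two halves can be joined on the `src`, `dest`, `txid` keys, which keeps integer dtypes and does not depend on row order. The sample frame is rebuilt here so the sketch runs on its own:

```python
import numpy as np
import pandas as pd

# Sample frame matching the question's data
df = pd.DataFrame({
    "time": [34355146, 34373907, 44284813, 44302557, 44596500,
             44630682, 50058251, 50075994, 51338658, 51382629],
    "node": [2, 1, 2, 1, 3, 1, 2, 1, 3, 1],
    "txrx": ["TX", "RX"] * 5,
    "src":  [2, 2, 2, 2, 3, 3, 2, 2, 3, 3],
    "dest": [1] * 10,
    "txid": [1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "hops": [np.nan, 1.0, np.nan, 1.0, np.nan, 2.0, np.nan, 1.0, np.nan, 2.0],
})

keys = ["src", "dest", "txid"]

# Split by direction; only the RX rows carry a hop count
tx = df.loc[df["txrx"] == "TX", ["time"] + keys].rename(columns={"time": "tx_time"})
rx = df.loc[df["txrx"] == "RX", ["time", "hops"] + keys].rename(columns={"time": "rx_time"})

# Pair each transmission with its reception on the shared keys
merged = tx.merge(rx, on=keys, how="inner")
merged["hops"] = merged["hops"].astype(int)
merged = merged[["tx_time", "rx_time", "src", "dest", "txid", "hops"]]

print(merged)
```

This assumes each `(src, dest, txid)` combination appears at most once per direction; duplicated keys would produce extra rows in the merge.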

User

#3 · edited 2024-09-28 19:20:11

Using pivot_table

df.bfill().pivot_table(index=['src','dest','txid','hops'],columns=['txrx'],values='time').reset_index()
Out[766]: 
txrx  src  dest  txid  hops        RX        TX
0       2     1     1   1.0  34373907  34355146
1       2     1     2   1.0  44302557  44284813
2       2     1     3   1.0  50075994  50058251
3       3     1     2   2.0  44630682  44596500
4       3     1     3   2.0  51382629  51338658

Or using unstack

df.bfill().set_index(['src','dest','txid','hops','txrx']).time.unstack(-1).reset_index()
Out[768]: 
txrx  src  dest  txid  hops        RX        TX
0       2     1     1   1.0  34373907  34355146
1       2     1     2   1.0  44302557  44284813
2       2     1     3   1.0  50075994  50058251
3       3     1     2   2.0  44630682  44596500
4       3     1     3   2.0  51382629  51338658

PS: I did not add the .rename(columns={}) call here, since it would make the code too long...
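For completeness, here is one way the omitted rename step might look, as a sketch rather than the answerer's own code. It renames `TX`/`RX` to `tx_time`/`rx_time`, clears the leftover `txrx` columns label, casts back to integers, and sorts by `tx_time` to match the order of the desired output. The sample frame is rebuilt so the block runs on its own.

```python
import numpy as np
import pandas as pd

# Sample frame matching the question's data
df = pd.DataFrame({
    "time": [34355146, 34373907, 44284813, 44302557, 44596500,
             44630682, 50058251, 50075994, 51338658, 51382629],
    "node": [2, 1, 2, 1, 3, 1, 2, 1, 3, 1],
    "txrx": ["TX", "RX"] * 5,
    "src":  [2, 2, 2, 2, 3, 3, 2, 2, 3, 3],
    "dest": [1] * 10,
    "txid": [1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    "hops": [np.nan, 1.0, np.nan, 1.0, np.nan, 2.0, np.nan, 1.0, np.nan, 2.0],
})

cols = ["tx_time", "rx_time", "src", "dest", "txid", "hops"]

result = (
    df.bfill()  # copy each RX hop count up onto the preceding TX row
      .pivot_table(index=["src", "dest", "txid", "hops"],
                   columns="txrx", values="time")
      .reset_index()
      .rename(columns={"TX": "tx_time", "RX": "rx_time"})
)
result.columns.name = None  # drop the leftover 'txrx' label

# Restore integer dtypes, order columns, and sort by transmit time
result = (
    result.astype({"hops": int, "tx_time": int, "rx_time": int})[cols]
          .sort_values("tx_time")
          .reset_index(drop=True)
)

print(result)
```

Note that `bfill()` assumes every TX row is immediately followed by its matching RX row; if rows can be interleaved differently, the hop counts would need to be filled per `(src, dest, txid)` group instead.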
