Looping over a groupby in pandas to assign numbered names

import pandas as pd

d = {
    'ID': ['ID-1','ID-1','ID-1','ID-1','ID-2','ID-2','ID-2'],
    'OBR':[100,100,100,100,200,200,200],
    'OBX':['A','B','C','D','A','B','C'],
    'notes':['hello','hello2','','','bye','',''],
}
df = pd.DataFrame(d)

ID    OBR  OBX  notes
ID-1  100  A    hello
ID-1  100  B    hello2
ID-1  100  C
ID-1  100  D
ID-2  200  A    bye
ID-2  200  B
ID-2  200  C

count = 0
grouped = df.groupby(['ID','OBR'])
for a, group in grouped:
    ID = a[0]
    OBR = a[1]
    OBX+str(count) = group['OBX']  #this gives an error, can't use OBX+str(count) as the name
    note+str(count) = group['notes']  #this gives an error as well
    count +=1  #Is using count correct?
    print(....)
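The two assignments in the loop fail because the left-hand side of `=` has to be a name (or another valid assignment target), not an expression such as `OBX+str(count)`, so Python rejects the line with a SyntaxError. One common workaround, sketched here on the assumption that dictionary keys are an acceptable substitute for generated variable names, is to store each group's values in a dict. The name `named` is hypothetical and does not come from the question; `enumerate` stands in for the manual `count`.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': ['ID-1', 'ID-1', 'ID-1', 'ID-1', 'ID-2', 'ID-2', 'ID-2'],
    'OBR': [100, 100, 100, 100, 200, 200, 200],
    'OBX': ['A', 'B', 'C', 'D', 'A', 'B', 'C'],
    'notes': ['hello', 'hello2', '', '', 'bye', '', ''],
})

# `named` is a hypothetical container: the generated names become
# dict keys instead of variable names.
named = {}
for count, (key, group) in enumerate(df.groupby(['ID', 'OBR'])):
    # key is the (ID, OBR) tuple for this group
    named['OBX' + str(count)] = group['OBX'].tolist()
    named['notes' + str(count)] = group['notes'].tolist()

print(named['OBX0'])
print(named['notes1'])
```

With this layout the numbered names can be built and looked up at run time, which plain variable names do not allow.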

1 answer

User
Reply 1 · posted 2024-09-25 02:34:37

One approach is to group by the keys and aggregate each column into a tuple:
res = df.groupby(['ID', 'OBR'])\
        .agg({'OBX': lambda x: tuple(x), 'notes': lambda x: tuple(filter(None, x))})\
        .reset_index()

print(res)

     ID  OBR           OBX            notes
0  ID-1  100  (A, B, C, D)  (hello, hello2)
1  ID-2  200     (A, B, C)           (bye,)
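In the `notes` aggregation, `filter(None, x)` keeps only truthy values, which is why the empty strings disappear before the tuple is built. A minimal illustration on a plain list (the list `notes` is made up for this example):

```python
# Made-up sample mirroring the notes of the first group.
notes = ['hello', 'hello2', '', '']

# filter(None, ...) drops falsy items such as '' and None.
kept = tuple(filter(None, notes))
print(kept)
```

Note that NaN is truthy in Python, so missing values stored as NaN would not be removed this way.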
Then iterate over the rows and number the values with enumerate (if applicable):
for row in res.itertuples():
    print('\nID =', row.ID)
    print('OBR =', row.OBR)
    for i, obx in enumerate(row.OBX, 1):
        print('OBX'+str(i)+' =', obx)
    for i, note in enumerate(row.notes, 1):
        print('notes'+str(i)+' =', note)
Result:
ID = ID-1
OBR = 100
OBX1 = A
OBX2 = B
OBX3 = C
OBX4 = D
notes1 = hello
notes2 = hello2

ID = ID-2
OBR = 200
OBX1 = A
OBX2 = B
OBX3 = C
notes1 = bye
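If the goal is only to attach a running number to each row within its group, an alternative sketch (not part of the answer above) is `GroupBy.cumcount`, which numbers rows from 0 inside each group. The column names `n` and `OBX_name` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': ['ID-1', 'ID-1', 'ID-1', 'ID-1', 'ID-2', 'ID-2', 'ID-2'],
    'OBR': [100, 100, 100, 100, 200, 200, 200],
    'OBX': ['A', 'B', 'C', 'D', 'A', 'B', 'C'],
    'notes': ['hello', 'hello2', '', '', 'bye', '', ''],
})

# cumcount() numbers rows 0, 1, 2, ... within each (ID, OBR) group;
# adding 1 makes the numbering start at 1, as in the output above.
df['n'] = df.groupby(['ID', 'OBR']).cumcount() + 1

# Build the numbered label as a string column (hypothetical name).
df['OBX_name'] = 'OBX' + df['n'].astype(str)

print(df[['ID', 'OBR', 'OBX_name', 'OBX']])
```

This keeps the data in one row per observation, which may be easier to filter or pivot later than tuples stored in cells.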
