How can I use apply to iterate over a pandas DataFrame and access the next value?

import pandas as pd

x = {'depth': [1, 1, 2, 2, 2, 3, 3, 3, 3],
     "Component": ["A", "B", "C", "D", "E", "F", "G", "H", "I"]}
y = {'depth': [1, 1, 2, 2, 2, 3, 3, 3, 3],
     "Component": ["A", "B", "C", "D", "E", "F", "G", "H", "I"],
     "parent": ["None", "None", "B", "B", "B", "E", "E", "E", "E"]}
x = pd.DataFrame(x)
y = pd.DataFrame(y)

3 Answers

User

#1 · Edited 2024-05-04 21:10:31

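Let's use str.rsplit and np.where:

import numpy as np

# Split once from the right, so everything before the last '.' is the parent
s = x.Component.str.rsplit('.',n=1)
# Names without a '.' split into a single piece and get NaN as their parent
x['parent'] = np.where(s.str.len()>1,s.str[0],np.nan)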
x
Out[566]: 
   depth Component parent
0      1         A    NaN
1      1         B    NaN
2      2       B.1      B
3      2       B.2      B
4      2       B.3      B
5      3     B.3.1    B.3
6      3     B.3.2    B.3
7      3     B.3.3    B.3
8      3     B.3.4    B.3
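
To make the intermediate step visible, here is a minimal sketch of what the right-hand split produces before np.where picks the parent. The names `components`, `parts`, and `parent` are made up for illustration, and the sample values are copied from the output above.

```python
import numpy as np
import pandas as pd

# A few dotted component names copied from the output above.
components = pd.Series(["A", "B", "B.1", "B.3.1"])

# rsplit with n=1 splits at most once, starting from the right-hand side.
parts = components.str.rsplit(".", n=1)
print(parts.tolist())  # [['A'], ['B'], ['B', '1'], ['B.3', '1']]

# A name without a dot yields a one-element list, so it gets no parent.
parent = np.where(parts.str.len() > 1, parts.str[0], np.nan)
print(parent.tolist())
```

Splitting from the right matters for names such as B.3.1: a left-hand split would separate B from 3.1 and lose the B.3 prefix.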

User

#2 · Edited 2024-05-04 21:10:31

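Use str.extract:

x['parent'] = x['Component'].str.extract(r'(.*)\.')

The '*', '+', and '?' quantifiers are all greedy: they match as much text as possible, so the expression matches up to the last '.'. What extract returns is the text captured between the parentheses ( ).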

>>> x
   depth Component parent
0      1         A    NaN
1      1         B    NaN
2      2       B.1      B
3      2       B.2      B
4      2       B.3      B
5      3     B.3.1    B.3
6      3     B.3.2    B.3
7      3     B.3.3    B.3
8      3     B.3.4    B.3
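
To illustrate the greedy matching described above, this small sketch compares the greedy pattern with a non-greedy variant. The non-greedy pattern and the names `names`, `greedy`, and `lazy` are not part of the answer; they are added here only for contrast.

```python
import pandas as pd

names = pd.Series(["A", "B.1", "B.3.1"])

# Greedy: (.*) takes as much as it can, so the literal dot is the LAST '.'.
greedy = names.str.extract(r"(.*)\.", expand=False)

# Non-greedy: (.*?) takes as little as it can, so the dot is the FIRST '.'.
lazy = names.str.extract(r"(.*?)\.", expand=False)

print(pd.DataFrame({"name": names, "greedy": greedy, "lazy": lazy}))
```

For B.3.1 the greedy pattern captures B.3 while the non-greedy one captures only B, which is why the greedy form is the one that yields the immediate parent. Names without a dot do not match at all and come back as NaN.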

Old answer: use the str accessor:

x['parent'] = x['Component'].str.split('.') \
                            .str[:-1] \
                            .str.join('.') \
                            .replace('', pd.NA)

>>> x
   depth Component parent
0      1         A   <NA>
1      1         B   <NA>
2      2       B.1      B
3      2       B.2      B
4      2       B.3      B
5      3     B.3.1    B.3
6      3     B.3.2    B.3
7      3     B.3.3    B.3
8      3     B.3.4    B.3

User

#3 · Edited 2024-05-04 21:10:31

I think this is a working solution (one I came up with myself). If anyone knows a simpler way, please let me know.

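df = data  # `data` is the DataFrame with "Depth" and "Component" columns
subset = df.head(200).copy()
last_at_depth = dict()  # most recent component seen at each depth
parents = []
for _, row in subset.iterrows():
    depth = int(row["Depth"])
    last_at_depth[depth] = row["Component"]
    # Depth-1 rows have no parent; otherwise use the latest component one level up
    parents.append("NA" if depth == 1 else last_at_depth[depth - 1])

# iterrows yields copies, so writing to `row` would not change the DataFrame;
# collect the values and assign the whole column instead
subset["Parent Component"] = parents

display(subset)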
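
As a self-contained check of the depth-tracking idea, the sketch below runs it on the sample data from the question and compares the result with the parent column of y. It assumes the lowercase depth column used in the question, and the names `last_at_depth`, `parents`, and `expected` are illustrative only.

```python
import pandas as pd

# Sample input from the question; `expected` is the parent column of y.
x = pd.DataFrame({
    "depth": [1, 1, 2, 2, 2, 3, 3, 3, 3],
    "Component": ["A", "B", "C", "D", "E", "F", "G", "H", "I"],
})
expected = ["None", "None", "B", "B", "B", "E", "E", "E", "E"]

last_at_depth = {}  # most recent component seen at each depth
parents = []
for depth, component in zip(x["depth"], x["Component"]):
    # The parent is the latest component one level up, if any was seen.
    parents.append(last_at_depth.get(depth - 1, "None"))
    last_at_depth[depth] = component

x["parent"] = parents
print(x)
```

Iterating over the two columns with zip avoids iterrows entirely, and building the list first means the new column is assigned in one step rather than row by row.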
