Check whether the nth value in a DataFrame row equals the nth character of a string

df =

  c1   c2   c3 c4   c5
0  K    6  NaN  Y    V
1  H  NaN    g  5  NaN
2  U    B    g  Y    L

s = 'HKg5'

s1 = pd.Series(list(s), index=[f'c{x+1}' for x in range(len(s))])
df.loc[((df == s1) | (df.isna())).all(1)]

1 answer

User

#1 · Posted 2024-10-08 18:29:41

Create a helper Series from the string and filter with boolean logic:

s1 = pd.Series(list(s), index=[f'c{x+1}' for x in range(len(s))])

# print(s1)    
# c1    H
# c2    K
# c3    g
# c4    5
# dtype: object

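The logic is: a cell in df either equals (==) the corresponding character or (|) is NaN (isna).
Using all along axis 1 then returns the rows in which every value is True.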

df.loc[((df == s1) | (df.isna())).all(1)]

[out]

  c1   c2 c3 c4   c5
1  H  NaN  g  5  NaN
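
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame and applies the same filter. It assumes every cell in the question's frame is a string (so 6 and 5 are '6' and '5'). It also reindexes the helper Series to all of df's columns and uses df.eq instead of ==. That is my own adjustment, not part of the original answer: as far as I know, recent pandas versions can refuse a == comparison between a DataFrame and a Series whose labels differ.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question; all cells are assumed to be strings.
df = pd.DataFrame(
    {
        "c1": ["K", "H", "U"],
        "c2": ["6", np.nan, "B"],
        "c3": [np.nan, "g", "g"],
        "c4": ["Y", "5", "Y"],
        "c5": ["V", np.nan, "L"],
    }
)
s = "HKg5"

# Helper Series labelled c1..c4, then reindexed to every column of df so both
# operands carry identical labels (c5 has no character and becomes NaN).
s1 = pd.Series(list(s), index=[f"c{x+1}" for x in range(len(s))]).reindex(df.columns)

# True where a cell equals its character; NaN cells count as matches.
mask = (df.eq(s1, axis="columns") | df.isna()).all(axis=1)
result = df.loc[mask]
print(result)
```

Only row 1 survives, because its non-NaN cells line up with H, g and 5, and the remaining cells are NaN.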

So, as a function, you could write:

def df_match_string(frame, string):
    s1 = pd.Series(list(string), index=[f'c{x+1}' for x in range(len(string))])
    return ((frame == s1) | (frame.isna())).all(1)

df_match_string(df, s)

[out]

0    False
1     True
2    False
dtype: bool
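
The function above relies on the columns being named c1, c2, and so on. If your columns have other names, a positional variant may be easier to reuse. The sketch below is a hypothetical helper (match_by_position is my own name, not from the original answer). It assumes unique column names and a string no longer than the number of columns.

```python
import numpy as np
import pandas as pd


def match_by_position(frame: pd.DataFrame, string: str) -> pd.Series:
    """Hypothetical variant of df_match_string that pairs characters with columns by position."""
    # Pair the n-th character with the n-th column, whatever that column is called.
    chars = pd.Series(list(string), index=frame.columns[: len(string)])
    # Columns beyond the end of the string get NaN, so only NaN cells pass there.
    chars = chars.reindex(frame.columns)
    # A row matches when every cell equals its character or is NaN.
    return (frame.eq(chars, axis="columns") | frame.isna()).all(axis=1)


# Usage with arbitrary column names.
frame = pd.DataFrame({"a": ["H", "K"], "b": [np.nan, "K"], "c": ["g", "g"]})
matches = match_by_position(frame, "HKg")
print(matches)
```

Reindexing to frame.columns keeps both operands identically labelled, so the comparison does not depend on how pandas aligns mismatched labels.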

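Update

I could not reproduce your problem with the example you provided. My guess is that some values in the DataFrame have leading or trailing whitespace.

Before trying the solution above, try this preprocessing step: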

for col in df:
    df[col] = df[col].str.strip()
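
One caveat with the loop above: .str.strip() only works on string data. On a numeric column it raises an AttributeError, and in a mixed object column the non-string cells come back as NaN, which the isna() clause would then treat as a match. If your frame might contain non-string values, a more defensive sketch strips only actual str cells. strip_strings is a hypothetical helper name, not part of the original answer.

```python
import numpy as np
import pandas as pd


def strip_strings(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: strip whitespace from str cells and leave everything else alone."""
    cleaned = frame.copy()
    for col in cleaned.columns:
        # Only str values are stripped; NaN and numbers pass through unchanged.
        cleaned[col] = cleaned[col].map(lambda v: v.strip() if isinstance(v, str) else v)
    return cleaned


# Usage: the padded strings are cleaned, the numeric column is untouched.
raw = pd.DataFrame({"c1": [" H", "K "], "c2": [np.nan, 6]})
clean = strip_strings(raw)
print(clean)
```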
