Trouble selecting DataFrame rows using a substring

Python 2.7

In [3]: import pandas as pd
        df = pd.DataFrame(dict(A=['abc','abc','abc','xyz','xyz'],
                               B=['abcdef','abcdefghi','notthisone','uvwxyz','orthisone']))

In [4]: df
Out[4]:
     A          B
0  abc     abcdef
1  abc  abcdefghi
2  abc  notthisone
3  xyz     uvwxyz
4  xyz  orthisone

In [12]: df[df.B.str.contains(df.A) == True]  # just keep the B that contain A string
TypeError: 'Series' objects are mutable, thus they cannot be hashed

This is the result I am trying to get:

     A          B
0  abc     abcdef
1  abc  abcdefghi
3  xyz     uvwxyz

I tried str.contains, but it does not work. Any help is much appreciated.

3 Answers

User

Answer 1 · edited 2024-10-03 00:24:54

You can call unique on column 'A', then join the values with | to build a match pattern to use with str.contains:

In [15]:
df[df['B'].str.contains('|'.join(df['A'].unique()))]

Out[15]:
     A          B
0  abc     abcdef
1  abc  abcdefghi
3  xyz     uvwxyz
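
One thing to watch with the joined pattern: str.contains treats it as a regular expression by default, so a value in A containing characters like "." or "+" would not be matched literally. A minimal sketch, assuming the same df as in the question, that escapes each value with re.escape first (the variable names pattern and result are only illustrative):

```python
import re

import pandas as pd

df = pd.DataFrame({
    "A": ["abc", "abc", "abc", "xyz", "xyz"],
    "B": ["abcdef", "abcdefghi", "notthisone", "uvwxyz", "orthisone"],
})

# Escape each unique value so regex metacharacters are matched literally,
# then join the escaped values into one alternation pattern.
pattern = "|".join(re.escape(value) for value in df["A"].unique())

# Keep rows whose B contains any of the values found in column A.
result = df[df["B"].str.contains(pattern)]
print(result)
```

Note that this keeps a row when B contains any value from A, not necessarily the value on the same row. For the sample data the two give the same rows, but a row with A = 'abc' and B = 'xyz123' would also be kept.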

User

Answer 2 · edited 2024-10-03 00:24:54

Apply a lambda function to each row and test whether A is in B:

>>> df[df.apply(lambda x: x.A in x.B, axis=1)]
     A          B
0  abc     abcdef
1  abc  abcdefghi
3  xyz     uvwxyz
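
The same row-wise test can also be written without apply. This is only a sketch of an alternative, assuming both columns hold plain strings; it pairs the columns with zip and builds a boolean list, which usually avoids the per-row overhead of apply with axis=1 (mask and result are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["abc", "abc", "abc", "xyz", "xyz"],
    "B": ["abcdef", "abcdefghi", "notthisone", "uvwxyz", "orthisone"],
})

# Pair each row's A and B values and test plain substring membership.
mask = [a in b for a, b in zip(df["A"], df["B"])]

# A list of booleans with one entry per row works as a row filter.
result = df[mask]
print(result)
```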

User

Answer 3 · edited 2024-10-03 00:24:54

It doesn't look like str.contains supports multiple patterns, so you may just need to apply over the rows:

substr_matches = df.apply(lambda row: row['B'].find(row['A']) > -1, axis=1)

df.loc[substr_matches]
Out[11]: 
     A          B
0  abc     abcdef
1  abc  abcdefghi
3  xyz     uvwxyz
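
If either column can hold missing values, row['B'].find would raise on a non-string value. A rough sketch of a guarded version follows; contains_rowwise is a hypothetical helper, not part of pandas, and the isinstance check assumes Python 3 string types rather than the Python 2.7 of the question:

```python
import pandas as pd


def contains_rowwise(frame, needle_col, haystack_col):
    """Hypothetical helper: True where haystack_col contains needle_col on the same row."""
    def row_matches(row):
        needle, haystack = row[needle_col], row[haystack_col]
        # Non-string values (for example None or NaN) are treated as no match.
        if not isinstance(needle, str) or not isinstance(haystack, str):
            return False
        return haystack.find(needle) > -1

    return frame.apply(row_matches, axis=1)


# Sample data with missing values in both columns (made up for illustration).
df = pd.DataFrame({
    "A": ["abc", "abc", None, "xyz"],
    "B": ["abcdef", "notthisone", "uvwxyz", None],
})

substr_matches = contains_rowwise(df, "A", "B")
result = df.loc[substr_matches]
print(result)
```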
