How to find sentences in a DataFrame that contain a single-character word anywhere

lookfor = '[' + re.escape("A-Za-z") + ']'
tdata = pd.read_csv(fileinput, nrows=0).columns[0]
skip = int(tdata.count(' ') == 0)
tdata = pd.read_csv(fileinput, names=['sentences'], skiprows=skip)
filtered = tdata[tdata.sentences.str.contains(lookfor, regex=True, na=False)]
print(filtered)

#a sample set
-----------------------------
#hi, how are; you z
#im w good thanks
#How am I
#good, what about you
#my name is alex
#K hello, alex how are you !
#it is a car
#great news
#thanks!
-----------------------------

expected output
-----------------------------
#hi, how are; you z
#im w good thanks
#How am I
#K hello, alex how are you !
#it is a car
-----------------------------

2 Answers

User

Answer 1 · edited 2024-09-25 00:31:37

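Use Series.str.contains with a pattern that matches a one-character word between word boundaries, and filter with boolean indexing: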

df = df[df['sentences'].str.contains(r'\b\w{1}\b')]
print (df)
                     sentences
0           hi, how are; you z
1            im  w good thanks
2                    How  am I
5  K hello, alex how are you !
6                 it  is a car
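A self-contained sketch of the filter above, assuming the question's sample sentences sit in a single sentences column. The find_single_char_rows helper name is hypothetical and only there for illustration. Note that \w also matches digits and underscores, so a lone digit would count as a single-character word too.

```python
import pandas as pd

# Sample sentences from the question (single-spaced).
df = pd.DataFrame({"sentences": [
    "hi, how are; you z",
    "im w good thanks",
    "How am I",
    "good, what about you",
    "my name is alex",
    "K hello, alex how are you !",
    "it is a car",
    "great news",
    "thanks!",
]})


def find_single_char_rows(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep rows whose sentence has a one-character word anywhere (hypothetical helper)."""
    # \b\w\b: exactly one word character with a word boundary on each side.
    # na=False treats missing values as non-matches instead of propagating NaN.
    mask = frame["sentences"].str.contains(r"\b\w\b", regex=True, na=False)
    return frame[mask]


result = find_single_char_rows(df)
print(result)
```

Passing na=False keeps the mask strictly boolean, which matters if the CSV contains empty rows.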

Edit: To exclude A and I, you can use replace before the comparison:

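# regex=True is required in pandas 2.0+, where str.replace treats the pattern as a literal by default
df = df[df['sentences'].str.replace(r'\b[AI]\b', '', regex=True).str.contains(r'\b\w{1}\b')]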
print (df)
                     sentences
0           hi, how are; you z
1            im  w good thanks
5  K hello, alex how are you !
6                 it  is a car

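Or combine two masks (note that this drops every sentence containing a standalone A or I, even if it also has another single-character word):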

df = df[~df['sentences'].str.contains(r'\b[AI]\b') & 
         df['sentences'].str.contains(r'\b\w{1}\b')]
print (df)
                     sentences
0           hi, how are; you z
1            im  w good thanks
5  K hello, alex how are you !
6                 it  is a car
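The two variants agree on the sample data but are not equivalent in general. A minimal sketch of the difference, using made-up sentences (not from the question) where one row has both a standalone I and another single-character word:

```python
import pandas as pd

# Hypothetical data: row 0 has a standalone "I" and the one-character word "x".
df = pd.DataFrame({"sentences": ["I saw x", "How am I", "it is a car"]})
s = df["sentences"]

# Variant 1: remove standalone A/I first, then look for any remaining
# one-character word. regex=True is needed because pandas 2.0+ treats the
# pattern as a literal string by default.
stripped = s.str.replace(r"\b[AI]\b", "", regex=True)
variant1 = df[stripped.str.contains(r"\b\w\b", na=False)]

# Variant 2: reject every sentence that contains a standalone A/I at all,
# then require a one-character word.
variant2 = df[~s.str.contains(r"\b[AI]\b", na=False)
              & s.str.contains(r"\b\w\b", na=False)]

print(variant1)  # keeps "I saw x" (because of "x") and "it is a car"
print(variant2)  # keeps only "it is a car"
```

Which variant is appropriate depends on whether a sentence with I plus another single-character word should still be reported.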

User

Answer 2 · edited 2024-09-25 00:31:37

Try:

# Non-capturing groups (?:...) avoid the pandas UserWarning about match groups in str.contains
df.loc[df.sentences.str.contains(r"(?:[^\w]|^)\w(?:[^\w]|$)")]

Output:

                     sentences
0           hi, how are; you z
1            im  w good thanks
2                    How  am I
5  K hello, alex how are you !
6                 it  is a car
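As a rough check, assuming the question's sample sentences, this pattern (written here with non-capturing groups so pandas does not warn about match groups) selects the same rows as the \b\w\b pattern from the first answer:

```python
import pandas as pd

# Sample sentences from the question (single-spaced).
df = pd.DataFrame({"sentences": [
    "hi, how are; you z",
    "im w good thanks",
    "How am I",
    "good, what about you",
    "my name is alex",
    "K hello, alex how are you !",
    "it is a car",
    "great news",
    "thanks!",
]})

# One word character preceded by a non-word character or the start of the
# string, and followed by a non-word character or the end of the string.
mask_groups = df["sentences"].str.contains(r"(?:[^\w]|^)\w(?:[^\w]|$)", na=False)

# The word-boundary form expresses the same condition more compactly.
mask_boundary = df["sentences"].str.contains(r"\b\w\b", na=False)

print(df[mask_groups])
print(mask_groups.equals(mask_boundary))
```

For a contains-style test the two forms behave the same; the word-boundary version is shorter and does not consume the neighbouring characters.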
