How to find the strings that contain an underscore within the text of a DataFrame

User

Reply 1 · edited 2024-05-17 08:21:14

Try this to get the rows that contain an underscore:

df[df["Rhymes"].str.contains("_")]

Or this, to get only the values:

df.loc[df["Rhymes"].str.contains("_"), "Rhymes"].values
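As a self-contained illustration of the two expressions above, here is a minimal sketch. The sample data is borrowed from the second reply, and passing regex=False is an addition made here (an assumption about intent) so that the underscore is matched as a literal character rather than as a regular expression.

```python
import pandas as pd

# Sample data taken from the second reply in this thread.
df = pd.DataFrame({"Rhymes": ["Johny johny.yes_papa eating", "sugar",
                              "No papa.open_mouth_ha ha ha"]})

# Boolean mask: True where the cell contains a literal underscore.
mask = df["Rhymes"].str.contains("_", regex=False)

rows = df[mask]                          # whole rows that contain an underscore
values = df.loc[mask, "Rhymes"].values   # only the matching cells, as a NumPy array

print(rows)
print(values)
```

Note that both forms return the full cell text, not just the underscored word inside it.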

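User

Reply 2 · edited 2024-05-17 08:21:14

The str.contains method returns a Boolean Series; it does not return the strings you are after.

Instead, you can write a custom function based on str.split, apply it to your Series, drop the null values, and convert the result back to a DataFrame: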

import pandas as pd

df = pd.DataFrame({'Rhymes': ['Johny johny.yes_papa eating', 'sugar',
                              'No papa.open_mouth_ha ha ha']})

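def get_underscores(x):
    # Treat '.' as a separator, then return the first word containing '_'
    # (or None when the text has no such word).
    return next((i for i in x.replace('.',' ').split() if '_' in i), None)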

res = df['Rhymes'].apply(get_underscores).dropna().to_frame()

print(res)

          Rhymes
0       yes_papa
2  open_mouth_ha
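For comparison, a similar result can probably be reached without a Python-level function by using str.extract with a regular expression. This is a different technique from the one in this reply, and the pattern below is an assumption: it treats whitespace and '.' as word separators, mirroring the replace('.', ' ') step above.

```python
import pandas as pd

df = pd.DataFrame({"Rhymes": ["Johny johny.yes_papa eating", "sugar",
                              "No papa.open_mouth_ha ha ha"]})

# Assumed pattern: a run of characters that are neither whitespace nor '.',
# containing at least one underscore.
pattern = r"([^\s.]*_[^\s.]*)"

# expand=False yields a Series; rows without a match become NaN and are dropped.
res = (df["Rhymes"]
       .str.extract(pattern, expand=False)
       .dropna()
       .to_frame("Rhymes"))

print(res)
```

Like get_underscores, this keeps only the first underscored word in each row.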

User

Reply 3 · edited 2024-05-17 08:21:14

For a plain string, it should work like this:

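    string = "Johny johny yes_papa eating sugar No papa open_mouth_ha ha ha"

    def find_underscore(string):
        # Collect every whitespace-separated word that contains an underscore.
        # Testing with `in` adds each word once, even if it has several underscores.
        return [word for word in string.split() if '_' in word]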

For a DataFrame column:

    new_list = []
    for index, row in df.iterrows():
        matches = find_underscore(row["column_name"])
        print(matches)
        new_list.append(matches)

    # Use bracket assignment; attribute assignment does not create a new column.
    df["new_column"] = new_list
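The same per-row logic can also be sketched with Series.apply instead of an explicit iterrows loop. The helper name words_with_underscore and the sample rows are hypothetical, chosen only for illustration; column_name and new_column follow the placeholders used above.

```python
import pandas as pd

def words_with_underscore(text):
    # Hypothetical helper: every whitespace-separated word containing '_'.
    return [word for word in text.split() if "_" in word]

# Illustrative data (an assumption, adapted from the strings in this thread).
df = pd.DataFrame({"column_name": ["Johny johny yes_papa eating",
                                   "sugar",
                                   "No papa open_mouth_ha ha ha"]})

# apply calls the helper once per cell and returns a Series aligned with df.
df["new_column"] = df["column_name"].apply(words_with_underscore)

print(df)
```

Rows without any underscored word get an empty list rather than a missing value.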
