How do I remove the text between two strings, for example "stringx" and "stringy", when these two strings may appear multiple times in a DataFrame?

import pandas as pd

df = pd.read_excel("sample_excel.xlsx", header=None)

def removeString(df):
    inf = df[0][1]
    infcopy = ''
    bol = False  # True while we are inside a *start* ... *stop* span
    start = '*start*'
    end = '*stop*'
    # str.replace returns a new string, so the result must be assigned back
    inf = inf.replace('* start *', start)  # in case of blank spaces inside the start marker
    inf = inf.replace('* stop *', end)  # in case of blank spaces inside the stop marker
    for i in range(len(inf)):
        if inf[i] == "*" and inf[i:i+len(start)] == start:
            bol = True
        if inf[i] == '*' and inf[i+1-len(end):i+1] == end:
            bol = False
            continue
        if bol == False:
            infcopy += inf[i]
    df[0][1] = infcopy

1 Answer

User

#1 · Posted 2024-06-02 09:57:18

I think it could look something like this.

import pandas as pd
import re

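def removeString(df, ColToRemove):
    # ColToRemove is the label of the column to clean
    # .*? is non-greedy, so each "start" is paired with the nearest following "stop"
    pattern = r'(?:start(.*?)stop)'
    df[ColToRemove] = df[ColToRemove].apply(lambda x: re.sub(pattern, "", x))
    return df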

For example:

df = pd.DataFrame({'Col1':['startjustsomethingherestop']})

Output:

                         Col1
0  startjustsomethingherestop

Then:

pattern = r'(?:start(.*?)stop)'
df['Col1'] = df['Col1'].apply(lambda x: re.sub(pattern, "", x))

Output:

  Col1
0

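The regex pattern defined here matches every substring that begins with "start" and ends with "stop", and re.sub replaces each match with an empty string, so the markers and everything between them are removed. Because the match is non-greedy, several start/stop pairs in the same cell are each removed separately, and the text between the pairs is kept.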
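For the markers actually used in the question, the asterisks have to be escaped, since "*" is a regex quantifier. Here is a rough sketch of how that might look. It assumes the markers are "*start*" and "*stop*", optionally written with spaces as "* start *" and "* stop *", as in the question's code; the names MARKED_SPAN and remove_marked_spans are made up for this example.

```python
import re

# Assumed marker format, taken from the question: "*start*" and "*stop*",
# optionally written with inner spaces ("* start *", "* stop *").
# \* matches a literal asterisk; .*? is non-greedy so each start marker
# pairs with the nearest stop marker; re.DOTALL lets "." cross line breaks.
MARKED_SPAN = re.compile(r"\*\s*start\s*\*.*?\*\s*stop\s*\*", re.DOTALL)

def remove_marked_spans(text):
    """Return text with every start...stop span removed, markers included."""
    return MARKED_SPAN.sub("", text)

sample = "keep1 *start* drop me *stop* keep2 * start *drop\nagain* stop * keep3"
cleaned = remove_marked_spans(sample)
print(repr(cleaned))
```

On a DataFrame this can be applied to a column the same way as in the answer above, for example df['Col1'].apply(remove_marked_spans). Cells that are not strings (such as NaN) would need to be handled first, because re works on strings only.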
