Repeatedly replacing part of a string in a loop

def find_all(a_str, sub):
    start = 0
    while True:
        start = a_str.find(sub, start)
        if start == -1: return
        yield start
        start += len(sub) # use start += 1 to find overlapping matches

def replace_string(index1, index2, mainstring):
    replacementstring = ''
    return mainstring.replace(mainstring[index1:index2], replacementstring)

def strip_images(html):
    begin_indexes = list(find_all(html, '<DESCRIPTION>GRAPHIC'))
    end_indexes = list(find_all(html, '</TEXT>'))
    for i in range(len(begin_indexes)):
        if begin_indexes[i] > end_indexes[i]:
            end_indexes.pop(0)
        else:
            if len(begin_indexes) == len(end_indexes):
                break
    for i in range(len(begin_indexes)):
        #code problem is here--
        newhtml = replace_string(begin_indexes[i],end_indexes[i], html)
        if i == len(begin_indexes) - 1:
            return newhtml #code only returns one iteration

var = strip_images(html)
print var

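2 Answers

User

Answer 1 · edited 2024-09-27 09:28:34

I got it working; the code snippet is below. It isn't pretty, but it does the job of deleting the text between the two markers: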

def find_all(a_str, sub):
    # Yield the start index of every non-overlapping occurrence of sub
    start = 0
    while True:
        start = a_str.find(sub, start)
        if start == -1:
            return
        yield start
        start += len(sub)  # use start += 1 to find overlapping matches

def strip_images(html):
    begin_indexes = list(find_all(html, '<DESCRIPTION>GRAPHIC'))
    end_indexes = list(find_all(html, '</TEXT>'))
    # Drop end markers that occur before the begin marker they are paired with
    for i in range(len(begin_indexes)):
        if begin_indexes[i] > end_indexes[i]:
            end_indexes.pop(0)
        else:
            if len(begin_indexes) == len(end_indexes):
                break

    # Delete from the last span to the first so the earlier indexes stay valid
    newhtml = list(html)
    for begin, end in zip(begin_indexes[::-1], end_indexes[::-1]):
        del newhtml[begin:end]
    # Returned after the loop, so input without any markers comes back unchanged
    return ''.join(newhtml)
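
As an alternative to the index bookkeeping above, the same kind of removal can be sketched with a regular expression. This swaps in a different technique and is not what the answer's author wrote; the function name `strip_images_regex` is made up for illustration, and the sketch assumes every `<DESCRIPTION>GRAPHIC` marker is eventually followed by a `</TEXT>` marker. Like the code above, it keeps the `</TEXT>` marker itself and removes only what precedes it.

```python
import re

# Lazily match from the begin marker up to (but not including) the next '</TEXT>'.
# re.DOTALL lets '.' match newlines, since the removed block may span several lines.
GRAPHIC_BLOCK = re.compile(r'<DESCRIPTION>GRAPHIC.*?(?=</TEXT>)', re.DOTALL)

def strip_images_regex(html):
    """Hypothetical helper: remove every GRAPHIC block, keeping the '</TEXT>' marker."""
    return GRAPHIC_BLOCK.sub('', html)

sample = 'A<DESCRIPTION>GRAPHIC img1</TEXT>B<DESCRIPTION>GRAPHIC img2</TEXT>C'
print(strip_images_regex(sample))
```

On well-formed input this should give the same result as deleting the index ranges, but it may behave differently when the markers are unbalanced.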

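User

Answer 2 · edited 2024-09-27 09:28:34

Your immediate problem is that html never changes inside the loop. As a result, no matter how long the lists are, every iteration works on the original input, and the value you return reflects only a single replacement.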
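
The effect can be reproduced on a tiny input. The sketch below reuses the `replace_string` helper from the question; the sample string and the `spans` list are made up for illustration.

```python
def replace_string(index1, index2, mainstring):
    # Same helper as in the question: remove the text found at [index1:index2]
    return mainstring.replace(mainstring[index1:index2], '')

html = 'aa[X]bb[Y]cc'
spans = [(2, 5), (7, 10)]  # positions of '[X]' and '[Y]' in html

# Buggy pattern: each call starts again from the untouched html,
# so the result of the previous iteration is thrown away.
for begin, end in spans:
    newhtml = replace_string(begin, end, html)

print(newhtml)  # only '[Y]' is gone; '[X]' is still there
```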

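The solution follows these steps:

Assign the string its original value before the loop
Inside the loop, pass in the current content and keep the returned, replaced string
Return from the function after the loop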

# inside strip_images, once begin_indexes and end_indexes have been computed
newhtml = html
for begin, end in zip(begin_indexes, end_indexes):
    newhtml = replace_string(begin, end, newhtml)
return newhtml
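
One caveat with the loop above: the indexes were computed on the original string, so after the first removal they no longer line up with the shortened `newhtml`. A minimal sketch of one way around this, assuming the spans do not overlap, is to remove them from last to first with slicing. The name `remove_spans` is hypothetical and is not part of the original answer.

```python
def remove_spans(text, spans):
    """Hypothetical helper: return text with every (begin, end) slice removed.

    Assumes the spans are non-overlapping index pairs into the original text.
    """
    result = text                      # 1. start from the original value
    # Work from the last span to the first: cutting near the end of the
    # string does not shift the positions of spans that come earlier.
    for begin, end in sorted(spans, reverse=True):
        result = result[:begin] + result[end:]   # 2. keep the updated string
    return result                      # 3. return after the loop

print(remove_spans('aa[X]bb[Y]cc', [(2, 5), (7, 10)]))
```

Slicing is used here instead of `str.replace` because `replace` removes every occurrence of the sliced text, not just the one at the given position.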
