Python and re.compile return inconsistent results

<p> <a href="../personal-autonomy/">autonomy: personal</a> | <a href="../principle-beneficence/">beneficence, principle of</a> | <a href="../decision-capacity/">decision-making capacity</a> | <a href="../legal-obligation/">legal obligation and authority</a> | <a href="../paternalism/">paternalism</a> | <a href="../identity-personal/">personal identity</a> | <a href="../identity-ethics/">personal identity: and ethics</a> | <a href="../respect/">respect</a> | <a href="../well-being/">well-being</a> </p>

3 Answers

User

#1 · Edited 2024-10-01 00:34:36

I think I found the problem

reg = re.compile(r'<a href="../(.*?)">')
for match in re.findall(reg, input_html):
    output_html = input_html.replace(match, match+'index.html')

Here you modify "input_html" inside the for loop and then search that same "input_html" again with the regex, which is the bug :)
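As quoted above, the loop also rebuilds output_html from input_html on every iteration, so only the last replacement would survive. One way to avoid both issues is to rewrite all links in a single pass with re.sub. This is only a minimal sketch: the shortened input_html sample, the name link_re, and the stricter pattern are assumptions for illustration, not the original poster's code.

```python
import re

# Shortened sample based on the HTML in the question (assumed for illustration).
input_html = (
    '<p> <a href="../personal-autonomy/">autonomy: personal</a> | '
    '<a href="../well-being/">well-being</a> </p>'
)

# Group 1: the opening of the tag up to and including the directory slash.
# Group 2: the closing quote and bracket. The two leading dots are escaped.
link_re = re.compile(r'(<a href="\.\./[^"/]+/)(">)')

# re.sub scans the original string once and returns a new string,
# so nothing is modified while it is still being searched.
output_html = link_re.sub(r'\g<1>index.html\g<2>', input_html)

print(output_html)
```

Because the pattern anchors on the href attribute, link text that happens to contain the same directory name is left alone.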

User

#2 · Edited 2024-10-01 00:34:36

Don't parse HTML with regexes:

import re
from lxml import html

def replace_link(link):
    # Append index.html only to links of the form ../name/
    if re.match(r"\.\./[^/]+/$", link):
        link += "index.html"
    return link

print(html.rewrite_links(your_html_text, replace_link))

User

#3 · Edited 2024-10-01 00:34:36

Could the problem be that you didn't escape the first two dots (.)?

reg = re.compile(r'<a[ ]href="[.][.]/(.*?)">')

But I would try using lxml instead.
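To illustrate the escaping point: an unescaped dot matches any character, so the original pattern also accepts links that do not start with "../". The sample string and variable names below are hypothetical and chosen only to show the difference.

```python
import re

# Pattern from the question: each "." matches any single character.
unescaped = re.compile(r'<a href="../(.*?)">')

# Pattern from this answer: "[.]" matches only a literal dot.
escaped = re.compile(r'<a[ ]href="[.][.]/(.*?)">')

# Hypothetical link whose href starts with "ab/" rather than "../".
sample = '<a href="ab/page/">'

unescaped_hit = unescaped.search(sample) is not None
escaped_hit = escaped.search(sample) is not None

print(unescaped_hit, escaped_hit)
```

The unescaped pattern matches the sample while the escaped one does not, which is why escaping the dots makes the match stricter.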
