Parsing an HTML file with Beautiful Soup in Python

Posted 2024-10-01 07:20:00


I am using BeautifulSoup to replace every comma in an HTML file with &sbquo;. Here is my code:

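import re
import sys

from bs4 import BeautifulSoup

# Read the HTML file named on the command line
with open(sys.argv[1], "r") as f:
    data = f.read()

soup = BeautifulSoup(data, "html.parser")

comma = re.compile(',')

# Replace commas in every text node that contains one
for t in soup.find_all(string=comma):
    t.replace_with(t.replace(',', '&sbquo;'))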

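This code works fine unless the HTML file contains JavaScript. In that case it also replaces the commas (,) inside the JavaScript code, which I do not want. I only want to replace commas in the text content of the HTML file.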


1 Answer

User

#1 · Posted 2024-10-01 07:20:00

soup.find_all (findAll in older releases) can take a callable:

tags_to_skip = {"script", "style"}
# Add to this set as needed

def valid_text(text):
    """Filter text nodes on the basis of their parent tag's name

    If the parent tag name is found in ``tags_to_skip`` then
    the text node is dropped.  Otherwise, it is kept.
    """
    return text.parent.name.lower() not in tags_to_skip

# Passing the callable as ``string`` makes the search visit text nodes,
# so ``t`` is a string that supports ``replace``
for t in soup.find_all(string=valid_text):
    t.replace_with(t.replace(',', '&sbquo;'))
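For anyone who wants to try the idea end to end, here is a minimal self-contained sketch. It assumes Beautiful Soup 4 (the bs4 package) and the built-in "html.parser". The names replace_commas, is_replaceable and sample are made up for illustration. One deliberate difference from the code above: it inserts the actual character U+201A instead of the literal text '&sbquo;', because as far as I know Beautiful Soup escapes "&" when it writes text back out, so the literal would end up as &amp;sbquo; in the saved file.

```python
from bs4 import BeautifulSoup

# Tags whose text should be left untouched (same set as in the answer above).
TAGS_TO_SKIP = {"script", "style"}


def replace_commas(html):
    """Return a soup in which commas outside script/style become U+201A."""
    soup = BeautifulSoup(html, "html.parser")

    def is_replaceable(text):
        # Called once per text node; ``text.parent`` is the enclosing tag.
        return "," in text and text.parent.name.lower() not in TAGS_TO_SKIP

    for node in soup.find_all(string=is_replaceable):
        # Insert the real character rather than the literal "&sbquo;"
        # (assumption: "&" in text is escaped on output).
        node.replace_with(node.replace(",", "\u201a"))
    return soup


# Hypothetical input, for illustration only.
sample = (
    "<html><head><script>var a = [1, 2];</script></head>"
    "<body><p>Hello, world</p></body></html>"
)
soup = replace_commas(sample)
print(soup.p.string)       # text content, commas replaced with U+201A
print(soup.script.string)  # JavaScript, left as it was
```

If the named entity itself is needed in the output file, serializing with soup.decode(formatter="html") asks Beautiful Soup to emit named entities where it can. It is worth checking the result with your installed version.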
