Python: delete specific lines from a file

Posted 2024-10-02 22:33:29


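I want to delete specific lines from this HTML file. I want to find the line containing the string STARTDELETE and then remove everything from the line after it up to the line before the one containing ENDDELETE.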

To make this clearer, I have marked the lines to be deleted with "xxx". How can I do this with Python?

<!DOCTYPE html>
<html lang="en">
<head>
  <title>Bootstrap Example</title>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/css/bootstrap.min.css">
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
  <script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.7/js/bootstrap.min.js"></script>
</head>
<body>
  <div class="container">
    <h2>Image Gallery</h2>
    <div class="row"> <!--STARTDELETE-->
      xxx<div class="col-xs-3">
        xxx<div class="thumbnail">
          xxx<a href="/w3images/lights.jpg" target="_blank">
          xxx<img  style="padding: 20px" src="xxx" alt="bla" >
          xxx<div class="caption">
            xxx<p>Test</p>
          xxx</div>
        xxx</a>
        xxx</div>
      xxx</div>
    </div> <!--ENDDELETE-->
  </div>
</body>
</html>

2 Answers

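You can first copy and paste your markup into an input file, say "input.txt", and then write only the lines you want to keep to "output.txt", skipping the lines to be deleted.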

delete = False
# Copy input.txt to output.txt, skipping the lines between the two markers.
# Both files are closed automatically when the with block ends.
with open("input.txt") as infile, open("output.txt", "w") as outfile:
    for line in infile:
        if "<!--ENDDELETE-->" in line:
            delete = False  # stop deleting; the ENDDELETE line itself is kept
        if not delete:
            outfile.write(line)
        if "<!--STARTDELETE-->" in line:
            delete = True  # start deleting from the next line
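
The flag-based loop above can also be wrapped in a small function that works on any iterable of lines, which makes it easier to try out without touching files. This is only a minimal sketch: the name `strip_between_markers`, its parameters, and the sample lines are made up for illustration, and it assumes each marker sits on its own line of the input, as in the question's HTML.

```python
def strip_between_markers(lines, start="<!--STARTDELETE-->", end="<!--ENDDELETE-->"):
    """Return the lines with everything strictly between the marker lines removed.

    The lines that contain the markers themselves are kept.
    (Hypothetical helper, not part of any library.)
    """
    kept = []
    deleting = False
    for line in lines:
        if end in line:
            deleting = False  # the line holding the end marker is kept
        if not deleting:
            kept.append(line)
        if start in line:
            deleting = True  # drop every line after the start marker line
    return kept


# Illustrative input only, shaped like the HTML in the question.
sample = [
    '<div class="row"> <!--STARTDELETE-->\n',
    '  <div class="col-xs-3">\n',
    '  </div>\n',
    '</div> <!--ENDDELETE-->\n',
    '</div>\n',
]
result = strip_between_markers(sample)
print("".join(result), end="")
```

Because an open file object is itself an iterable of lines, the same function could be called as `strip_between_markers(infile)` and the returned list written out with `outfile.writelines(...)`.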

Hope this helps!

Install beautifulsoup4 (an HTML parser and DOM manipulator).

Read the data with BeautifulSoup to get a "DOM" (sort of... a walkable structure), find the item you want to empty, and then remove its children.

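In your example, it looks like you want to empty the <div>(s) whose class is row, right? Let's assume your HTML data is stored in a file named data.html (in your particular case that may not be so... it could be the body of a request or something similar).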

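from bs4 import BeautifulSoup

with open('data.html', 'r') as page_f:
    soup = BeautifulSoup(page_f.read(), "html.parser")
    # In `soup` we have our "DOM tree"

# Find the first <div class="row"> and remove its child tags
div_to_empty = soup.find("div", {'class': 'row'})
for child in div_to_empty.find_all(recursive=False):
    # Direct children only: decompose() also destroys their descendants,
    # so nested tags must not be decomposed a second time
    child.decompose()

print(soup.prettify())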

This prints the document with the child elements of the row div removed.

If you are going to do DOM manipulation, I strongly recommend reading up on and using Beautiful Soup (it is very powerful).
