pythonxpath遍历段落并抓取<strong>

<div id="content_third"> <h3>Title1</h3> <p> <strong>District</strong> John Q Public <br> Susie B Private <p> <p> <strong>District</strong> Anna C Public <br> Bob J Private <p> <h3>Title1</h3> <p> <strong>District</strong> John Q Public <br> Susie B Private <p> <p> <strong>District</strong> Anna C Public <br> Bob J Private <p> </div>

1条回答

网友

1楼 · 发布于 2024-06-24 12:06:09

我假设每个地区都是某个实际名称的占位符，因此要获得每个地区比您要做的要简单得多，只需从每个strong中提取文本：

h = """<div id="content_third">
 <h3>Title1</h3>
 <p>
  <strong>District</strong>
  John Q Public <br>
  Susie B Private
 <p>
 <p>
  <strong>District</strong>
  Anna C Public <br>
  Bob J Private
 <p>
 <h3>Title1</h3>
 <p>
  <strong>District</strong>
  John Q Public <br>
  Susie B Private
 <p>
 <p>
  <strong>District</strong>
  Anna C Public <br>
  Bob J Private
 <p>
</div>"""

from lxml import html

tree = html.fromstring(h)

print(tree.xpath('//*[@id="content_third"]/p/strong/text()'))

相关问题更多 >

编程相关推荐

热门问题

热门文章

pythonxpath遍历段落并抓取<strong>

相关问题 更多 >

编程相关推荐

热门问题

热门文章

相关问题更多 >