BeautifulSoup doesn't find correctly parsed elements

2 Answers

User

#1 · edited 2024-09-29 23:20:29

Even though the correct answer is "use a different parser" (thanks @alecxe), I have another workaround. For some reason, this also works:

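from bs4 import BeautifulSoup

# Parse once, then re-parse the prettified output of the first pass.
soup = BeautifulSoup(document, "html5lib")
soup = BeautifulSoup(soup.prettify(), "html5lib")
print(soup.find_all('a'))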

It returns the same list of links:

[output not preserved in the source]

User

#2 · edited 2024-09-29 23:20:29

When you parse complex, badly formed HTML, the choice of parser matters a great deal. From the Beautiful Soup documentation:

There are also differences between HTML parsers. If you give Beautiful Soup a perfectly-formed HTML document, these differences won’t matter. One parser will be faster than another, but they’ll all give you a data structure that looks exactly like the original HTML document.
But if the document is not perfectly-formed, different parsers will give different results.

html.parser worked for me:

from bs4 import BeautifulSoup
import requests

document = requests.get('http://www.wvdnr.gov/').content
soup = BeautifulSoup(document, "html.parser")
print(soup.find_all('a'))

Demo:

[output not preserved in the source]
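To see the parser differences for yourself, here is a minimal sketch that feeds the same malformed snippet to each parser and prints the tree it builds. The snippet `<a></p>` and the helper names `parse_with` and `compare_parsers` are my own illustration, not part of the original question. It assumes bs4 is installed, and treats lxml and html5lib as optional, skipping any that are missing.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Deliberately malformed markup: a stray </p> with no opening <p>.
MALFORMED = "<a></p>"

# "html.parser" ships with Python; "lxml" and "html5lib" are optional extras.
PARSERS = ("html.parser", "lxml", "html5lib")


def parse_with(parser_name, markup=MALFORMED):
    """Return the parsed tree as a string, or None if the parser is missing."""
    try:
        return str(BeautifulSoup(markup, parser_name))
    except FeatureNotFound:
        # bs4 raises this when the requested parser library is not installed.
        return None


def compare_parsers(markup=MALFORMED):
    """Map each parser name to the tree it builds from the same markup."""
    return {name: parse_with(name, markup) for name in PARSERS}


if __name__ == "__main__":
    for name, tree in compare_parsers().items():
        print(name, "->", tree if tree is not None else "(not installed)")
```

If the trees differ on a tiny snippet like this, they can differ far more on a real page, which is probably why find_all('a') returned different results depending on the parser. Running the same comparison on your own document is a quick way to pick the parser that keeps the elements you need.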

See also:

Differences between parsers
