Wrong value extracted when web scraping with Python BeautifulSoup

<div class="result-value" data-reactid=".0.0.3.0.0.3.$0.1.1"> <span data-reactid=".0.0.3.0.0.3.$0.1.1.0">1.1</span> <span class="result-value-unit" data-reactid=".0.0.3.0.0.3.$0.1.1.1">MB</span> </div>

2 answers

User

#1 · edited 2024-09-27 09:23:55

Assuming you know the value of data-reactid, you can select the correct element like this:

soup.findAll("span", {"data-reactid": ".0.0.3.0.0.3.$0.1.1.0"})
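As a minimal sketch of that lookup applied to the markup from the question: the snippet below parses the HTML inline, so no network access is needed. The helper name `extract_value` and the use of the built-in `html.parser` backend are assumptions made for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# The markup from the question, embedded as a string for the example.
html = (
    '<div class="result-value" data-reactid=".0.0.3.0.0.3.$0.1.1"> '
    '<span data-reactid=".0.0.3.0.0.3.$0.1.1.0">1.1</span> '
    '<span class="result-value-unit" data-reactid=".0.0.3.0.0.3.$0.1.1.1">MB</span> '
    '</div>'
)

soup = BeautifulSoup(html, "html.parser")


def extract_value(soup):
    """Hypothetical helper: return the text of the value span, or None if it is missing."""
    span = soup.find("span", {"data-reactid": ".0.0.3.0.0.3.$0.1.1.0"})
    if span is None:
        return None
    return span.get_text(strip=True)


value = extract_value(soup)

# The unit can be selected by its class instead of the data-reactid.
unit_span = soup.find("span", class_="result-value-unit")
unit = unit_span.get_text(strip=True) if unit_span else None

print(value, unit)
```

Note that findAll returns a list of matches, while find returns the first match or None, so checking for None before reading the text avoids an AttributeError when the element is absent.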

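User

#2 · edited 2024-09-27 09:23:55

Likewise, if soup.find('span', {'data-reactid': '.0.0.3.0.0.3.$0.1.1.0'}).text works, the code will not return any error message. The message you are getting at least shows that your try...except block is working. My guess is that the problem lies in your htmlfile, which is probably bytes rather than str. I suggest changing your code as follows: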

from urllib.request import urlopen
from bs4 import BeautifulSoup

# Decode the response bytes to str before parsing.
# If errors occur here, try: htmlfile = urlopen(url).read().decode('utf-8', errors='ignore')
htmlfile = urlopen(url).read().decode('utf-8')

soup = BeautifulSoup(htmlfile, 'lxml')

Then continue with the rest of your code.
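The decode step with its fallback can be sketched as a small function. The name `decode_html` is hypothetical, and the example assumes the page is meant to be UTF-8; it works on any bytes object, so it can be tried without fetching a URL.

```python
def decode_html(raw: bytes, encoding: str = "utf-8") -> str:
    """Hypothetical helper: decode response bytes, dropping undecodable bytes as a fallback."""
    try:
        # Strict decoding first, so clean input is returned unchanged.
        return raw.decode(encoding)
    except UnicodeDecodeError:
        # Fallback suggested in the answer: silently skip invalid bytes.
        return raw.decode(encoding, errors="ignore")


clean = decode_html(b"<span>1.1</span>")
damaged = decode_html(b"1.1\xff MB")  # \xff is not valid UTF-8
print(clean)
print(damaged)
```

With errors='ignore' the invalid bytes are discarded, so some characters may be lost; that is usually acceptable when only a numeric value is being extracted.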
