Trying to crawl a website, but not getting the <body>

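import urllib
# Assumed import (BeautifulSoup 3); the original post does not show it.
from BeautifulSoup import BeautifulSoup

def loadUrl(adress):
    # Python 2 code, as in the original post
    adress = urllib.unquote(adress)
    print("Loading " + adress)
    socket = urllib.urlopen(adress)
    html = socket.read()
    socket.close()
    soup = BeautifulSoup(html)
    return soup

soup = loadUrl("http://de.pokerstrategy.com/forum/thread.php?threadid=498111")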

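3 answers

User

Answer 1 · Edited 2024-10-05 12:20:23

Edit: Sorry, I didn't realize you had already posted the URL you are trying to retrieve. I get the same response as you do, and I'm not sure why. I don't see anything in the JavaScript that would explain it, despite what I suggested below.

I tested your code and it seems to work fine. The page you are trying to retrieve may generate its body element through JavaScript or something similar. In that case, I believe you could use something like Selenium to emulate a browser.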
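One way to check the JavaScript hypothesis before reaching for a browser emulator is to look at what the raw markup actually contains. The following is a minimal sketch, not part of the original answer: it is written for Python 3 using only the standard library, and the names `TagCounter` and `summarize_html` are hypothetical helpers introduced here for illustration.

```python
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts start tags so we can see what the raw markup contains."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        self.counts[tag] = self.counts.get(tag, 0) + 1


def summarize_html(html):
    """Return how many <body> and <script> start tags appear in `html` (a str)."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return {
        "body": parser.counts.get("body", 0),
        "script": parser.counts.get("script", 0),
    }
```

If the downloaded markup reports zero `body` tags but one or more `script` tags, that would be consistent with the body being built on the client side, although it does not prove it. If a `body` tag is present in the raw markup, the problem more likely lies in how the page is parsed.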

User

Answer 2 · Edited 2024-10-05 12:20:23

Alternatively, I'd suggest using PyQuery.

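from pyquery import PyQuery

# Passing url= makes PyQuery fetch and parse the page
d = PyQuery(url="http://de.pokerstrategy.com/forum/thread.php?threadid=498111")

# print() with parentheses works on both Python 2 and Python 3
print(d("body").html())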

User

Answer 3 · Edited 2024-10-05 12:20:23

I have successfully used BeautifulSoup together with urllib2, for example:

from urllib2 import urlopen
...
html = urlopen(...)
soup = BeautifulSoup(html)
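Note that `urllib2` only exists in Python 2; in Python 3 its functionality lives in `urllib.request`. As a rough sketch of the same fetch step on Python 3 (not taken from the answer above), the hypothetical helper `load_html` below downloads a page and decodes it to text. The 10-second timeout and the UTF-8 fallback are assumptions chosen for illustration.

```python
from urllib.request import urlopen


def load_html(url, timeout=10):
    """Fetch `url` and return the response body decoded to str."""
    with urlopen(url, timeout=timeout) as response:
        raw = response.read()
        # Use the charset from the Content-Type header if one was sent;
        # falling back to UTF-8 is an assumption, not something the thread states.
        charset = response.headers.get_content_charset() or "utf-8"
    # Replace undecodable bytes instead of raising, so a bad byte
    # does not hide the rest of the page.
    return raw.decode(charset, errors="replace")
```

The returned string could then be handed to `BeautifulSoup(...)` in place of the `html` variable shown above, for example `load_html("http://de.pokerstrategy.com/forum/thread.php?threadid=498111")`.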
