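I want to scrape data from an HTML page that contains JavaScript. I have read several posts that suggest using Selenium or PyQt4.QtWebKit, but I may have started off on the wrong foot: I used requests.

Can I use an external library such as PyExecJS or PyV8 to execute the JavaScript stored in the response, or should I go back and rewrite the code with Selenium?

The code is as follows: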
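import requests
from bs4 import BeautifulSoup

# Login form fields; redirect_url and site are filled in from the landing page.
data = {"redirect_url": "",
        "site": "uk",
        "login_username": "foo",
        "login_password": "bar"}

with requests.Session() as s:
    log = "https://secure.advfn.com/login/secure"
    # Fetch the landing page to read the hidden login-form values.
    r = s.get("http://uk.advfn.com/")
    soup = BeautifulSoup(r.content, "html.parser")
    redirect_url = soup.select_one("#redirect_url")["value"]
    site = soup.select_one("#site")["value"]
    # Copy the hidden form values into the login payload.
    data["redirect_url"] = redirect_url
    data["site"] = site
    # Log in, then request the orders page within the same session.
    p = s.post(log, data=data)
    print(p.content)
    output = s.get('https://it.advfn.com/mercati/BIT/generaliG/ordini').content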
This is the HTML output I get (see http://pastebin.com/bwa1hWsv for the full page):
^{pr2}$
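To make the question about executing the stored JavaScript more concrete, here is a minimal sketch of how the scripts in such a response could be separated out before handing anything to a JavaScript engine. It uses only the standard library; the class `ScriptCollector`, the function `collect_scripts`, and the sample HTML are hypothetical stand-ins, not taken from the real page.

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Hypothetical helper: collects inline <script> bodies and external script URLs."""

    def __init__(self):
        super().__init__()
        self.inline = []        # source text of inline scripts
        self.external = []      # values of <script src="..."> attributes
        self._in_script = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.external.append(src)
            self._in_script = True
            self._buffer = []

    def handle_data(self, data):
        # Text between <script> and </script> arrives here as raw data.
        if self._in_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            body = "".join(self._buffer).strip()
            if body:
                self.inline.append(body)
            self._in_script = False


def collect_scripts(html):
    """Return (inline_sources, external_urls) found in an HTML string."""
    parser = ScriptCollector()
    parser.feed(html)
    parser.close()
    return parser.inline, parser.external


# Made-up sample standing in for the real response body (assumption).
sample_html = """
<html><head>
<script src="/js/app.js"></script>
<script>var orders = [1, 2, 3];</script>
</head><body><div id="orders"></div></body></html>
"""

inline, external = collect_scripts(sample_html)
print(inline)
print(external)
```

With the real session, the decoded response body would take the place of `sample_html`. If the data turns out to be loaded by an external script or a later request rather than by inline code, running the inline JavaScript alone would probably not reproduce it, which is the case where a browser-driven tool such as Selenium seems more appropriate.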