XPath usage when scraping data with lxml

Posted 2024-09-30 12:33:44

I am trying to write a Python script that fetches data from a web page, but I cannot get my XPath to return the value. Please help me fix this.

The URL in question is https://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuoteFO.jsp?underlying=NIFTY&instrument=OPTIDX&strike=10400.00&type=CE&expiry=30NOV2017

I am trying to get the VWAP value, which is currently 27.16 (it changes every working day):

<span id="vwap">27.16</span>

Following online tutorials, I wrote this Python script:

from lxml import html
import requests
page = requests.get('https://www.nseindia.com/live_market/dynaContent/live_watch/get_quote/GetQuoteFO.jsp?underlying=NIFTY&instrument=OPTIDX&strike=10400.00&type=CE&expiry=30NOV2017')
tree = html.fromstring(page.content)
vwap = tree.xpath('//span[@id="vwap"]/text()')
print(vwap)

But when I run it, I get the following output:

[]

instead of

27.16

Following other answers on Stack Overflow, I also tried replacing the XPath line with the following, but I still do not get the correct output:

vwap = tree.xpath('//*[@id="vwap"]/text()')

Please let me know what to put in the XPath so that the vwap variable gets the correct value.

Any other solution (not using lxml) is also welcome.


1 Answer
User
#1 · Posted 2024-09-30 12:33:44

If you inspect the page source as it is initially served, the node you need looks like this:

<li><a style="color: #000000;" title="VWAP">VWAP</a> <span id="vwap"></span></li>

And this is how it looks after the JavaScript has executed:

<li><a style="color: #000000;" title="VWAP">VWAP</a> <span id="vwap">27.16</span></li>

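Note that the first HTML sample has no text content. requests only downloads the raw HTML and does not execute JavaScript, so the XPath matches the span but finds no text node and returns an empty list.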
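The difference can be reproduced offline. This is a minimal sketch that assumes hard-coded copies of the two snippets above (wrapped in a `<ul>`) instead of a live request; the variable names are illustrative only:

```python
from lxml import html

# Markup as served, before JavaScript runs: the span is empty.
raw_html = '<ul><li><a title="VWAP">VWAP</a> <span id="vwap"></span></li></ul>'
# Markup after JavaScript has filled in the value.
rendered_html = '<ul><li><a title="VWAP">VWAP</a> <span id="vwap">27.16</span></li></ul>'

xpath = '//span[@id="vwap"]/text()'
raw_result = html.fromstring(raw_html).xpath(xpath)
rendered_result = html.fromstring(rendered_html).xpath(xpath)

print(raw_result)       # [] because the span has no text node
print(rendered_result)  # ['27.16']
```

The XPath expression is the same in both cases; only the markup differs.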

The values appear to come from the following node:

<div id="responseDiv" style="display:none">
{"valid":"true","isinCode":null,"lastUpdateTime":"29-NOV-2017 15:30:30","ocLink":"\/marketinfo\/sym_map\/symbolMapping.jsp?symbol=NIFTY&instrument=-&date=-&segmentLink=17&symbolCount=2","tradedDate":"29NOV2017","data":[{"change":"-17.80","sellPrice1":"13.80","buyQuantity3":"450","sellPrice2":"13.85","buyQuantity4":"150","buyQuantity1":"13,725","ltp":"-243019.52","buyQuantity2":"6,225","sellPrice5":"14.00","sellPrice3":"13.90","buyQuantity5":"450","sellPrice4":"13.95","underlying":"NIFTY","bestSell":"-2,41,672.50","annualisedVolatility":"9.44","optionType":"CE","prevClose":"31.10","pChange":"-57.23","lastPrice":"13.30","lowPrice":"11.00","strikePrice":"10400.00","premiumTurnover":"11,707.33","numberOfContractsTraded":"5,74,734","underlyingValue":"10,361.30","openInterest":"58,96,350","impliedVolatility":"12.73","vwap":"27.16","totalBuyQuantity":"10,49,850","openPrice":"35.10","closePrice":"17.85","bestBuy":"-2,43,852.25","changeinOpenInterest":"1,60,800","clientWisePositionLimits":"30517526","totalSellQuantity":"11,07,825","dailyVolatility":"0.49","sellQuantity5":"19,800","marketLot":"75","expiryDate":"30NOV2017","marketWidePositionLimits":"-","sellQuantity2":"75","sellQuantity1":"3,825","buyPrice1":"13.00","sellQuantity4":"900","buyPrice2":"12.90","sellQuantity3":"2,025","buyPrice4":"12.75","buyPrice3":"12.80","buyPrice5":"12.65","turnoverinRsLakhs":"44,94,632.53","pchangeinOpenInterest":"2.80","settlementPrice":"-","instrumentType":"OPTIDX","highPrice":"40.85"}],"companyName":"Nifty 50","eqLink":""}
</div>

So the code you probably need is:

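import json

# `tree` is the lxml tree built in the question's script.
# The hidden div holds the quote data as JSON; parse it and read the "vwap" field.
vwap = json.loads(tree.xpath('//div[@id="responseDiv"]/text()')[0].strip())['data'][0]['vwap']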
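That one-liner raises an exception if the div is missing or the JSON changes shape, and it returns the value as a string. A slightly more defensive sketch follows, assuming an abbreviated copy of the page source in place of a live request; `extract_vwap` and `SAMPLE_PAGE` are illustrative names, not part of the original answer:

```python
import json
from lxml import html

# Abbreviated copy of the hidden div from the page source shown above.
SAMPLE_PAGE = """
<html><body>
<div id="responseDiv" style="display:none">
{"valid":"true","tradedDate":"29NOV2017","data":[{"lastPrice":"13.30","strikePrice":"10400.00","vwap":"27.16"}]}
</div>
</body></html>
"""


def extract_vwap(page_source):
    """Return the VWAP as a float, or None if it cannot be found."""
    tree = html.fromstring(page_source)
    texts = tree.xpath('//div[@id="responseDiv"]/text()')
    if not texts:
        return None  # the div is missing or has no text
    try:
        payload = json.loads(texts[0].strip())
        raw = payload["data"][0]["vwap"]
        # Numbers in the payload are strings and may contain commas
        # (e.g. "10,361.30"), so strip them before converting.
        return float(str(raw).replace(",", ""))
    except (ValueError, KeyError, IndexError, TypeError):
        # Malformed JSON, unexpected structure, or a placeholder such as "-".
        return None


vwap = extract_vwap(SAMPLE_PAGE)
print(vwap)
```

With a live page, the same function could be called on `page.content` from the question's script.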
