Unable to extract a value from HTML using lxml

Posted 2024-09-29 22:00:57


I have a piece of HTML (the numbers vary):

<span class="ng-binding"> <b>Total:</b> 68.71€ (459 items) </span>


I want to extract 68.71€ (459 items) from it.


So far I have tried the code below, simply copying the XPath of the span shown above from Google Chrome:

import urllib.request
from lxml import html
import os

ids =  ["ftpstorage1-730",
        "ftpstorage2-730",
        "ftpstorage3-730"]

for id in ids:
    url = 'http://steam.tools/itemvalue/#/' + id
    with urllib.request.urlopen(url) as response:
        site = response.read()
        tree = html.fromstring(site)
        data = tree.xpath('//*[@id="container"]/div[5]/span[1]/text()')

        print(data)

In theory this should work, but it doesn't. All I get is:

[" {{(items | filter:dupesFilter | filter:typeFilter | filter:filterText |   sumByKey:'price':'count':
e}}\n\t\t\t\t({{items | filter:dupesFilter | filter:typeFilter |    filter:filterText | sumByKey:'count
[" {{(items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'price':'count':
e}}\n\t\t\t\t({{items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'count
[" {{(items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'price':'count':
e}}\n\t\t\t\t({{items | filter:dupesFilter | filter:typeFilter | filter:filterText | sumByKey:'count

Any idea what I am doing wrong?

Is it because the numbers are generated rather than static?

If so, how can I still extract them?


1 answer
User
#1 · Posted 2024-09-29 22:00:57

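What you see on the console is the unrendered HTML, still containing the AngularJS binding placeholders. The values are filled in client-side, so you would need a real browser to execute the JavaScript and let Angular put the actual values into the placeholders.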
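One quick way to confirm this diagnosis is to check whether the downloaded markup still contains `{{ ... }}` expressions. This is a minimal sketch using only the standard library; the helper name `find_unrendered_bindings` and the sample string are illustrative and not part of the original code:

```python
import re

# Matches AngularJS-style interpolation expressions such as {{ total | currency }}.
BINDING_RE = re.compile(r"\{\{.*?\}\}", re.DOTALL)


def find_unrendered_bindings(markup):
    """Return every {{ ... }} placeholder still present in the markup."""
    return BINDING_RE.findall(markup)


# Hypothetical sample resembling what urllib returns for a client-rendered page.
sample = "<span><b>Total:</b> {{items | sumByKey:'price':'count'}} ({{count}} items)</span>"
print(find_unrendered_bindings(sample))
```

If the list is non-empty, the page was not rendered and no XPath against that markup will yield the final numbers.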

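Alternatively, if you look into how the total price is retrieved and calculated, you can solve this without a real browser. Make a GET request to the http://item-value10.appspot.com/ParseInv endpoint, supplying the id and app parameters, then parse the JSON response and compute the price, taking the item counts into account: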

import requests


endpoint_url = "http://item-value10.appspot.com/ParseInv"
ids = ["ftpstorage1-730", "ftpstorage2-730", "ftpstorage3-730"]

for inventory_id in ids:
    with requests.Session() as session:
        # Visit the page first so the session looks like a normal visit
        session.get('http://steam.tools/itemvalue/#/' + inventory_id)

        # "ftpstorage1-730" -> storage "ftpstorage1", app "730"
        storage, app = inventory_id.split("-")

        response = session.get(endpoint_url, params={
            "id": storage,
            "app": app
        }, headers={
            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36",
            "Referer": "http://steam.tools/itemvalue/"
        })

        # Total value = sum of price * count over all items in the inventory
        data = response.json()
        total = sum(float(item["price"]) * int(item["count"]) for item in data["items"])
        print(total)

Prints:

20.439999999999998
78.16
0
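
The first total, 20.439999999999998, is almost certainly a binary floating-point artifact rather than a real price. If exact cents matter, the sum can be done with `decimal.Decimal` instead of `float`. This is a sketch that assumes each item carries `price` and `count` fields as in the code above; the helper name `total_value` is hypothetical and the sample items are made up for illustration:

```python
from decimal import Decimal


def total_value(items):
    """Sum price * count exactly, then round to two decimal places."""
    total = sum(
        (Decimal(str(item["price"])) * int(item["count"]) for item in items),
        Decimal("0"),
    )
    return total.quantize(Decimal("0.01"))


# Made-up items, not real API output.
sample_items = [
    {"price": "0.1", "count": 3},
    {"price": 20.14, "count": 1},
]
print(total_value(sample_items))
```

Converting through `str` before building the `Decimal` avoids carrying the float's binary representation error into the sum.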
