Parse a string to get a given attribute value

Posted 2024-09-19 23:29:16


I have a string with the following content:

var string = 
'<div class="product-info-inner-content clearfix ">\
    <a href="http://www.adidas.co.uk/ace-17_-purecontrol-firm-ground-boots/BB4314.html"\
      class="link-BB4314 product-link clearfix "\
      data-context="name:ACE 17+ Purecontrol Firm Ground Boots"\
      data-track="BB4314"\
      data-productname="ACE 17+ Purecontrol Firm Ground Boots"  tabindex="-1">\
        <span class="title">ACE 17+ Purecontrol Firm Ground Boots</span>\
        <span class="subtitle">Men Football</span>\
    </a>\
</div>';

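I am trying to write the JavaScript equivalent of the Python code below, which uses Beautiful Soup to get the URL of the link element whose class contains product-link and whose data-track matches a given product code (BB4314 in this example).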

import re

# soup (the parsed page) and code (e.g. "BB4314") are defined earlier
is_listing = len(soup.findAll(name="div", attrs={"class": "product-tile"})) > 1
if is_listing:
    # stuck from this part
    attrs = {"class": re.compile(r".*\bproduct-link\b.*"), "data-track": code}
    url = soup.find(name="a", attrs=attrs)
    url = url["href"]

How can I do this?


1 answer
User
#1 · Posted 2024-09-19 23:29:16

Just use the DOM:

var string = '<div class="product-info-inner-content clearfix "><a href="http://www.adidas.co.uk/ace-17_-purecontrol-firm-ground-boots/BB4314.html" class="link-BB4314 product-link clearfix " data-context="name:ACE 17+ Purecontrol Firm Ground Boots" data-track="BB4314" data-productname="ACE 17+ Purecontrol Firm Ground Boots" tabindex="-1"><span class="title">ACE 17+ Purecontrol Firm Ground Boots</span> <span class="subtitle">Men Football</span></a></div>',
    div = document.createElement("div");

// Let the browser parse the markup into a detached element
div.innerHTML = string;

// Take the last path segment of the link URL and drop the file extension
var href = div.querySelector("a.product-link").href,
    parts = href.split("/"),
    code = parts.pop().split(".")[0];

console.log(code);
console.log(div.querySelector("a.product-link").getAttribute("data-track"));
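The snippet above reads the code out of the link, while the Python in the question goes the other way: it starts from a known code and looks up the URL. A minimal sketch of that direction follows, assuming a CSS attribute selector is an acceptable stand-in for the Beautiful Soup attrs filter. The helper names productLinkSelector and findProductUrl are made up for illustration and are not part of any library. The class test is only roughly equivalent to the regex, since a.product-link matches the exact class token.

```javascript
// Hypothetical helper (not from the original answer): builds a CSS selector
// that roughly mirrors {"class": ...product-link..., "data-track": code}.
function productLinkSelector(code) {
  // Escape backslashes and double quotes so the value is safe inside "..."
  var escaped = String(code).replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return 'a.product-link[data-track="' + escaped + '"]';
}

// Hypothetical helper: returns the href attribute of the first matching
// link, or null when nothing matches. `root` is any node that supports
// querySelector, such as an element or a document.
function findProductUrl(root, code) {
  var link = root.querySelector(productLinkSelector(code));
  return link ? link.getAttribute("href") : null;
}

// Browser-only usage; skipped where no DOM is available (e.g. plain Node.js)
if (typeof document !== "undefined") {
  var container = document.createElement("div");
  container.innerHTML =
    '<a href="http://www.adidas.co.uk/ace-17_-purecontrol-firm-ground-boots/BB4314.html"' +
    ' class="link-BB4314 product-link clearfix " data-track="BB4314">ACE 17+</a>';
  console.log(findProductUrl(container, "BB4314"));
}
```

Using getAttribute("href") returns the attribute exactly as written in the markup, whereas the .href property returns the resolved absolute URL. For the absolute URL in this example the two should give the same result. The is_listing check from the question could probably be written the same way, as root.querySelectorAll("div.product-tile").length > 1.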
