I ran into an unexpected problem. I am using Python 3.5 and BeautifulSoup, and I want to parse the following link:
url = 'https://www.leboncoin.fr/chaussures/627533472.htm?ca=16_s'
import requests, bs4
res = requests.get(url)
res.raise_for_status()
DicoSoup = bs4.BeautifulSoup(res.text, "lxml")
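I want to retrieve the links to the images in the ad. When I inspect the site's HTML, I see that the div tag with class "thumbnails" contains span tags with class "item_imagePic", and those contain the img tags.
However, when I select the div tag, the span tags are not there: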
div = DicoSoup.select("div.thumbnails")
div
Out[54]:
[<div class="thumbnails" data-alt="Talons aiguilles Stéphane Kélian - 37.5">
<ul>
<li class="thumb selected trackable" data-info='{"event_name" : "ad_view::photos", "event_type" : "click", "click_type" : "N", "event_s2" : "2"}' id="thumb_0"></li>
<li class="thumb trackable" data-info='{"event_name" : "ad_view::photos", "event_type" : "click", "click_type" : "N", "event_s2" : "2"}' id="thumb_1"> </li>
<li class="thumb trackable" data-info='{"event_name" : "ad_view::photos", "event_type" : "click", "click_type" : "N", "event_s2" : "2"}' id="thumb_2"></li>
</ul>
</div>]
But when I inspect the HTML content, this is what I see:
<div class="thumbnails" data-alt="Talons aiguilles Stéphane Kélian - 37.5" style="width: 596px;">
<ul style="">
<li id="thumb_0" class="thumb selected trackable" data-info="{"event_name" : "ad_view::photos", "event_type" : "click", "click_type" : "N", "event_s2" : "2"}"><span class="item_imagePic"><img src="//img0.leboncoin.fr/thumbs/d89/d89c778e852e4a175d5d1ba96b2ec9c220445732.jpg" alt="Talons aiguilles Stéphane Kélian - 37.5"></span></li>
<li id="thumb_1" class="thumb trackable" data-info="{"event_name" : "ad_view::photos", "event_type" : "click", "click_type" : "N", "event_s2" : "2"}"><span class="item_imagePic"><img src="//img1.leboncoin.fr/thumbs/7d9/7d9b62d9efd2187472dc16ca2794be1bbaeb1370.jpg" alt="Talons aiguilles Stéphane Kélian - 37.5"></span></li>
<li id="thumb_2" class="thumb trackable" data-info="{"event_name" : "ad_view::photos", "event_type" : "click", "click_type" : "N", "event_s2" : "2"}"><span class="item_imagePic"><img src="//img2.leboncoin.fr/thumbs/288/28865002bb34bad516574bd1e9b42d2a2bb928f2.jpg" alt="Talons aiguilles Stéphane Kélian - 37.5"></span></li>
</ul>
</div>
How is this possible? What do I need to do to select them?
I tried:
div = DicoSoup.select_one("div.thumbnails span.item_imagePic")
div = DicoSoup.select_one("div.thumbnails ul li span.item_imagePic")
div = DicoSoup.select("div.thumbnails ul li span.item_imagePic")
span = DicoSoup.find('span', {'class': 'item_imagePic'})
span = DicoSoup.find('span',id="thumb_0")
div = DicoSoup.select("div.thumbnails img")
div = DicoSoup.select("div.thumbnails span img")
div = DicoSoup.select("div.thumbnails ul li span.item_imagePic img")
None of them finds anything: they return an object of type "NoneType" (or an empty list in the case of select).
Thank you
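As I said in my comment, the thumbnails are generated dynamically with JavaScript, but you can grab the script and parse the paths out of it: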
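A minimal sketch of that first step, under stated assumptions: the HTML below is a hypothetical stand-in for `res.text`, and the JavaScript variable name `images_thumbs` and the script layout are guesses made for illustration, not the site's confirmed markup. It only shows the general idea of locating the `<script>` element that carries the paths.

```python
import bs4

# Hypothetical stand-in for res.text. The variable name images_thumbs and the
# script layout are assumptions for illustration, not the site's actual markup.
SAMPLE_HTML = """
<div class="thumbnails"><ul><li class="thumb" id="thumb_0"></li></ul></div>
<script>
var images_thumbs = new Array();
images_thumbs[0] = "//img0.leboncoin.fr/thumbs/d89/d89c778e852e4a175d5d1ba96b2ec9c220445732.jpg";
images_thumbs[1] = "//img1.leboncoin.fr/thumbs/7d9/7d9b62d9efd2187472dc16ca2794be1bbaeb1370.jpg";
</script>
"""

# html.parser avoids an extra dependency in this sketch; "lxml" can be used instead.
soup = bs4.BeautifulSoup(SAMPLE_HTML, "html.parser")

# The static HTML has no img under div.thumbnails, so this selection is empty.
static_imgs = soup.select("div.thumbnails img")

# The paths are still present as plain text inside a <script> element.
script_text = next(
    s.string for s in soup.find_all("script")
    if s.string and "images_thumbs" in s.string
)
print(script_text)
```

On the real page, the marker string to search for has to be read off the downloaded source (for example by printing `res.text`), since the browser's inspector shows the DOM after the scripts have already run.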
This gives you:
To get the image links:
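One way this could look is a regular expression over the script text. This is a sketch, not the answer's original snippet: `SCRIPT_TEXT` is a hypothetical example of what the script might contain, and the `images_thumbs[n] = "..."` assignment format is an assumption.

```python
import re

# Hypothetical script text; the images_thumbs[n] = "..." format is an assumption.
SCRIPT_TEXT = '''
var images_thumbs = new Array();
images_thumbs[0] = "//img0.leboncoin.fr/thumbs/d89/d89c778e852e4a175d5d1ba96b2ec9c220445732.jpg";
images_thumbs[1] = "//img1.leboncoin.fr/thumbs/7d9/7d9b62d9efd2187472dc16ca2794be1bbaeb1370.jpg";
images_thumbs[2] = "//img2.leboncoin.fr/thumbs/288/28865002bb34bad516574bd1e9b42d2a2bb928f2.jpg";
'''

# Capture whatever sits between the double quotes of each assignment.
pattern = re.compile(r'images_thumbs\[\d+\]\s*=\s*"([^"]+)"')
thumb_links = pattern.findall(SCRIPT_TEXT)

for link in thumb_links:
    print(link)
```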
Or simply split the lines and strip them:
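A sketch of the split-and-strip variant, again on hypothetical script text (the `images_thumbs[n] = "...";` line format is an assumption, so the prefix check and the stripped characters would need adjusting to the real script):

```python
# Hypothetical script text; the line format is an assumption for illustration.
SCRIPT_TEXT = '''
var images_thumbs = new Array();
images_thumbs[0] = "//img0.leboncoin.fr/thumbs/d89/d89c778e852e4a175d5d1ba96b2ec9c220445732.jpg";
images_thumbs[1] = "//img1.leboncoin.fr/thumbs/7d9/7d9b62d9efd2187472dc16ca2794be1bbaeb1370.jpg";
'''

thumb_links = []
for line in SCRIPT_TEXT.splitlines():
    line = line.strip()
    # Keep only the assignment lines, skipping the array declaration.
    if line.startswith("images_thumbs["):
        # Take the right-hand side, then strip spaces, quotes and the semicolon.
        value = line.split("=", 1)[1]
        thumb_links.append(value.strip(' ";'))

print(thumb_links)
```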
Both give you:
Finally, you only need to prepend a scheme, i.e. https:
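For illustration, prepending the scheme to the protocol-relative paths could look like this. The list below reuses the thumbnail paths shown in the question; `urljoin` from the standard library is shown as an alternative that resolves such paths against the page URL.

```python
from urllib.parse import urljoin

page_url = "https://www.leboncoin.fr/chaussures/627533472.htm?ca=16_s"

# Protocol-relative paths, as they appear in the question's inspected HTML.
thumb_links = [
    "//img0.leboncoin.fr/thumbs/d89/d89c778e852e4a175d5d1ba96b2ec9c220445732.jpg",
    "//img1.leboncoin.fr/thumbs/7d9/7d9b62d9efd2187472dc16ca2794be1bbaeb1370.jpg",
]

# Option 1: plain string concatenation with the scheme.
full_links = ["https:" + link for link in thumb_links]

# Option 2: let urljoin borrow the scheme from the page URL.
joined_links = [urljoin(page_url, link) for link in thumb_links]

for link in full_links:
    print(link)
```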