BeautifulSoup cannot find the element

Posted 2024-09-27 19:29:50


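I am trying to write a program that extracts prices from the website below. I load the site with Selenium and then try to parse it with either BeautifulSoup or Selenium itself.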

I am sure the information I want is always in an element like this:

<td class="totalPrice" colspan="3">
Total: £560
<span class="sr_room_reinforcement"></span>
</td>

For some reason, the query below never finds any total price. I would appreciate any suggestions about what I am doing wrong.

^{pr2}$

1 answer
User
#1 · Posted 2024-09-27 19:29:50

First, you need to wait until the total prices have loaded. Use the WebDriverWait class together with the presence_of_element_located expected condition.
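To illustrate the idea behind an explicit wait, here is a minimal, browser-free sketch of a polling loop. It is not Selenium's implementation; `wait_until`, `timeout`, and `poll_interval` are hypothetical names chosen for this example, and the real WebDriverWait additionally handles details such as ignoring "element not found" errors between polls.

```python
import time


def wait_until(condition, timeout=30.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed without success.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

With a real driver, the condition would be something that looks up the `.totalPrice` elements and returns them once at least one exists.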

I also found that you need to override PhantomJS's User-Agent through the desired capabilities, so that the browser pretends not to be PhantomJS.

Complete working code:

from selenium import webdriver
from selenium.webdriver import DesiredCapabilities
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup as bs

url = 'http://www.booking.com/searchresults.en-gb.html?label=gen173nr-17CAEoggJCAlhYSDNiBW5vcmVmaFCIAQGYAS64AQTIAQTYAQHoAQH4AQs;sid=1a43e0952558ac0ad0061d5b6523a7bc;dcid=1;checkin_monthday=4;checkin_year_month=2016-2;checkout_monthday=11;checkout_year_month=2016-2;city=-2601889;class_interval=1;csflt=%7B%7D;group_adults=7;group_children=0;highlighted_hotels=1192837;hp_sbox=1;label_click=undef;no_rooms=1;review_score_group=empty;room1=A%2CA%2CA%2CA%2CA%2CA%2CA;sb_price_type=total;score_min=0;si=ai%2Cco%2Cci%2Cre%2Cdi;ss=London;ssafas=1;ssb=empty;ssne=London;ssne_untouched=London&;order=price_for_two'

# setting a custom User-Agent
user_agent = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) " +
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.57 Safari/537.36"
)

dcap = dict(DesiredCapabilities.PHANTOMJS)
dcap["phantomjs.page.settings.userAgent"] = user_agent

driver = webdriver.PhantomJS(desired_capabilities=dcap)
driver.get(url)

# wait for the total prices to become present
WebDriverWait(driver, 30).until(EC.presence_of_element_located((By.CSS_SELECTOR, ".totalPrice")))

content = driver.page_source
# quit() ends the whole browser session, not just the current window
driver.quit()

soup = bs(content, 'lxml')
for e in soup.select('.totalPrice'):
    print(e.text.strip())

It prints:

^{pr2}$
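Each matched element carries text of the form shown in the question's markup (`Total: £560`). If numeric values are needed, a minimal sketch along the following lines might work. `parse_total` and `PRICE_RE` are hypothetical names, and the pattern assumes a `Total:` prefix with the currency symbol directly before the digits.

```python
import re
from decimal import Decimal

# Assumed format: "Total:", optional currency symbol, then the amount.
PRICE_RE = re.compile(r"Total:\s*(\D*?)\s*([\d,]+(?:\.\d+)?)")


def parse_total(text):
    """Return (currency_symbol, amount) from text like 'Total: £560', or None."""
    match = PRICE_RE.search(text)
    if match is None:
        return None
    symbol, digits = match.groups()
    # Drop thousands separators before converting to a number.
    return symbol, Decimal(digits.replace(",", ""))


print(parse_total("Total: £560"))
```

`Decimal` is used instead of `float` so that amounts with pence are not subject to binary rounding.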

By the way, you don't need BeautifulSoup: you can locate elements with Selenium itself, which is quite powerful. Here is how to locate the total prices:

^{3}$
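A minimal sketch of what locating the prices with Selenium alone could look like, not necessarily the answer's own snippet. It assumes a driver whose `find_elements(by, value)` method accepts the locator strategy `"css selector"` (the string value of Selenium's `By.CSS_SELECTOR`). `total_prices`, `FakeDriver`, and `FakeElement` are hypothetical names; the fakes exist only so the example runs without a browser.

```python
CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def total_prices(driver):
    """Return the stripped text of every element with class 'totalPrice'."""
    elements = driver.find_elements(CSS_SELECTOR, ".totalPrice")
    return [element.text.strip() for element in elements]


# Stand-ins so the sketch runs without a browser; a real WebDriver
# instance would be passed to total_prices() instead.
class FakeElement:
    def __init__(self, text):
        self.text = text


class FakeDriver:
    def find_elements(self, by, value):
        if (by, value) == (CSS_SELECTOR, ".totalPrice"):
            return [FakeElement("Total: £560")]
        return []


print(total_prices(FakeDriver()))
```

With a real driver, the call has to happen while the browser session is still open, that is, after the wait and before the driver is shut down.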
