How do I get the JavaScript-generated HTML that the browser shows under "Inspect Element"?

Posted 2024-06-13 19:45:11

I am trying to get the start times of the available slots from this web page (the box below the calendar):

https://magicescape.it/le-stanze/lo-studio-di-harry-houdini/

I have read other related questions and written this code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support.expected_conditions import presence_of_element_located
from selenium.webdriver.firefox.options import Options
from bs4 import BeautifulSoup

url = 'https://magicescape.it/le-stanze/lo-studio-di-harry-houdini/'
wait_time = 10
options = Options()
options.add_argument("-headless")  # run Firefox without a visible window

driver = webdriver.Firefox(options=options)
driver.get(url)
driver.switch_to.frame(0)

wait = WebDriverWait(driver, wait_time)
first_result = wait.until(presence_of_element_located((By.ID, "sb_main")))

soup = BeautifulSoup(driver.page_source, 'html.parser')
print(soup)

driver.quit()

After switching to the iframe that contains the time slots, printing soup gives me this:

<script id="time_slots_view" type="text/html"><div class="slots-view{{#ifCond (getThemeOption 'timeline_modern_display') '==' 'as_table'}} as-table{{/ifCond}}">
    <div class="timeline-wrapper">
        <div class="tab-pd">
            <div class="container-caption">
                {{_t 'available_services_on_this_day'}}
            </div>

            {{#if error_message}}
                <div class="alert alert-danger alert-dismissible" role="alert">
                    {{error_message}}
                </div>
            {{/if}}

            {{>emptyTimePart is_empty=is_empty is_loaded=is_loaded}}

            <div id="sb_time_slots_container"></div>
            {{> bookingTimeLegendPart legend="only_available" time_diff=0}}
        </div>
    </div>
</div></script>
<script id="time_slot_view" type="text/html"><div class="slot">
    <a class="sb-cell free {{#ifPluginActive 'slots_count'}}{{#if available_slots}}has-available-slot{{/if}}{{/ifPluginActive}}" href="#{{bookingStepUrl time=time date=date}}">
        {{formatDateTime datetime 'time' time_diff}}

        {{#ifCond (getThemeOption 'timeline_show_end_time') '==' 1}}
            -<span class="end-time">
                &nbsp;{{formatDateTime end_datetime 'time' time_diff}}
            </span>
        {{/ifCond}}

        {{#ifPluginActive 'slots_count'}}
            {{#if available_slots}}
                <span class="slot--available-slot">
                    {{available_slots}}
                    {{#ifConfigParam 'slots_count_show_total' '==' true}} / {{total_slots}} {{/ifConfigParam}}
                </span>
            {{/if}}
        {{/ifPluginActive}}
    </a>
</div></script>

Whereas right-click > Inspect Element on the web page gives me this:

<div class="slots-view">
  <div class="timeline-wrapper">
    <div class="tab-pd">
      <div class="container-caption">
        Orari d'inizio disponibili
      </div>
      <div id="sb_time_slots_container">
        <div class="slot">
          <a class="sb-cell free " href="#book/location/4/service/6/count/1/provider/6/date/2020-03-09/time/23:00:00/">
            23:00
          </a>
        </div>
      </div>
      <div class="time-legend">
        <div class="available">
          <div class="circle">
          </div>
          - Disponibile
        </div>
      </div>
    </div>
  </div>
</div>

How can I get the start time of the available slot (23:00 in this example) using Selenium?


1 answer
User
#1 · posted 2024-06-13 19:45:11

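To get the response you want, you need to:

  1. Correctly identify the iframe to switch to (and switch to it). You were trying to switch to frame[0], but you need frame[1]. The code below removes the dependency on the index and uses an XPath instead.
  2. Get the elements that contain the times. Again, an XPath is used, this time to identify all child divs of the element with id=sb_time_slots_container.
  3. Iterate over these child divs and read the text property of the <a> nested inside each div.

For steps 1 and 2 you should also use wait.until so that the content has time to load.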

...
driver.get(url)
wait = WebDriverWait(driver, wait_time)

# Wait until the iframe exists then switch to it
iframe_element = wait.until(presence_of_element_located((By.XPATH, '//*[@id="prenota"]//iframe')))
driver.switch_to.frame(iframe_element)

# Wait until the times exist, then get a list of the slot divs
wait.until(presence_of_element_located((By.XPATH, '//*[@id="sb_time_slots_container"]/div')))
# Selenium 4 removed find_elements_by_xpath; use find_elements(By.XPATH, ...) instead
all_time_elems = driver.find_elements(By.XPATH, '//*[@id="sb_time_slots_container"]/div')

# Iterate over each slot div and print the text of its nested <a>
for elem in all_time_elems:
    print(elem.find_element(By.TAG_NAME, "a").text)

driver.quit()
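
If you would rather work on the rendered markup as a string (for example the value of get_attribute("outerHTML") on the sb_time_slots_container element, read after the wait above has succeeded), the following is a minimal sketch that uses only the standard library. The names SlotParser, extract_slots and SLOT_HREF are hypothetical helpers, and the href layout (/date/YYYY-MM-DD/time/HH:MM:SS/) is an assumption based solely on the single rendered example in the question, so it may not hold for every slot on the real site.

```python
import re
from html.parser import HTMLParser

# Assumed href layout, inferred from the one rendered example in the question:
# "#book/.../date/2020-03-09/time/23:00:00/"
SLOT_HREF = re.compile(r"/date/(\d{4}-\d{2}-\d{2})/time/(\d{2}:\d{2}):\d{2}/")


class SlotParser(HTMLParser):
    """Collects (date, time, label) for every <a class="sb-cell ..."> link."""

    def __init__(self):
        super().__init__()
        self.slots = []
        self._current = None  # [date, time, text parts] while inside a slot link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "sb-cell" in classes:
            match = SLOT_HREF.search(attrs.get("href") or "")
            if match:
                self._current = [match.group(1), match.group(2), []]

    def handle_data(self, data):
        # Accumulate the visible link text while inside a slot link
        if self._current is not None:
            self._current[2].append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            date, time, parts = self._current
            self.slots.append((date, time, "".join(parts).strip()))
            self._current = None


def extract_slots(html):
    """Return a list of (date, time, label) tuples found in rendered slot HTML."""
    parser = SlotParser()
    parser.feed(html)
    parser.close()
    return parser.slots


# Rendered markup copied from the "Inspect Element" output in the question
rendered = """
<div id="sb_time_slots_container">
  <div class="slot">
    <a class="sb-cell free " href="#book/location/4/service/6/count/1/provider/6/date/2020-03-09/time/23:00:00/">
      23:00
    </a>
  </div>
</div>
"""
print(extract_slots(rendered))
```

Reading the date from the href as well as the label may be handy if you collect slots for several days, since the visible text only carries the time. Note that this only works on markup taken after the slots have rendered; the raw template in the script tags contains no matching links.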
