<p>BeautifulSoup won't find the table because, from its point of reference, the table does not exist yet. Here, you are telling Selenium to pause <em>the Selenium driver's element matcher</em> if it notices that an element has not appeared yet:</p>
<pre class="lang-py prettyprint-override"><code># This only works for the Selenium element matcher
driver.implicitly_wait(10)
</code></pre>
<p>Then, right after that, you grab the current state of the HTML (the table is still not there) and feed it into BeautifulSoup's parser. BS4 will never see the table, even if it loads later, because it works only on the snapshot of the HTML you just gave it:</p>
<pre class="lang-py prettyprint-override"><code># You now move the CURRENT STATE OF THE HTML PAGE to BeautifulSoup's parser
soup = BeautifulSoup(driver.page_source, 'lxml')
# As this is now in BS4's hands, it will parse it immediately (won't wait 10 seconds)
table = soup.find_all('table')
# BS4 finds no tables as, when the page first loads, there are none.
</code></pre>
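<p>To see that BS4 only ever works on the snapshot it was given, here is a minimal standalone sketch (no browser involved; the HTML string is a made-up example, not the real page):</p>
<pre class="lang-py prettyprint-override"><code>from bs4 import BeautifulSoup

# A snapshot of a page BEFORE JavaScript has injected the table
# (this HTML string is invented purely for illustration)
snapshot = "<html><body><div id='rates'>Loading...</div></body></html>"

soup = BeautifulSoup(snapshot, 'html.parser')
# No matter how long we wait now, the table will never appear:
# BS4 parses the string it was handed, nothing more.
print(soup.find_all('table'))  # []
</code></pre>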
<p>To fix this, you can ask Selenium to try to fetch the HTML table itself. Since Selenium will use the <code>implicitly_wait</code> you specified earlier, it will wait until the table exists before allowing the rest of the code to proceed. By that point, when BS4 receives the HTML, the table will be there:</p>
<pre class="lang-py prettyprint-override"><code>from selenium.webdriver.common.by import By

driver.implicitly_wait(10)
# Selenium will wait until the element is found
# I used XPath, but you can use any other locator strategy to get the table
# (find_element_by_xpath was removed in Selenium 4; use find_element with By.XPATH)
driver.find_element(By.XPATH, "/html/body/div[2]/main/div/section/div[2]/div[1]/div/div/div/div/div/div/div[2]/div[6]/div/div[2]/table/tbody/tr[1]")
soup = BeautifulSoup(driver.page_source, 'lxml')
table = soup.find_all('table')
</code></pre>
<hr/>
<p>However, this is overkill. Yes, you can use Selenium to parse the HTML, but you could also use the <code>requests</code> module (which, from your code, I see you already imported) to get the table data directly.</p>
<p>The data is loaded asynchronously from <a href="https://local.erstebank.hr/rproxy/webdocapi/fx/current" rel="nofollow noreferrer">this</a> endpoint (you can find it yourself with the Chrome DevTools Network tab). You can pair it with the <code>json</code> module to turn the response into a well-formatted dictionary. Not only is this method faster, it is also far less resource-intensive (Selenium has to open a whole browser window):</p>
<pre class="lang-py prettyprint-override"><code>from requests import get
from json import loads
# Get data from URL
data_as_text = get("https://local.erstebank.hr/rproxy/webdocapi/fx/current").text
# Turn to dictionary
data_dictionary = loads(data_as_text)
</code></pre>
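<p>The <code>loads</code> step works on any JSON text, so it can be sketched without the network call. The field names in this sample payload are illustrative assumptions only, not the actual schema of the endpoint:</p>
<pre class="lang-py prettyprint-override"><code>from json import loads

# Illustrative sample only -- the real endpoint's field names may differ
sample_text = '[{"currency": "EUR", "buyRate": 7.50, "sellRate": 7.60}]'

data = loads(sample_text)  # a list of dictionaries
print(data[0]["currency"])  # EUR

# Equivalently, requests can decode the JSON for you:
# data_dictionary = get("https://local.erstebank.hr/rproxy/webdocapi/fx/current").json()
</code></pre>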