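<p>Please allow me to propose a new answer to this new question.</p>
<p>After trying a few approaches that use only <code>requests</code> and <code>urllib</code>, I think driving a real browser with the <code>selenium</code> webdriver is the better option.</p>
<p>The code below scrapes the table rows as required.</p>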
<pre><code>from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.common.exceptions import TimeoutException
from bs4 import BeautifulSoup
url = 'https://parivahan.gov.in/rcdlstatus/'
# Optional: run Chrome headless, i.e. without showing a browser window
chrome_options = Options()
chrome_options.add_argument("--headless")
# Let the driver open, fill in and submit the form
driver = webdriver.Chrome(options=chrome_options)
driver.get(url)
driver.delete_all_cookies()
wait = WebDriverWait(driver, 10)
# Wait until the submit button is clickable, then fill in the two input fields
wait.until(EC.element_to_be_clickable((By.NAME, 'form_rcdl:j_idt34')))
input1 = driver.find_element(By.NAME, 'form_rcdl:tf_reg_no1')
input1.send_keys('GJ03KA')
input2 = driver.find_element(By.NAME, 'form_rcdl:tf_reg_no2')
input2.send_keys('0803')
driver.find_element(By.NAME, 'form_rcdl:j_idt34').click()
# Get the result table
try:
    # Wait up to 10 seconds for the result element to appear
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "form_rcdl:j_idt63"))
    )
    result_html = driver.page_source
    soup = BeautifulSoup(result_html, 'lxml')
    print(soup.find_all('tr'))
except TimeoutException:
    print('Time out.')
finally:
    # Always close the browser, whether or not the table appeared
    driver.quit()
</code></pre>
<p>The screenshot below shows the table's HTML markup as printed from the soup.</p>
<p><a href="https://i.stack.imgur.com/wuGp0.png" rel="nofollow noreferrer"><img src="https://i.stack.imgur.com/wuGp0.png" alt="enter image description here"/></a></p>
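<p>If you want the cell text rather than the raw <code>tr</code> tags, the rows can be reduced to plain lists of strings. The following is only a minimal sketch: <code>TableRowParser</code> and <code>extract_rows</code> are hypothetical names of my own, it uses the standard library's <code>html.parser</code> instead of BeautifulSoup so that it runs without extra dependencies, and it assumes the result page contains simple, non-nested tables.</p>
<pre><code>from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every td/th cell, grouped by tr row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside a row
        self._cell = None  # text fragments of the cell being read, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
            self._cell = None
        elif tag in ('td', 'th') and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('td', 'th') and self._cell is not None:
            # Join the fragments and collapse runs of whitespace
            self._row.append(' '.join(''.join(self._cell).split()))
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_rows(html):
    """Return the table rows found in html as lists of cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows
</code></pre>
<p>With the script above, it could be called as <code>extract_rows(result_html)</code> right after the page source has been read.</p>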
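<p>I hope the government does not notice and block this approach before you get to try it, lol.</p>
<p>Hope this helps! If you are interested, see the following references:</p>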
<ul>
<li>Selenium manual: <a href="http://selenium-python.readthedocs.io/" rel="nofollow noreferrer">http://selenium-python.readthedocs.io/</a></li>
<li>Driving Headless Chrome with Python: <a href="https://duo.com/decipher/driving-headless-chrome-with-python" rel="nofollow noreferrer">https://duo.com/decipher/driving-headless-chrome-with-python</a></li>
<li>Submitting the form</li>
</ul>