Using the Python requests module with dropdown options

Posted 2024-09-30 06:18:11

I'm trying to scrape information from this page: https://www.tmea.org/programs/all-state/history

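I want to select several options from the first dropdown menu and use Beautiful Soup to pull the information I need. First, I tried extracting the available choices with Beautiful Soup: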

import requests
from bs4 import BeautifulSoup

page = requests.get('https://www.tmea.org/programs/all-state/history')

soup = BeautifulSoup(page.text, 'html.parser')

body = soup.find(id = 'organization')
options = body.find_all('option')

for name in options:
    child = name.contents[0]
    print(child)

This works well for listing the options, but I want to be able to submit a specific option and get the resulting information. I tried adding:

^{pr2}$

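I've used this approach before on other pages that use POST, and I don't quite understand why this case is different. Does the dropdown mean I have to use something like Selenium? I've used it before, but I don't know how to combine it with Beautiful Soup.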


1 Answer
User
#1 · Posted 2024-09-30 06:18:11

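1) I don't see a POST being used under XHR and Fetch in the browser's network panel (see the edit below).

2) Yes, you can use Selenium for this. Use Selenium as you normally would to get the table rendered. Once the table is rendered, you can pass the page source to BeautifulSoup. For example: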

from bs4 import BeautifulSoup
from selenium import webdriver

url = 'https://www.tmea.org/programs/all-state/history'

driver = webdriver.Chrome()
driver.get(url)

# Your code to find/select the drop down menu and select 2018 Treble Choir
...
...

# Once that page is rendered...
soup = BeautifulSoup(driver.page_source, 'html.parser')

Honestly, I wouldn't bother with BeautifulSoup for this, since the data looks like it sits in a <table> tag. Let pandas do the work:

^{pr2}$

Edit

I found the POST request under the Doc filter. You need to include more parameters in the payload:

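from io import StringIO

import pandas as pd
import requests

# Every field the search form submits, not just the dropdown value
payload = {
    'organization': '2018 Treble Choir',
    'instrument': 'All',
    'school_op': 'eq',
    'school': '',
    'city_op': 'eq',
    'city': '',
    's': '',
    'submit': 'Search',
}

r = requests.post('https://www.tmea.org/programs/all-state/history', data=payload)
print(r.text)

# Wrap the HTML in StringIO; newer pandas versions deprecate passing literal HTML
tables = pd.read_html(StringIO(r.text))
table = tables[0]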

Output:

print (table)
                       0       ...                     4
0    Year - Organization       ...                  City
1                    NaN       ...                   NaN
2      2018 Treble Choir       ...               El Paso
3      2018 Treble Choir       ...          Flower Mound
4      2018 Treble Choir       ...               Helotes
5      2018 Treble Choir       ...                Canyon
6      2018 Treble Choir       ...               Mission
7      2018 Treble Choir       ...                Belton
8      2018 Treble Choir       ...             Mansfield
9      2018 Treble Choir       ...                 Wylie
10     2018 Treble Choir       ...               El Paso
11     2018 Treble Choir       ...           San Antonio
12     2018 Treble Choir       ...              Beeville
13     2018 Treble Choir       ...         Grand Prairie
14     2018 Treble Choir       ...           San Antonio
15     2018 Treble Choir       ...           Brownsville
16     2018 Treble Choir       ...               Houston
17     2018 Treble Choir       ...               Woodway
18     2018 Treble Choir       ...                  Katy
19     2018 Treble Choir       ...                Canyon
20     2018 Treble Choir       ...               Crowley
21     2018 Treble Choir       ...           Trophy Club
22     2018 Treble Choir       ...              Amarillo
23     2018 Treble Choir       ...             Deer Park
24     2018 Treble Choir       ...                Dallas
25     2018 Treble Choir       ...           Brownsville
26     2018 Treble Choir       ...               Houston
27     2018 Treble Choir       ...            Carrollton
28     2018 Treble Choir       ...                 Plano
29     2018 Treble Choir       ...               Helotes
..                   ...       ...                   ...
140    2018 Treble Choir       ...                Austin
141    2018 Treble Choir       ...                 Hurst
142    2018 Treble Choir       ...           League City
143    2018 Treble Choir       ...                Odessa
144    2018 Treble Choir       ...                 Heath
145    2018 Treble Choir       ...            Cedar Park
146    2018 Treble Choir       ...        Jersey Village
147    2018 Treble Choir       ...             Harlingen
148    2018 Treble Choir       ...         Grand Prairie
149    2018 Treble Choir       ...               Coppell
150    2018 Treble Choir       ...               Lubbock
151    2018 Treble Choir       ...         The Woodlands
152    2018 Treble Choir       ...                Laredo
153    2018 Treble Choir       ...                Sachse
154    2018 Treble Choir       ...              Pearland
155    2018 Treble Choir       ...           San Antonio
156    2018 Treble Choir       ...                Conroe
157    2018 Treble Choir       ...                Dallas
158    2018 Treble Choir       ...             Arlington
159    2018 Treble Choir       ...              Pearland
160    2018 Treble Choir       ...                 Klein
161    2018 Treble Choir       ...               Houston
162    2018 Treble Choir       ...                Keller
163    2018 Treble Choir       ...               Houston
164    2018 Treble Choir       ...            Fort Worth
165    2018 Treble Choir       ...                Humble
166    2018 Treble Choir       ...             Deer Park
167    2018 Treble Choir       ...               Houston
168    2018 Treble Choir       ...              Magnolia
169    2018 Treble Choir       ...                  Katy

[170 rows x 5 columns]
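
The extra keys in the payload mirror the other fields of the search form, which is why sending only the dropdown value is not enough. As a rough sketch of how those fields could be discovered programmatically rather than by reading the network panel, the snippet below collects the default value of every named `<input>` and `<select>` in a form and then overrides the one you care about. The `SAMPLE_FORM` markup, the `FormFieldCollector` class and the `build_payload` helper are all hypothetical and only stand in for the real page, whose markup may differ. It uses the standard library's `html.parser` so it runs without network access, and it assumes every `<option>` carries a `value` attribute and ignores checkboxes and radio buttons.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the search form; the real page's markup may differ.
SAMPLE_FORM = """
<form method="post">
  <select name="organization" id="organization">
    <option value="">Select an organization</option>
    <option value="2018 Treble Choir">2018 Treble Choir</option>
  </select>
  <select name="instrument">
    <option value="All" selected>All</option>
    <option value="Example Instrument">Example Instrument</option>
  </select>
  <input type="text" name="school" value="">
  <input type="text" name="city" value="">
  <input type="submit" name="submit" value="Search">
</form>
"""


class FormFieldCollector(HTMLParser):
    """Collect the default value of every named <input> and <select>."""

    def __init__(self):
        super().__init__()
        self.fields = {}           # field name -> default value
        self._select = None        # name of the <select> currently open
        self._has_default = False  # whether that <select> has a value yet

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""
        elif tag == "select" and attrs.get("name"):
            self._select = attrs["name"]
            self._has_default = False
        elif tag == "option" and self._select:
            # Use the option marked "selected", else fall back to the first one.
            if "selected" in attrs or not self._has_default:
                self.fields[self._select] = attrs.get("value") or ""
                self._has_default = True

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def build_payload(html, **overrides):
    """Return the form defaults found in html, updated with overrides."""
    collector = FormFieldCollector()
    collector.feed(html)
    payload = dict(collector.fields)
    payload.update(overrides)
    return payload


payload = build_payload(SAMPLE_FORM, organization="2018 Treble Choir")
print(payload)
```

The resulting dictionary could then be passed as `data=payload` to `requests.post`, in the same way as the hand-written payload. Whether the server accepts it still depends on the real form's fields, so comparing it against the request shown in the browser remains the safer check.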
