<p>If I remove the arrow-button keys such as <code>dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl02</code> from the payload, it starts working.</p>
<pre><code>name_length = len('dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl02')

for key in list(payload.keys()):
    if key.startswith('dnn') and len(key) == name_length:
        payload.pop(key)
        print(key)
</code></pre>
<p>But you could use the method from <code>αԋɱҽԃ αмєяιcαη</code>'s answer to make sure only the required values are sent.</p>
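<p>To show the filtering step in isolation, here is a minimal sketch that applies the same length check to a made-up payload. The helper name <code>drop_pager_keys</code>, the field <code>txtFirstName</code>, and all values are assumptions for illustration only; the real payload comes from the page's <code>input</code> elements.</p>

```python
def drop_pager_keys(payload, sample_name):
    """Return a copy of payload without keys that look like pager buttons.

    A key is dropped when it starts with 'dnn' and has the same length
    as sample_name, mirroring the length check used above.
    """
    name_length = len(sample_name)
    return {
        key: value
        for key, value in payload.items()
        if not (key.startswith('dnn') and len(key) == name_length)
    }


# Made-up payload; 'txtFirstName' and the values are hypothetical.
sample = {
    '__VIEWSTATE': 'abc',
    'dnn$ctr410$MemberSearch$txtFirstName': '',
    'dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl02': ' ',
    'dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl28': ' ',
}

cleaned = drop_pager_keys(
    sample,
    'dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl02',
)
print(sorted(cleaned))
```

<p>Returning a new dictionary instead of popping keys in place avoids having to copy the key list before iterating, but the effect on the request is the same.</p>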
<hr/>
<pre><code>import requests
from bs4 import BeautifulSoup

link = 'https://www.icsi.in/student/Members/MemberSearch.aspx?SkinSrc=%5BG%5DSkins/IcsiTheme/IcsiIn-Bare&ContainerSrc=%5BG%5DContainers/IcsiTheme/NoContainer'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
    r = s.get(link)
    soup = BeautifulSoup(r.text,"lxml")

    # collect every form field from the page and trigger the search button
    payload = {i['name']:i.get('value','') for i in soup.select('input[name]')}
    payload['__EVENTTARGET'] = 'dnn$ctr410$MemberSearch$btnSearch'

    page = 5
    while True:
        r = s.post(link, data=payload)
        soup = BeautifulSoup(r.text, "lxml")
        for item in soup.select("span[id$='_lblFullName']"):
            print(item.text)

        # rebuild the payload from the new page and point it at the next page link
        page += 2
        payload = {i['name']:i.get('value','') for i in soup.select('input[name]')}
        payload['__EVENTTARGET'] = 'dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl{:02}'.format(page)

        # remove the arrow-button keys
        name_length = len('dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl02')
        for key in list(payload.keys()):
            if key.startswith('dnn') and len(key) == name_length:
                payload.pop(key)
                print(key)

        payload['__dnnVariable'] = {'__scdoff':'1','__dnn_pageload':'__dnn_setScrollTop();'}
        payload['ScrollTop'] = '400'
</code></pre>
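<p>As a quick illustration of what the <code>'{:02}'.format(page)</code> call in the loop above produces, this sketch prints the first three <code>__EVENTTARGET</code> names without sending any request. Stopping after three iterations is an arbitrary choice made for the example.</p>

```python
template = 'dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl{:02}'

page = 5
targets = []
for _ in range(3):  # three iterations, chosen only for the demonstration
    page += 2       # same step as in the scraping loop
    targets.append(template.format(page))

for name in targets:
    print(name)
```

<p>The counter only ever grows, so this approach depends on the pager link names growing the same way from page to page.</p>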
<hr/>
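<p><strong>EDIT:</strong> The page uses a more complicated system: after 10 pages it displays new links, but they reuse the old values <code>ctl07</code>, <code>ctl09</code>. Instead of these links I use the <code>name</code> of the arrow button that goes to the next page. At the start it has <code>ctl28</code>, but after 10 pages it has <code>ctl30</code> (because there are more links: the page adds a <code>...</code> link that moves to the next/previous list of 10 pages).</p>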
<pre><code>import requests
from bs4 import BeautifulSoup

link = 'https://www.icsi.in/student/Members/MemberSearch.aspx?SkinSrc=%5BG%5DSkins/IcsiTheme/IcsiIn-Bare&ContainerSrc=%5BG%5DContainers/IcsiTheme/NoContainer'

with requests.Session() as s:
    s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
    r = s.get(link)
    soup = BeautifulSoup(r.text,"lxml")

    payload = {i['name']:i.get('value','') for i in soup.select('input[name]')}
    payload['__EVENTTARGET'] = 'dnn$ctr410$MemberSearch$btnSearch'

    page = 1  # no longer needed to generate links; used only to display the page number
    while True:
        print('page:', page)
        page += 1

        r = s.post(link, data=payload)
        soup = BeautifulSoup(r.text, "lxml")
        for item in soup.select("span[id$='_lblFullName']"):
            print(item.text)

        payload = {i['name']:i.get('value','') for i in soup.select('input[name]')}

        # remove the arrow-button keys
        name_length = len('dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl28')
        for key in list(payload.keys()):
            if key.startswith('dnn') and len(key) == name_length:
                payload.pop(key)
                #print(key)

        # button with arrow to next page
        next_page = soup.select("input[class='rgPageNext']")
        if not next_page:
            break  # no such button found: stop
        next_page = next_page[0]['name']
        print(next_page)
        payload[next_page] = ''  # send only this button, as if it had been clicked

        payload['__dnnVariable'] = {'__scdoff':'1','__dnn_pageload':'__dnn_setScrollTop();'}
        payload['ScrollTop'] = '400'
</code></pre>
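<p>The button lookup from the loop above can be tried offline on a static snippet. This is only a sketch: the helper name <code>next_button_name</code> and both HTML fragments are made up for illustration, and it uses Python's built-in <code>html.parser</code> instead of <code>lxml</code> so that it needs nothing beyond <code>bs4</code>.</p>

```python
from bs4 import BeautifulSoup


def next_button_name(html):
    """Return the name of the 'next page' arrow button, or None if absent."""
    soup = BeautifulSoup(html, 'html.parser')
    button = soup.select_one("input[class='rgPageNext']")
    return button['name'] if button else None


# Hypothetical fragments; the real markup comes from the server response.
with_next = '''
<form>
  <input type="submit" class="rgPageNext" value=" "
         name="dnn$ctr410$MemberSearch$grdMembers$ctl00$ctl02$ctl01$ctl28">
</form>
'''
without_next = '<form><input type="hidden" name="__VIEWSTATE" value="abc"></form>'

print(next_button_name(with_next))     # name of the arrow button
print(next_button_name(without_next))  # None, which is where the loop would break
```

<p>Reading the name from the page on every iteration, instead of computing it, is what lets the loop keep working when the suffix changes from <code>ctl28</code> to <code>ctl30</code>.</p>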