Extracting "Table1" with BeautifulSoup after selecting a button?

Posted 2024-09-30 02:14:49

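After selecting the "HK Stock" and "Show All" buttons, I am trying to download the table at "https://www.bsgroup.com.hk/BrightSmart/MarginRatio/StockMarginRatioEnquiry.aspx?Lang=eng". I checked the Network tab in Chrome (Inspect) and saw no new request for data being sent to the server, so I suspect the data is already in the original page. After pressing "Show All", I checked that the data appears in "Table1". I tried the following code but got no result. Please advise: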

import requests
from bs4 import BeautifulSoup

url = "https://www.bsgroup.com.hk/BrightSmart/MarginRatio/StockMarginRatioEnquiry.aspx?Lang=eng"
result = requests.get(url)
result.raise_for_status()
result.encoding = "utf-8"

src = result.content
soup = BeautifulSoup(src, 'lxml')
table = soup.findAll("Table1")
output_rows = []
for table_row in table.findAll('tr'):
    columns = table_row.findAll('td')
    output_row = []
    for column in columns:
        output_row.append(column.text)
    output_rows.append(output_row)

print(output_rows)

1 Answer
User
#1 · Posted 2024-09-30 02:14:49

To get the data, you have to send a POST request with the correct form parameters.

For example:

import requests
from bs4 import BeautifulSoup

url = 'https://www.bsgroup.com.hk/BrightSmart/MarginRatio/StockMarginRatioEnquiry.aspx?Lang=eng'

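with requests.session() as s:
    # Load the page first to read the form fields it expects back
    soup = BeautifulSoup(s.get(url).text, 'html.parser')

    # Collect every named <input>; inputs without a value become ''
    data = {i['name']: i.get('value', '') for i in soup.select('input[name]')}
    # Leave the "Find" button out of the submitted form data
    del data['StockMarginRatioGrid$btnFind']
    # Select the Hong Kong exchange
    data['StockMarginRatioGrid$txtExchange'] = 'HKEX'

    # Submit the form; the response contains the filled-in result grid
    soup = BeautifulSoup(s.post(url, data=data).text, 'html.parser')

    # Print each row with every cell centered in a 21-character column
    for tr in soup.select('#StockMarginRatioGrid_gridResult tr'):
        print(''.join('{:^21}'.format(td.text) for td in tr.select('td')))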

Prints:

 Stock Code              Name          Stock Margin Ratio      Deposit Ratio                              Stock Code              Name          Stock Margin Ratio      Deposit Ratio    
      1               CKHHOLDINGS              85%                  15%                                        2               CLPHOLDINGS              85%                  15%         
      3               HK&CHINAGAS              85%                  15%                                        4              WHARFHOLDINGS             82%                  18%         
      5              HSBCHOLDINGS              85%                  15%                                        6               POWERASSETS              85%                  15%         
      8                  PCCW                  75%                  25%                                       10              HANGLUNGGROUP             75%                  25%         
     11              HANGSENGBANK              85%                  15%                                       12              HENDERSONLAND             85%                  15%         
     14                HYSANDEV                75%                  25%                                       16                 SHKPPT                 85%                  15%         
     17               NEWWORLDDEV              85%                  15%                                       18              ORIENTALPRESS             20%                  80%         
     19              SWIREPACIFICA             85%                  15%                                       20                WHEELOCK                82%                  18%         
     23               BANKOFEASIA              75%                  25%                                       25             CHEVALIERINT'L             40%                  60%         

... and so on.
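
A side note on why the code in the question fails regardless of the POST issue: `soup.findAll("Table1")` searches for tags *named* `Table1`, not for an element whose id is `Table1`, so it returns an empty result list, and calling `.findAll('tr')` on that list then raises an `AttributeError`. The sketch below illustrates the difference on a small inline HTML snippet. The snippet and the helper `rows_by_id` are made up for illustration and are not the real page; on the real site a plain GET still would not contain the rows, which is why the POST above is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page: a table identified by its id.
HTML = """
<table id="Table1">
  <tr><td>1</td><td>CKHHOLDINGS</td></tr>
  <tr><td>2</td><td>CLPHOLDINGS</td></tr>
</table>
"""


def rows_by_id(html, table_id):
    """Return the cell texts of each row of the table with the given id."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find(id=table_id)  # a single Tag, or None if not found
    if table is None:
        return []
    return [
        [td.get_text(strip=True) for td in tr.find_all("td")]
        for tr in table.find_all("tr")
    ]


soup = BeautifulSoup(HTML, "html.parser")
by_tag_name = soup.find_all("Table1")  # looks for <Table1> tags: no match
rows = rows_by_id(HTML, "Table1")      # looks for id="Table1": one table

print(len(by_tag_name))
print(rows)
```

Looking the table up with `soup.find(id=...)` (or a CSS selector such as `soup.select_one('#Table1')`) returns a single element, so `.find_all('tr')` can be called on it directly.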

Edit: To write the result to a CSV file, you can use this example:

import csv
import requests
from bs4 import BeautifulSoup

url = 'https://www.bsgroup.com.hk/BrightSmart/MarginRatio/StockMarginRatioEnquiry.aspx?Lang=eng'

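# newline='' is what the csv module expects, so no blank lines appear on Windows
with requests.session() as s, open('output.csv', 'w', newline='') as f_out:
    writer = csv.writer(f_out)

    # Load the page first to read the form fields it expects back
    soup = BeautifulSoup(s.get(url).text, 'html.parser')

    # Collect every named <input>; inputs without a value become ''
    data = {i['name']: i.get('value', '') for i in soup.select('input[name]')}
    # Leave the "Find" button out of the submitted form data
    del data['StockMarginRatioGrid$btnFind']
    # Select the Hong Kong exchange
    data['StockMarginRatioGrid$txtExchange'] = 'HKEX'

    # Submit the form; the response contains the filled-in result grid
    soup = BeautifulSoup(s.post(url, data=data).text, 'html.parser')

    # Write one CSV row per table row, with whitespace stripped from each cell
    for tr in soup.select('#StockMarginRatioGrid_gridResult tr'):
        writer.writerow([td.text.strip() for td in tr.select('td')])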
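
Judging by the printed output, each table row appears to hold two stock records side by side, separated by an empty cell, so the CSV written this way has two stocks per line. If one stock per line is preferred, the rows could be split before writing. The following is only a sketch under that assumption (nine cells per row, the middle one empty, not verified against the live page). The helpers `split_pairs` and `flatten` are hypothetical names, and the sample rows are copied from the output shown earlier.

```python
import csv
import io


def split_pairs(cells, width=4):
    """Split one scraped row into records of `width` cells each.

    Assumption: a row holds a left and a right record with a spacer
    cell between them; records that are entirely empty are dropped.
    """
    cells = [c.strip() for c in cells]
    halves = [cells[:width]]
    if len(cells) >= 2 * width:
        halves.append(cells[-width:])
    return [h for h in halves if any(h)]


def flatten(rows, width=4):
    """Turn side-by-side rows into one record per stock."""
    out = []
    for cells in rows:
        for record in split_pairs(cells, width):
            if out and record == out[0]:
                continue  # the header repeats in the right-hand half
            out.append(record)
    return out


# Sample rows in the assumed layout, taken from the output printed above
sample_rows = [
    ["Stock Code", "Name", "Stock Margin Ratio", "Deposit Ratio", "",
     "Stock Code", "Name", "Stock Margin Ratio", "Deposit Ratio"],
    ["1", "CKHHOLDINGS", "85%", "15%", "", "2", "CLPHOLDINGS", "85%", "15%"],
    ["3", "HK&CHINAGAS", "85%", "15%", "", "4", "WHARFHOLDINGS", "82%", "18%"],
]

records = flatten(sample_rows)

# Written to an in-memory buffer here; a real file would work the same way
buf = io.StringIO()
csv.writer(buf).writerows(records)
print(buf.getvalue())
```

In the scraping loop, the list built from `tr.select('td')` would be passed through `split_pairs` and each resulting record written with `writer.writerow`.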
