Scraping a Java web page

from bs4 import BeautifulSoup
import requests

resp = requests.get("https://www.senamhi.gob.pe/mapas/mapa-estaciones/_dat_esta_tipo.php?estaciones=472CA750")
html = resp.content
soup = BeautifulSoup(html)
option_tags = soup.find_all("option")

1 answer

User

#1 · Posted on 2024-09-29 01:25:32

Looking at the URL you gave, I think the table is embedded in that page through an iframe:

 <iframe src="_dat_esta_tipo02.php?estaciones=472CA750&tipo=SUT&CBOFiltro=201902&t_e=M" name="contenedor" width="600" marginwidth="0" height="560" marginheight="0" scrolling="NO" align="center"  frameborder="0" id="interior"></iframe>

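If you open the iframe's src URL directly (https://www.senamhi.gob.pe/mapas/mapa-estaciones/_dat_esta_tipo02.php?estaciones=472CA750&tipo=SUT&CBOFiltro=201902&t_e=M), the page that loads shows the same table, so you can fetch that page and parse it with BeautifulSoup. I tried it and it works.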
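The step of turning the iframe's relative src into a full URL can be sketched as follows. This is a minimal illustration, not part of the original answer: the names `IframeSrcParser`, `iframe_urls`, `PAGE_URL`, and `SAMPLE_HTML` are hypothetical, the sample markup is a shortened copy of the iframe tag shown above, and the standard library's `html.parser` is used in place of BeautifulSoup so the sketch runs without extra dependencies.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IframeSrcParser(HTMLParser):
    """Collect the src attribute of every <iframe> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def iframe_urls(page_url, html):
    """Return absolute URLs for all iframes found in html."""
    parser = IframeSrcParser()
    parser.feed(html)
    # urljoin resolves each relative src against the URL of the outer page
    return [urljoin(page_url, src) for src in parser.sources]


# Outer page from the question and a shortened copy of its iframe tag
PAGE_URL = "https://www.senamhi.gob.pe/mapas/mapa-estaciones/_dat_esta_tipo.php?estaciones=472CA750"
SAMPLE_HTML = (
    '<iframe src="_dat_esta_tipo02.php?estaciones=472CA750&tipo=SUT'
    '&CBOFiltro=201902&t_e=M" name="contenedor"></iframe>'
)

print(iframe_urls(PAGE_URL, SAMPLE_HTML))
```

With BeautifulSoup already loaded, the same src value should be reachable as `soup.find("iframe")["src"]`, which can then be passed to `urljoin` in the same way.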

**Full code:**

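from bs4 import BeautifulSoup
import requests

# Request the iframe's src URL directly rather than the outer page
resp = requests.get(
    "https://www.senamhi.gob.pe/mapas/mapa-estaciones/_dat_esta_tipo02.php"
    "?estaciones=472CA750&tipo=SUT&CBOFiltro=201902&t_e=M"
)

html = resp.content
soup = BeautifulSoup(html, "lxml")  # name the parser explicitly: "lxml" or "html.parser"

# 'aling' (sic) is the spelling used in the original answer; it presumably matches the page's markup
option_tags = soup.find_all("tr", attrs={"aling": "center"})

for a in option_tags:
    print(a.find("div").text)  # text of the first <div> in each row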

Output:

Día/mes/año
Prom
01-02-2019
02-02-2019
03-02-2019
04-02-2019
05-02-2019
06-02-2019
07-02-2019
08-02-2019
09-02-2019
10-02-2019
11-02-2019
12-02-2019
13-02-2019
14-02-2019
15-02-2019
16-02-2019
17-02-2019
18-02-2019

The code above only retrieves the dates. To get every value in each row, create a list and append each row's text to it. Only the loop changes:

array = []
for a in option_tags:
    array.append(a.text.split())  # all whitespace-separated values in the row

print(array)
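Once each row is a list of strings, the header rows can be separated from the data rows by checking whether the first token parses as a date in the `dd-mm-yyyy` form seen in the output. The sketch below is one possible way to do that; `split_rows` is a hypothetical helper, and the numeric values in `sample` are made up for illustration, since the real columns of the table are not shown here.

```python
from datetime import datetime


def split_rows(rows):
    """Separate header rows from data rows, parsing the leading date."""
    headers, records = [], []
    for tokens in rows:
        if not tokens:
            continue  # skip rows that contained no text
        try:
            day = datetime.strptime(tokens[0], "%d-%m-%Y").date()
        except ValueError:
            headers.append(tokens)  # first token is not a date: treat as header
            continue
        records.append((day, tokens[1:]))
    return headers, records


# Hypothetical rows shaped like the list built above; the numbers are invented
sample = [
    ["Día/mes/año", "Prom"],
    ["01-02-2019", "12.3", "4.5"],
    ["02-02-2019", "11.8", "0.0"],
]

headers, records = split_rows(sample)
for day, values in records:
    print(day.isoformat(), values)
```

Keeping the values as strings avoids guessing at their types; they can be converted later once the meaning of each column is known.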
