Python/bs4: trying to print temperature/city from a local website

This is the HTML (the part that matters):

<div class="ca-cidade"><a href="/site/internas/conteudo/meteorologia/grafico.shtml?id=23185109">Londrina</a></div>
<ul class="ca-condicoes">
<li class="ca-cond-firs"><img src="/site/imagens/icones_condicoes/temperatura/temp_baixa.png" title="Temperatura em declínio"/><br/>23.1°C</li>
<li class="ca-cond"><img src="/site/imagens/icones_condicoes/vento/L.png"/><br/>10 km/h</li>
<li class="ca-cond"><div class="ur">UR</div><br/>54%</li>
<li class="ca-cond"><img src="/site/imagens/icones_condicoes/chuva.png"/><br/>0.0 mm</li>

3 answers

User

#1 · edited 2024-09-28 22:09:07

from bs4 import BeautifulSoup
import requests

URL = 'http://www.simepar.br/site/index.shtml'

rawhtml = requests.get(URL).text
soup = BeautifulSoup(rawhtml, 'html.parser') # parse page as html

temp_table = soup.find_all('table', {'class':'cidadeTempo'}) # get detail of table with class name cidadeTempo
for entity in temp_table:
    city_name = entity.find('h3').text # fetches name of city
    city_temp_max = entity.find('span', {'class':'tempMax'}).text # fetches max temperature
    city_temp_min = entity.find('span', {'class':'tempMin'}).text # fetches min temperature
    print("City :{} \t Max_temp: {} \t Min_temp: {}".format(city_name, city_temp_max, city_temp_min)) # prints content

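The code below retrieves the detailed temperatures shown on the right side of the page, as you asked.

[code block missing from the source]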
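A minimal sketch of how the right-hand panel could be read, assuming it is structured like the HTML fragment in the question, with each div.ca-cidade followed by its ul.ca-condicoes. This is an illustration, not the answerer's original code: the SAMPLE_HTML string, the closing </ul> added to it, and the helper name read_conditions are all assumptions.

```python
from bs4 import BeautifulSoup

# Fragment copied from the question; the closing </ul> is added here (assumption).
SAMPLE_HTML = """
<div class="ca-cidade"><a href="/site/internas/conteudo/meteorologia/grafico.shtml?id=23185109">Londrina</a></div>
<ul class="ca-condicoes">
<li class="ca-cond-firs"><img src="/site/imagens/icones_condicoes/temperatura/temp_baixa.png" title="Temperatura em declínio"/><br/>23.1°C</li>
<li class="ca-cond"><img src="/site/imagens/icones_condicoes/vento/L.png"/><br/>10 km/h</li>
<li class="ca-cond"><div class="ur">UR</div><br/>54%</li>
<li class="ca-cond"><img src="/site/imagens/icones_condicoes/chuva.png"/><br/>0.0 mm</li>
</ul>
"""


def read_conditions(html):
    """Return a list of (city, [condition strings]) tuples (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for city_div in soup.find_all("div", class_="ca-cidade"):
        city = city_div.get_text(strip=True)
        # Assumption: the conditions list is the next <ul class="ca-condicoes"> sibling.
        conditions_ul = city_div.find_next_sibling("ul", class_="ca-condicoes")
        if conditions_ul is None:
            continue  # skip a city that has no conditions list
        # Join nested strings with a space so "UR" and "54%" do not run together.
        values = [li.get_text(" ", strip=True) for li in conditions_ul.find_all("li")]
        results.append((city, values))
    return results


for city, values in read_conditions(SAMPLE_HTML):
    print("City: {} \t Conditions: {}".format(city, ", ".join(values)))
```

To run it against the live page, the string passed to read_conditions would be requests.get(URL).text, provided the site still serves this markup.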

User

#2 · edited 2024-09-28 22:09:07

You should try the CSS3 selectors in BS4; I personally find them much easier to use than find and find_all.

from bs4 import BeautifulSoup
import requests

URL = 'http://www.simepar.br/site/index.shtml'

rawhtml = requests.get(URL).text
soup = BeautifulSoup(rawhtml, 'lxml')

# soup.select returns a list of all the elements that match the CSS3 selector

# get the text inside each <a> tag inside div.ca-cidade
cities = [cityTag.text for cityTag in soup.select("div.ca-cidade > a")] 

# get the temperature inside each li.ca-cond-firs
temps = [tempTag.text for tempTag in soup.select("li.ca-cond-firs")]

# get the temperature status from the title attribute of each li.ca-cond-firs > img
tempStatus = [tag["title"] for tag in soup.select("li.ca-cond-firs > img")]

# len(cities) == len(temps) == len(tempStatus) => This is normally true.

# zip pairs the three lists item by item and stops at the shortest one
for city, temp, status in zip(cities, temps, tempStatus):
    print("City: {}, Temperature: {}, Status: {}.".format(city, temp, status))
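The code above builds three separate lists and relies on them lining up; tag["title"] also raises KeyError if an img has no title attribute. Here is a minimal sketch of a per-city variant that tolerates a missing title, run against an inline sample rather than the live site. The second entry (Curitiba, 19.4°C, no title), the shortened src and href values, and the closing </ul> tags are made-up test data, not content from the page.

```python
from bs4 import BeautifulSoup

# Inline sample: the first city mirrors the question, the second is hypothetical.
SAMPLE_HTML = """
<div class="ca-cidade"><a href="#">Londrina</a></div>
<ul class="ca-condicoes">
<li class="ca-cond-firs"><img src="temp_baixa.png" title="Temperatura em declínio"/><br/>23.1°C</li>
</ul>
<div class="ca-cidade"><a href="#">Curitiba</a></div>
<ul class="ca-condicoes">
<li class="ca-cond-firs"><img src="temp.png"/><br/>19.4°C</li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

rows = []
# "div.ca-cidade + ul.ca-condicoes" selects each list that directly follows a city div.
for ul in soup.select("div.ca-cidade + ul.ca-condicoes"):
    city_div = ul.find_previous_sibling("div", class_="ca-cidade")
    temp_li = ul.select_one("li.ca-cond-firs")
    if temp_li is None:
        continue  # no temperature entry for this city
    img = temp_li.select_one("img")
    # .get() returns the fallback instead of raising KeyError when title is absent.
    status = img.get("title", "n/a") if img is not None else "n/a"
    rows.append((city_div.get_text(strip=True), temp_li.get_text(strip=True), status))

for city, temp, status in rows:
    print("City: {}, Temperature: {}, Status: {}.".format(city, temp, status))
```

Keeping city, temperature, and status together per block means a city with a missing field cannot shift the values of the cities after it.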

User

#3 · edited 2024-09-28 22:09:07

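I'm not sure what problem you ran into with your code. When I tried it, I found that I needed to use the HTML parser ('html.parser') to parse the site successfully. I also used soup.findAll() to find the elements matching the desired class. Hopefully the following leads you to an answer: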

from bs4 import BeautifulSoup
import requests

URL = 'http://www.simepar.br/site/index.shtml'

rawhtml = requests.get(URL).text
soup = BeautifulSoup(rawhtml, 'html.parser')

rows = soup.findAll('li', {'class': 'ca-cond-firs'})  # match <li> elements with class ca-cond-firs
print(rows)
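The rows printed above are whole <li> tags. If a number is wanted, the text of each tag (for the fragment in the question, row.get_text(strip=True) gives "23.1°C") still has to be converted. A minimal sketch of that step, assuming the value is a decimal number followed by °C; parse_temperature and TEMP_PATTERN are hypothetical names introduced only for this example.

```python
import re

# Hypothetical pattern: an optional minus sign, digits, an optional decimal part
# (dot or comma), then the °C unit.
TEMP_PATTERN = re.compile(r"(-?\d+(?:[.,]\d+)?)\s*°\s*C")


def parse_temperature(text):
    """Return the temperature in °C as a float, or None if the text does not match."""
    match = TEMP_PATTERN.search(text)
    if match is None:
        return None
    # Accept a decimal comma as well as a decimal point.
    return float(match.group(1).replace(",", "."))


print(parse_temperature("23.1°C"))
```

Returning None for text that does not match lets the caller skip entries such as "10 km/h" instead of failing on them.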

This is the code I have so far:
