Create a script that downloads a value from the text of a web page

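import csv
import urllib.request

url = "http://forecast.weather.gov/product.php?site=JAN&issuedby=ORD&product=CLI&format=CI&version=5&glossary=0"
downloaded_data = urllib.request.urlopen(url)
# csv_data = csv.reader(downloaded_data)

# Join the response lines into one string (the response yields bytes, so decode each line)
row2 = ''
for row in downloaded_data:
    row2 = row2 + row.decode('utf-8', errors='replace')

# The value sits between the two labels below
start = row2.find('HIGHEST GUST SPEED ') + 21
end = row2.find('HIGHEST GUST DIRECTION', start)
print(int(row2[start:end]))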

1 answer

User

#1 · Posted 2024-05-11 06:13:04

It sounds like you want to scrape the website. In that case, I would use Python's urllib together with the Beautiful Soup library.

Edit:

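I just took a look at your link, and I don't think Beautiful Soup really helps in this case, since the page is mostly plain text. I would still use urllib, but once you have the response object you will have to parse the data yourself to find what you need. It's a bit hackish, but it should work. I'd have to go back and check exactly how I did it.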
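A minimal sketch of that kind of parsing, assuming the report contains a line where the label HIGHEST GUST SPEED is followed by whitespace and a number. The function name `extract_gust_speed` and the sample text are made up for illustration, and the real report layout may differ.

```python
import re
from typing import Optional


def extract_gust_speed(report_text: str) -> Optional[int]:
    """Return the integer that follows 'HIGHEST GUST SPEED' on the same line, or None."""
    # Only spaces or tabs may separate the label from the digits, so a
    # non-numeric value on that line yields None instead of a wrong number.
    match = re.search(r"HIGHEST GUST SPEED[ \t]+(\d+)", report_text)
    return int(match.group(1)) if match else None


# Hypothetical excerpt, not copied from a real report.
sample = (
    "HIGHEST WIND SPEED     17\n"
    "HIGHEST GUST SPEED     23\n"
    "HIGHEST GUST DIRECTION  SW (220)\n"
)
print(extract_gust_speed(sample))
```

Compared with fixed offsets from `find()`, a pattern like this tolerates a varying amount of padding between the label and the value.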

That said, you could use BeautifulSoup to strip the page down to plain text, which might make the text parsing a little easier. Anyway, just a thought!
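As a rough illustration of that idea, the sketch below uses BeautifulSoup's `get_text()` to drop the markup. It assumes the `beautifulsoup4` package is installed; the helper name `html_to_text` and the sample HTML are hypothetical.

```python
from bs4 import BeautifulSoup


def html_to_text(html: str) -> str:
    """Strip the markup and return only the text content of the page."""
    soup = BeautifulSoup(html, "html.parser")
    return soup.get_text()


# Hypothetical page fragment, used only to show the effect.
sample_html = "<html><body><pre>HIGHEST GUST SPEED     23</pre></body></html>"
print(html_to_text(sample_html))
```

The resulting string can then be searched with ordinary string methods or a regular expression.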

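Once you have the data, you can write whatever logic you need to check whether the latest value is greater than the previous one. Once you have that part worked out, go out and fetch the data. Then just create an init.d script and forget about it.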
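One possible shape for that comparison, as a sketch only: keep the previous value in a small text file and compare each new reading against it. The function name `is_higher_than_last` and the idea of a state file are assumptions for illustration, not something the original script does.

```python
from pathlib import Path


def is_higher_than_last(value: int, state_file: Path) -> bool:
    """Compare value with the one stored in state_file, then store value."""
    previous = None
    if state_file.exists():
        content = state_file.read_text().strip()
        if content:
            previous = int(content)
    # Remember this reading for the next run.
    state_file.write_text(str(value))
    # On the first run there is nothing to compare against.
    return previous is not None and value > previous
```

Whatever runs the script periodically, such as the init.d script mentioned above, would call this once per fetch with the freshly parsed value.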

# These two functions are methods excerpted from a class;
# self.s is assumed to hold the stock symbol being looked up.
import urllib.request
from bs4 import BeautifulSoup

# example urllib
def requesturl(self, url):
    # Fetch the page and return its raw body
    with urllib.request.urlopen(url) as f:
        html = f.read()
    return html

# beautiful soup
def beautifyhtml(self, html):
    # Element ids to look up, built from the lower-cased symbol
    currentprice_id = 'yfs_l84_' + self.s.lower()
    current_change_id = 'yfs_c63_' + self.s.lower()
    current_percent_change_id = 'yfs_p43_' + self.s.lower()
    find = [currentprice_id, current_change_id, current_percent_change_id]
    soup = BeautifulSoup(html, 'html.parser')
    # title of the site - has stock quote
    #title = soup.title.string
    #print(title)
    # p is where the guts of the information I would want to get
    #soup.find_all('p')
    # alt text of the image inside the "change" span
    color = soup.find_all('span', id=current_change_id)[0].img['alt']
    # drilled down version to get current price:
    found = []
    for item in find:
        found.append(soup.find_all('span', id=item)[0].string)
    found.insert(0, self.s.upper())
    found.append(color)
    return found
