Get text from a website and display it

2 answers

User

#1 · edited 2024-09-30 22:27:19

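Look into a library for fetching HTML from a URL (the script below uses urllib2) and one for parsing the HTML (it imports BeautifulSoup and HTMLParser). You can then use the following as a starting point for your script: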

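# Written for Python 2: urllib2, BeautifulSoup (version 3) and HTMLParser
# are Python 2 module names.
import sys
import time
import urllib2
import BeautifulSoup
import HTMLParser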

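def getSource(url, postdata):
    """Fetch url (POSTing postdata) and return the body, or "" on failure."""
    source = ""
    sock = None
    req = urllib2.Request(url, postdata)
    try:
        sock = urllib2.urlopen(req)
        source = sock.read()
    except urllib2.URLError as exc:
        # handle the error..
        pass
    finally:
        # sock is still None if urlopen() itself failed
        if sock is not None:
            sock.close()
    return source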

def parseSource(source):
    # parse source with BeautifulSoup/HTMLParser/etc. here...
    pass

def main():
    last_run = 0
    while True:
        t1 = time.time()
        # check if 1 hour has passed since last_run
        if t1 - last_run >= 3600:
            # the URL needs a scheme, otherwise urllib2 raises ValueError
            source = getSource("http://someurl.com", "user=me&blah=foo")
            last_run = time.time()
            parseSource(source)
        else:
            # sleep for 60 seconds and check time again.
            time.sleep(60)
    return 0

if __name__ == "__main__":
    sys.exit(main())
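
For readers on Python 3, where urllib2 no longer exists, the fetch step above could look roughly like the sketch below. The name get_source, the timeout parameter, and the assumption that the page is UTF-8 are illustrative choices, not part of the original answer.

```python
import urllib.error
import urllib.request


def get_source(url, postdata=None, timeout=10):
    """Return the response body as text, or "" if the request fails."""
    # Supplying a payload turns the request into a POST; it must be bytes.
    data = postdata.encode("utf-8") if postdata is not None else None
    req = urllib.request.Request(url, data)
    try:
        # The context manager closes the connection even if read() raises.
        with urllib.request.urlopen(req, timeout=timeout) as sock:
            raw = sock.read()
    except urllib.error.URLError:
        # handle the error (log it, retry later, ...)
        return ""
    # Assumption: the page is UTF-8; undecodable bytes are replaced.
    return raw.decode("utf-8", errors="replace")
```

The rest of the loop in main() can stay as it is, calling get_source in place of getSource.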

Here is a good article on parsing HTML with Python: parsing-html-with-python
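
As one possible way to fill in the parseSource stub, here is a minimal Python 3 sketch that uses only the standard library's html.parser to reduce a page to its visible text. The names TextExtractor and parse_source are made up for this example, and real pages may need a more forgiving parser such as BeautifulSoup.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a skipped element
        self._chunks = []      # non-empty text pieces, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def parse_source(source):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(source)
    parser.close()
    return parser.text()
```

The returned string can then be printed or searched like any other text.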

User

#2 · edited 2024-09-30 22:27:19

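I have something similar to yours, but you missed my main problem. I looked at HTMLParser and BS, but I am not sure how to do something like if ($posttext == gold) echo "gold in so and so". BS seems to be mostly about handling tags. Since Facebook posts can contain all sorts of tags, how do I search the text and return the "post"? Any ideas?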
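
One way to approach this is to strip the tags from each post first and then do an ordinary substring test on what is left. The Python 3 sketch below assumes, purely for illustration, that every post sits in a div with class "post"; the real markup will differ, and the names PostCollector and find_posts are hypothetical.

```python
from html.parser import HTMLParser


class PostCollector(HTMLParser):
    """Gather the plain text of each <div class="post"> (hypothetical markup)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.posts = []      # finished posts, tags stripped
        self._depth = 0      # <div> nesting depth inside the current post
        self._chunks = []    # text pieces of the current post

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth > 0:
            self._depth += 1          # nested <div> inside a post
        elif "post" in (dict(attrs).get("class") or "").split():
            self._depth = 1           # a new post starts here
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:      # the post's own closing </div>
                # Join the pieces and collapse runs of whitespace.
                self.posts.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth > 0:
            self._chunks.append(data)


def find_posts(source, keyword):
    """Return the text of every post that contains keyword (case-insensitive)."""
    collector = PostCollector()
    collector.feed(source)
    collector.close()
    needle = keyword.lower()
    return [post for post in collector.posts if needle in post.lower()]
```

The check from the question then becomes a loop such as: for post in find_posts(source, "gold"): print("gold in", post). Inline tags like b or i inside a post do not matter, because only the text between them is kept.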
