Saving results from a for loop to a list?

Posted 2024-09-30 00:23:23


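import urllib2
from bs4 import BeautifulSoup

url = 'http://www.millercenter.org/president/speeches'

# Download the page that lists the speeches
conn = urllib2.urlopen(url)
html = conn.read()

# Parse the HTML and collect every <a> tag
miller_center_soup = BeautifulSoup(html)
links = miller_center_soup.find_all('a')

# Print the href of each link that has one
for tag in links:
    link = tag.get('href', None)
    if link is not None:
        print link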

Here is some of my output:

/president/washington/speeches/speech-3939
/president/washington/speeches/speech-3939
/president/washington/speeches/speech-3461
https://www.facebook.com/millercenter
https://twitter.com/miller_center
https://www.flickr.com/photos/miller_center
https://www.youtube.com/user/MCamericanpresident
http://forms.hoosonline.virginia.edu/s/1535/16-uva/index.aspx?sid=1535&gid=16&pgid=9982&cid=17637
mailto:mcpa-webmaster@virginia.edu

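I'm trying to scrape all the presidential speeches from millercenter.org/president/speeches, but I'm having trouble saving the relevant speech links, from which I will then scrape the speech data. To be more specific: say I need George Washington's speech, which can be accessed at http://www.millercenter.org/president/washington/speeches/speech-3461. I just need to be able to access that URL. I'm thinking of storing the URLs of all the speeches in a list and then writing a for loop to scrape all the data.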


2 Answers

Turn it into a list:

linklist = [tag.get('href') for tag in links if tag.get('href') is not None]

Slightly optimized, so that tag.get('href') is called only once per tag:

linklist = [href for href in (tag.get('href') for tag in links) if href is not None]
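To illustrate the difference between the two comprehensions, here is a minimal sketch that counts how often `get` is called. `FakeTag` is a hypothetical stand-in written for this example and is not part of BeautifulSoup; it only mimics the `get` method of a tag.

```python
class FakeTag(object):
    """Hypothetical stand-in for a BeautifulSoup tag; counts get() calls."""
    calls = 0

    def __init__(self, attrs):
        self.attrs = attrs

    def get(self, key, default=None):
        FakeTag.calls += 1
        return self.attrs.get(key, default)


# Three fake tags, one of them without an href attribute
links = [FakeTag({'href': '/a'}), FakeTag({}), FakeTag({'href': '/b'})]

# First form: get() runs once in the filter and once more for each kept link
FakeTag.calls = 0
first = [tag.get('href') for tag in links if tag.get('href') is not None]
calls_first = FakeTag.calls

# Second form: the inner generator calls get() exactly once per tag
FakeTag.calls = 0
second = [href for href in (tag.get('href') for tag in links) if href is not None]
calls_second = FakeTag.calls
```

Both forms build the same list; the second simply avoids the repeated lookup, which is unlikely to matter much for a single page of links.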

If you're not comfortable with list comprehensions or don't want to use one, you can create a list and append to it:

all_links = []
for tag in links:
    link = tag.get('href', None)
    if link is not None:
        all_links.append(link)
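Since the output in the question mixes speech links with social media and mailto links, the list will probably need filtering before it can be looped over. The following is a rough sketch, not a definitive solution. It assumes that speech pages are exactly the links containing '/speeches/speech-' (a guess based on the sample output), and the names `speech_urls` and `BASE_URL` are made up for this example. It uses the Python 3 import of `urljoin`; in Python 2 the same function lives in the `urlparse` module.

```python
from urllib.parse import urljoin

# Page the relative links were collected from (taken from the question)
BASE_URL = 'http://www.millercenter.org/president/speeches'


def speech_urls(hrefs, base_url=BASE_URL):
    """Return absolute speech URLs, without duplicates, in first-seen order."""
    seen = set()
    result = []
    for href in hrefs:
        # Assumption: only links containing this pattern are speech pages
        if '/speeches/speech-' not in href:
            continue
        # Turn a relative link such as /president/... into a full URL
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# A few hrefs copied from the output shown in the question
sample = [
    '/president/washington/speeches/speech-3939',
    '/president/washington/speeches/speech-3939',
    '/president/washington/speeches/speech-3461',
    'https://www.facebook.com/millercenter',
    'mailto:mcpa-webmaster@virginia.edu',
]
urls = speech_urls(sample)
```

With the real data, the list of hrefs built above would be passed in place of `sample`, and the returned URLs could then be fetched one by one in a for loop.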
