Why is the output wrong?

Posted 2024-06-26 10:35:19


I am trying to append a to results, and it should print plain http:// links. I would like to be able to print the results like this: results[:4]. I appreciate your help! Thanks!

Here is the code:

# Python 2 code, using mechanize and BeautifulSoup 3
from mechanize import Browser
from BeautifulSoup import BeautifulSoup

results = []

def extract(soup):
    # The article list sits inside <section class="content left">
    section = soup.find('section', {'class' : 'content left'})
    for post in section.findAll('article'):
        header = post.find('header', {'class' : 'loop-data'})
        a = header.findAll('a', href=True)
        for x in a:
            # Collect the href of every link in the article header
            results.append(x.get('href'))
    print results

br = Browser()
url = "http://www.hotglobalnews.com/category/politics/"
page1 = br.open(url)
html1 = page1.read()
soup1 = BeautifulSoup(html1)
extract(soup1)

This is my result:

[u'http://www.hotglobalnews.com/canada-just-legalized-heroin-to-control-drug-addiction/', u'http://www.hotglobalnews.com/justin-trudeau-announces-deal-with-uber-uberweed/', u'http://www.hotglobalnews.com/donald-trump-to-legalize-marijuana-in-all-50-states/', u'http://www.hotglobalnews.com/obama-to-create-law-banning-words/', u'http://www.hotglobalnews.com/trudeau-says-trump-is-a-racist-bastard/', u'http://www.hotglobalnews.com/donald-trump-to-build-replica-of-guantanamo-bay-for-mexicans/', u'http://www.hotglobalnews.com/donald-trump-to-legalize-incest-marriages-if-elected/', u'http://www.hotglobalnews.com/justin-trudeau-to-build-statue-of-trudeau-in-2017/', u'http://www.hotglobalnews.com/donald-trump-muslims-invented-global-warming-to-destroy-u-s-economy/', u'http://www.hotglobalnews.com/isis-member-found-disguised-as-syrian-refugee-in-canada/', u'http://www.hotglobalnews.com/donald-trump-says-he-is-more-influential-than-martin-luther-king-jr/', u'http://www.hotglobalnews.com/obama-wears-fuck-trump-tshirt-to-white-house-barbecue/', u'http://www.hotglobalnews.com/donald-trump-says-he-could-shoot-somebody/', u'http://www.hotglobalnews.com/donald-trump-says-black-history-month-is-too-long/', u'http://www.hotglobalnews.com/justin-trudeau-to-ban-uber-in-canada/', u'http://www.hotglobalnews.com/justin-trudeau-accepts-comedy-central-new-years-roast/', u'http://www.hotglobalnews.com/donald-trumps-muslim-comment-disqualifies-him-from-presidency/', u'http://www.hotglobalnews.com/paris-terrorist-spotted-live-on-news-after-terror-attacks-on-paris/', u'http://www.hotglobalnews.com/anonymus-hacker-collective-declares-war-on-islamic-sate-group/', u'http://www.hotglobalnews.com/paris-attacks-over-100-killed-in-gunfire-and-blasts2/']

1 answer
Anonymous user
Reply 1 · posted 2024-06-26 10:35:19

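Your list is fine. The u prefix tells you that the string holds Unicode, but there is nothing "wrong" about that. Printing a list shows the repr of each element, which is where the prefix and the quotes come from. Printing the strings themselves will produce the result you want (provided your operating system is configured correctly to display the characters; for what looks like essentially plain ASCII strings, that should not be a problem).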
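As a rough illustration of that difference, here is a minimal sketch. The two URLs are copied from the output in the question and stand in for the scraped list; the names list_view and plain_view are made up for this example. It is written so that it runs on both Python 2 and Python 3, although only Python 2 shows the u prefix in the list view.

```python
# Sample data standing in for the scraped links (taken from the question's output).
results = [
    u'http://www.hotglobalnews.com/justin-trudeau-announces-deal-with-uber-uberweed/',
    u'http://www.hotglobalnews.com/obama-to-create-law-banning-words/',
]

# Printing a list shows the repr() of each element: quotes are included,
# and Python 2 additionally shows the u prefix for unicode strings.
list_view = repr(results)
print(list_view)

# Printing the strings themselves shows only their text, one link per line.
# results[:4] takes at most the first four entries.
plain_view = "\n".join(results[:4])
print(plain_view)
```

In the question's extract function, the same idea would mean looping over results[:4] and printing each item rather than printing the list object.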

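Python 3 changes these things somewhat, but mostly for the better. You still need to understand the difference between byte strings and Unicode strings (at least if you also have to work with byte strings), but all strings are Unicode by default, which makes a lot of sense in this day and age.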
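A small sketch of that distinction, assuming Python 3 and UTF-8 as the encoding (the answer names neither an encoding nor any variable names, so those are choices made for this example):

```python
# Python 3: a plain string literal is Unicode text (type str).
text = "http://www.hotglobalnews.com/category/politics/"

# Encoding turns text into a byte string (type bytes); UTF-8 is assumed here.
data = text.encode("utf-8")

# Decoding with the same encoding recovers the original text.
round_trip = data.decode("utf-8")

print(type(text).__name__, repr(text))   # str, shown without any prefix
print(type(data).__name__, repr(data))   # bytes, shown with a b prefix
```

So in Python 3 it is the byte string that carries a prefix in its repr, while ordinary text prints without one.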

https://nedbatchelder.com/text/unipain.html is still a great starting point, especially if you have not yet moved to Python 3.
