I'm trying to scrape data from an Amazon page, and I get an error when I try to parse the seller's location. My code is:
import urllib2
# assuming BeautifulSoup 4 (bs4), since a parser name is passed below
from bs4 import BeautifulSoup

#getting the html
request = urllib2.Request('http://www.amazon.com/gp/offer-listing/0393934241/')
opener = urllib2.build_opener()
#hiding that I'm a webscraper
request.add_header('User-Agent', 'Mozilla/5 (Solaris 10) Gecko')
#opening it up, putting into soup form
html = opener.open(request).read()
soup = BeautifulSoup(html, "html5lib")
#parsing for the seller info
sellers = soup.findAll('div', {'class' : 'a-row a-spacing-medium olpOffer'})
for eachseller in sellers:
    #parsing for price
    price = eachseller.find('span', {'class' : 'a-size-large a-color-price olpOfferPrice a-text-bold'})
    #parsing for shipping costs
    shippingprice = eachseller.find('span', {'class' : 'olpShippingPrice'})
    #parsing for condition
    condition = eachseller.find('span', {'class' : 'a-size-medium'})
    #parsing for seller name
    sellername = eachseller.find('b')
    #parsing for seller location
    location = eachseller.find('div', {'class' : 'olpAvailability'})
    #printing it all out
    print "price, " + price.string + ", shipping price, " + shippingprice.string + ", condition," + condition.string + ", seller name, " + sellername.string + ", location, " + location.string
I get an error message that points to the print statement at the end:
TypeError: coercing to Unicode: need string or buffer, NoneType found
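I know it comes from this line: location = eachseller.find('div', {'class' : 'olpAvailability'}), because the code works fine without it. I know I'm getting NoneType because that line isn't finding anything. Here is the HTML of the section I'm trying to parse: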
I don't understand what is wrong with the "location" line, or why it doesn't pull out the data I want.
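Edit: I figured it out, but I don't know why it works. If I change the print statement to use print location.find(text=True), it outputs the location I want. Hopefully this helps someone someday.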
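A likely explanation for the edit above: in BeautifulSoup 4, a tag's .string is None whenever the tag has more than one child, so concatenating it with a str raises the TypeError, while find(text=True) returns the first text node. Below is a minimal sketch of a defensive helper. The name tag_text, the "N/A" default, and the sample HTML in demo() are all made up for illustration and do not come from the real Amazon page.

```python
def tag_text(tag, default="N/A"):
    """Return the visible text of a BeautifulSoup tag, or `default`.

    Covers both failure modes: find() returned None, or the tag has no text.
    """
    if tag is None:
        return default
    # get_text() joins every text node under the tag, unlike .string,
    # which is None as soon as the tag has more than one child.
    text = tag.get_text(" ", strip=True)
    return text if text else default


def demo():
    # Requires the third-party bs4 package; the HTML is invented for illustration.
    from bs4 import BeautifulSoup

    html = '<div class="olpAvailability">In Stock. <b>Ships from NJ.</b></div>'
    soup = BeautifulSoup(html, "html.parser")
    location = soup.find("div", {"class": "olpAvailability"})
    print(location.string)     # None: the div has two children
    print(tag_text(location))  # all text under the div, joined by spaces
```

In the loop from the question, something like tag_text(location) could stand in for location.string in the print statement, and the same helper would also guard price, shippingprice, condition and sellername against a find() that matches nothing.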
It looks like you're searching for the wrong class name.
Change the following line in your code: