我在scrapy framework中遇到了这个错误。这是我的dmoz.py在蜘蛛目录下:
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from dirbot.items import Website
class DmozSpider(BaseSpider):
name = "dmoz"
allowed_domains = ["dmoz.org"]
f = open("links.csv")
start_urls = [url.strip() for url in f.readlines()]
f.close()
def parse(self, response):
hxs = HtmlXPathSelector(response)
sites = hxs.select('//ul/li')
items = []
for site in sites:
item = Website()
item['name'] = site.select('a/text()').extract()
item['url'] = site.select('a/@href').extract()
item['description'] = site.select('text()').extract()
items.append(item)
return items
运行此代码时遇到此错误:
^{pr2}$以下是我的内容链接.csv公司名称:
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
http://www.atsu.edu/
在中有80个URL链接.csv. 如何解决此错误?在
^{} is ^{} urlencoded 。您的CSV文件可能有如下行:
"
s编辑:按要求:
^{pr2}$编辑2:
上次编辑:
相关问题 更多 >
编程相关推荐