<p>Assuming you get the output of the first spider as a CSV file, the code below reads that file line by line and builds the URL lists for the second spider, whose callbacks can then scrape each page with XPath.</p>
<pre><code>import scrapy

class Stage2Spider(scrapy.Spider):
    name = 'stage2'
    allowed_domains = []
    start_urls = []

    # Build the domain and URL lists from the file written by the first spider
    with open('collecturls.csv', 'r') as read_urls:
        for url in read_urls:
            url = url.strip()
            allowed_domains.append(url[4:])  # drop the first 4 chars (a leading "www.")
            start_urls.append(url)
</code></pre>
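<p>To make the list-building step concrete, here is a standalone sketch of the same loop, assuming each line of <code>collecturls.csv</code> looks like <code>www.example.com</code> (the sample file contents and domain names here are hypothetical, and <code>StringIO</code> stands in for the real file):</p>
<pre><code>from io import StringIO

# Hypothetical stand-in for collecturls.csv written by the first spider
fake_csv = StringIO("www.example.com\nwww.scrapy.org\n")

allowed_domains, start_urls = [], []
for url in fake_csv:
    url = url.strip()
    allowed_domains.append(url[4:])  # drop the leading "www."
    start_urls.append(url)

print(allowed_domains)  # ['example.com', 'scrapy.org']
print(start_urls)       # ['www.example.com', 'www.scrapy.org']
</code></pre>
<p>Note that the <code>url[4:]</code> slice only works if every line really starts with a 4-character prefix such as <code>www.</code>; for mixed URL formats, <code>urllib.parse.urlparse</code> would be a more robust way to extract the domain.</p>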
<p>Hope this helps.</p>