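<p>First, I select only the <code>a</code> elements (without <code>text()</code> and <code>extract()</code>). Then I use a <code>for</code> loop to apply <code>text()</code> and <code>extract()</code> to every <code>a</code> separately, and <code>join()</code> to concatenate the resulting pieces into a single title string.</p>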
<pre><code>import scrapy
from scrapy.crawler import CrawlerProcess


class MySpider(scrapy.Spider):

    name = 'myspider'

    start_urls = ['https://www.indeed.cl/trabajo?q=Data%20scientist&l=']

    def parse(self, response):
        print('url:', response.url)

        # Select the <a> elements themselves, not their text nodes
        results = response.xpath('//h2[@class="jobtitle"]/a')
        print('number:', len(results))

        for item in results:
            # Collect every text node under this <a> and join them into one title
            title = ''.join(item.xpath('.//text()').extract())
            print('title:', title)


# Run the spider as a standalone script, without a Scrapy project.
# Titles are only printed; nothing is written to a file.
c = CrawlerProcess({
    'USER_AGENT': 'Mozilla/5.0',
})
c.crawl(MySpider)
c.start()
</code></pre>
<p>Result:</p>
^{pr2}$
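<p>To see offline why the per-element <code>join()</code> matters, here is a minimal sketch that uses only the standard library's <code>xml.etree.ElementTree</code> instead of Scrapy. The markup is hypothetical and made up for illustration (it is not Indeed's real HTML), and <code>itertext()</code> stands in for Scrapy's <code>.//text()</code> plus <code>extract()</code>. It shows that a title split by a nested tag only comes back whole when all text nodes under each <code>a</code> are joined.</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup invented for this sketch.
# The first title is split across a nested <b> tag.
markup = (
    '<div>'
    '<h2 class="jobtitle"><a href="/job/1">Senior <b>Data</b> Scientist</a></h2>'
    '<h2 class="jobtitle"><a href="/job/2">Data Analyst</a></h2>'
    '</div>'
)

root = ET.fromstring(markup)

# Step 1: select the <a> elements themselves, not their text.
links = root.findall(".//h2[@class='jobtitle']/a")

# Reading only the direct text of the first <a> stops at the nested tag.
first_only = links[0].text

# Step 2: for every <a>, gather all text nodes below it and join them.
titles = []
for a in links:
    titles.append(''.join(a.itertext()))

print('direct text only:', repr(first_only))
for title in titles:
    print('title:', title)
```

<p>In the spider above, the same step is <code>''.join(item.xpath('.//text()').extract())</code>, applied to each selected <code>a</code>.</p>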