Scrapy and captcha



I am using Scrapy to submit a form on the site <a href="https://www.barefootstudent.com/jobs" rel="nofollow">https://www.barefootstudent.com/jobs</a> (any link to a job page, for example <a href="http://www.barefootstudent.com/los_angeles/jobs/full_time/full_time_nanny_needed_in_venice_217021" rel="nofollow">http://www.barefootstudent.com/los_angeles/jobs/full_time/full_time_nanny_needed_in_venice_217021</a>). My Scrapy bot logs in successfully, but I cannot get past the captcha. To submit the form I use scrapy.FormRequest.from_response:

<pre><code>frq = scrapy.FormRequest.from_response(
    response,
    formdata={
        'message': 'itttttttt',
        'security': captcha,
        'name': 'fx',
        'category_id': '2',
        'email': 'ololo%40gmail.com',
        'item_id': '216640_2',
        'location': '18',
        'send_message': 'Send%20Message',
    },
    callback=self.afterForm,
)
yield frq
</code></pre>

I want to load the captcha image from this page and enter it manually while the script is running, and so on.

^{pr2}$

I tried

<pre><code>urllib.urlretrieve(captcha, "./captcha.jpg")
</code></pre>

but this method downloads the wrong captcha (the site rejects my input). I tried calling urllib.urlretrieve repeatedly within a single run of the script, and each time it returned a different captcha :(

After that I tried using ImagesPipeline. My problem there is that the item is only returned (and the image downloaded) after the function has finished executing, even though I use yield.

<pre><code>item = BfsItem()
item['image_urls'] = [captcha]
yield item

# The prompt appears before the pipeline has downloaded the image.
captcha = raw_input("put captcha in manually>")
frq = scrapy.FormRequest.from_response(
    response,
    formdata={
        'message': 'itttttttt',
        'security': captcha,
        'name': 'fx',
        'category_id': '2',
        'email': 'ololo%40gmail.com',
        'item_id': '216640_2',
        'location': '18',
        'send_message': 'Send%20Message',
    },
    callback=self.afterForm,
)
yield frq
</code></pre>

At the moment my script asks for input, the image has not been downloaded yet!

How can I modify my script so that FormRequest is called after I have entered the captcha manually?

Thanks a lot!
