- python - How do I stop scrapy-redis from running idle?
- yhuxAvNbtxUM
With scrapy-redis, everything stored under the xxx:requests key in Redis has already been crawled, but the program keeps running. How can I make it stop automatically instead of idling forever?

    2017-07-03 09:17:06 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
    2017-07-03 09:18:06 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)

The program can be stopped with engine.close_spider(spider, 'reason'), called from the scheduler's next_request:

    def next_request(self):
        block_pop_timeout = self.idle_before_close
        request = self.queue.pop(block_pop_timeout)
        if request and self.stats:
            self.stats.inc_value('scheduler/dequeued/redis', spider=self.spider)
        if request is None:
            # Nothing left in the Redis queue: ask the engine to close the spider.
            self.spider.crawler.engine.close_spider(self.spider, 'queue is empty')
        return request

There is one more thing I don't understand: when the spider is closed through engine.close_spider(spider, 'reason'), several errors are raised before it finally shuts down.

    # Normal shutdown
    2017-07-03 18:02:38 [scrapy.core.engine] INFO: Closing spider (queue is empty)
    2017-07-03 18:02:38 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
    {'finish_reason': 'queue is empty',
     'finish_time': datetime.datetime(2017, 7, 3, 10, 2, 38, 616021),
     'log_count/INFO': 8,
     'start_time': datetime.datetime(2017, 7, 3, 10, 2, 38, 600382)}
    2017-07-03 18:02:38 [scrapy.core.engine] INFO: Spider closed (queue is empty)

After that, a few errors still show up before the spider is really closed. Could it be that several threads start crawling together when the spider launches, one of them closes the spider, and the others then fail because they can no longer find it?

    Unhandled Error
    Traceback (most recent call last):
      File "D:/papp/project/launch.py", line 37, in <module>
        process.start()
      File "D:\ProgramFiles\python3\lib\site-packages\scrapy\crawler.py", line 285, in start
        reactor.run(installSignalHandlers=False)  # blocking call
      File "D:\ProgramFiles\python3\lib\site-packages\twisted\internet\base.py", line 1243, in run
        self.mainLoop()
      File "D:\ProgramFiles\python3\lib\site-packages\twisted\internet\base.py", line 1252, in mainLoop
        self.runUntilCurrent()
    --- <exception caught here> ---
      File "D:\ProgramFiles\python3\lib\site-packages\twisted\internet\base.py", line 878, in runUntilCurrent
        call.func(*call.args, **call.kw)
      File "D:\ProgramFiles\python3\lib\site-packages\scrapy\utils\reactor.py", line 41, in __call__
        return self._func(*self._a, **self._kw)
      File "D:\ProgramFiles\python3\lib\site-packages\scrapy\core\engine.py", line 137, in _next_request
        if self.spider_is_idle(spider) and slot.close_if_idle:
      File "D:\ProgramFiles\python3\lib\site-packages\scrapy\core\engine.py", line 189, in spider_is_idle
        if self.slot.start_requests is not None:
    builtins.AttributeError: 'NoneType' object has no attribute 'start_requests'
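
A possible reading of the traceback, offered as a hypothesis rather than a confirmed diagnosis: Scrapy runs its callbacks on a single-threaded Twisted reactor, so competing threads are unlikely to be the cause. The failing frame is the engine's own _next_request, which still reads self.slot after the scheduler has asked for the spider to be closed; that suggests close_spider finished while the engine was in the middle of that call. One way to avoid closing from inside the scheduler is to leave next_request unmodified and close from a spider_idle signal handler instead. Below is a minimal sketch of that idea. The class name CloseWhenIdle and the setting CLOSE_IDLE_AFTER are made up for illustration, and the sketch assumes the scrapy-redis spider keeps itself alive while idle, so the signal keeps firing.

```python
class CloseWhenIdle:
    """Hypothetical Scrapy extension: close the spider after several
    consecutive spider_idle signals with no request scheduled in between."""

    def __init__(self, crawler, max_idle=5):
        self.crawler = crawler
        self.max_idle = max_idle        # idle signals to tolerate before closing
        self.idle_count = 0
        self.close_requested = False    # guard so close_spider is requested once

    @classmethod
    def from_crawler(cls, crawler):
        # Imported here so the counting logic can be exercised without Scrapy.
        from scrapy import signals

        # CLOSE_IDLE_AFTER is an assumed setting name, not a Scrapy built-in.
        ext = cls(crawler, crawler.settings.getint("CLOSE_IDLE_AFTER", 5))
        crawler.signals.connect(ext.spider_idle, signal=signals.spider_idle)
        crawler.signals.connect(ext.request_scheduled,
                                signal=signals.request_scheduled)
        return ext

    def request_scheduled(self, request, spider):
        # New work arrived, so the spider is not idle any more.
        self.idle_count = 0

    def spider_idle(self, spider):
        if self.close_requested:
            return
        self.idle_count += 1
        if self.idle_count >= self.max_idle:
            self.close_requested = True
            self.crawler.engine.close_spider(spider, "queue is empty")
```

To try it, the class would be registered under the EXTENSIONS setting of the project (the module path depends on where the file is placed), and the close_spider call would be removed from next_request. Counting several idle signals rather than closing on the first one gives other workers time to push new requests into Redis before this one gives up.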