Implement queues to manage concurrency between spiders in Scrapyd

Is there any way for Scrapyd to create queues of spiders, so that when I submit many spiders (with different purposes) I can prioritize them or limit the concurrency between them? Today, all the spiders I submit run in the order set by the Scrapyd server.

    
asked by anonymous 09.01.2015 / 21:08

1 answer


Well, if you only need simple priorities, one option is to use Scrapyd's priority parameter. It is not documented, but it is implemented in the Scrapyd source: it is basically a simple priority queue on top of SQLite.

To use it, just pass the argument priority=NUMBER when calling the /schedule.json API. The default value is 0; use a higher value for higher priority.
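As a rough sketch of the call described above, the snippet below builds and sends a POST to /schedule.json with a priority field, using only the Python standard library. The server address (http://localhost:6800, Scrapyd's usual default), the project name and the spider names are assumptions for illustration, not values from the question.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: Scrapyd is listening on its usual default address.
SCRAPYD_URL = "http://localhost:6800"


def build_schedule_request(project, spider, priority=0, base_url=SCRAPYD_URL):
    """Build (but do not send) the POST request for /schedule.json."""
    form = urlencode({
        "project": project,
        "spider": spider,
        "priority": priority,  # 0 is the default; higher values run first
    }).encode("ascii")
    return Request(f"{base_url}/schedule.json", data=form, method="POST")


def schedule(project, spider, priority=0, base_url=SCRAPYD_URL):
    """Send the request and return Scrapyd's JSON reply as a dict."""
    request = build_schedule_request(project, spider, priority, base_url)
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Scrapyd server running, something like schedule("myproject", "urgent_spider", priority=10) followed by schedule("myproject", "background_spider") should place the first job ahead of the second in that project's pending queue. Note that this only affects the order in which pending jobs are picked up; it does not by itself limit how many spiders run at the same time.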

If you need a more complex queueing scheme, you may have to implement your own solution. Alternatively, you could use Scrapy Cloud from Scrapinghub and crawl using the queues of the Hub Crawl Frontier.

[*] For full transparency: I work at Scrapinghub.

    
11.01.2015 / 05:28