Deploying Spiders
This section describes the different options you have for deploying your Scrapy spiders to run them on a regular basis. Running Scrapy spiders on your local machine is very convenient for the (early) development stage, but not so much when you need to execute long-running spiders or move spiders to run in production continuously. This is where the solutions for deploying Scrapy spiders come in.
Popular choices for deploying Scrapy spiders are:
- Scrapyd (open source)
- Scrapy Cloud (cloud-based)
Deploying to a Scrapyd Server
Scrapyd is an open source application to run Scrapy spiders. It provides a server with an HTTP API, capable of running and monitoring Scrapy spiders.
To deploy spiders to Scrapyd, you can use the scrapyd-deploy tool provided by the scrapyd-client package. Please refer to the scrapyd-deploy documentation for more information.
Scrapyd is maintained by some of the Scrapy developers.
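As an illustration of the HTTP API mentioned above, the following minimal sketch starts a spider run through Scrapyd's schedule.json endpoint. It assumes that Scrapyd is listening on its default address (http://localhost:6800) and that the project has already been deployed; the helper names and the project and spider names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: Scrapyd is running locally on its default port.
SCRAPYD_URL = "http://localhost:6800"


def build_schedule_request(project, spider, base_url=SCRAPYD_URL):
    """Build a POST request for Scrapyd's schedule.json endpoint."""
    # schedule.json expects form-encoded "project" and "spider" parameters.
    data = urlencode({"project": project, "spider": spider}).encode("utf-8")
    url = f"{base_url.rstrip('/')}/schedule.json"
    return Request(url, data=data, method="POST")


def schedule_spider(project, spider, base_url=SCRAPYD_URL):
    """Send the request and return Scrapyd's decoded JSON response."""
    request = build_schedule_request(project, spider, base_url)
    with urlopen(request) as response:
        return json.load(response)
```

With a Scrapyd server running, schedule_spider("myproject", "myspider") would send the request and return the parsed JSON reply. Building the request in a separate function makes it possible to inspect it without contacting a server.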
Deploying to Scrapy Cloud
Scrapy Cloud is a hosted, cloud-based service by Scrapinghub, the company behind Scrapy.
Scrapy Cloud removes the need to set up and monitor servers and provides a UI to manage spiders and review scraped items, logs and stats.
To deploy spiders to Scrapy Cloud you can use the shub command line tool. Please refer to the Scrapy Cloud documentation for more information.
Scrapy Cloud is compatible with Scrapyd and one can switch between them as needed - the configuration is read from the scrapy.cfg file just like scrapyd-deploy.
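To make the role of scrapy.cfg more concrete, the sketch below parses a sample file and lists its deploy targets. It assumes the [deploy] and [deploy:NAME] section layout used by scrapyd-deploy; the target name, URL and project name are made up, and the deploy_targets helper is hypothetical. It only illustrates the file format and is not how either tool reads its configuration internally.

```python
from configparser import ConfigParser

# Hypothetical scrapy.cfg contents; target name, URL and project are made up.
SAMPLE_CFG = """
[settings]
default = myproject.settings

[deploy:local]
url = http://localhost:6800/
project = myproject
"""


def deploy_targets(cfg_text):
    """Return {target_name: options} for each [deploy] or [deploy:NAME] section."""
    parser = ConfigParser()
    parser.read_string(cfg_text)
    targets = {}
    for section in parser.sections():
        if section == "deploy" or section.startswith("deploy:"):
            # A bare [deploy] section has no name; label it "default" here.
            name = section.partition(":")[2] or "default"
            targets[name] = dict(parser[section])
    return targets
```

Calling deploy_targets(SAMPLE_CFG) yields a single target named "local" with its url and project options. Switching deployment destinations then amounts to pointing a target at a different URL.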