54

I don't want to crawl simultaneously and get blocked. I would like to send one request per second.

Ruvee
nizam.sp
  • [Here](http://stackoverflow.com/questions/30404364/scrapy-delay-request) you have an explicit solution. – Godoy Jun 15 '16 at 17:56

6 Answers

82

There is a setting for that:

DOWNLOAD_DELAY

Default: 0

The amount of time (in secs) that the downloader should wait before downloading consecutive pages from the same website. This can be used to throttle the crawling speed to avoid hitting servers too hard.

DOWNLOAD_DELAY = 0.25    # 250 ms of delay

Read the docs: https://doc.scrapy.org/en/latest/index.html
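Applied to the question's one-request-per-second goal, a minimal settings.py sketch might look like this (note that RANDOMIZE_DOWNLOAD_DELAY, which is on by default, multiplies the delay by a random factor between 0.5 and 1.5, so it is disabled here for an exact delay):

```python
# settings.py -- throttle to roughly one request per second per site
DOWNLOAD_DELAY = 1                # seconds between requests to the same site
RANDOMIZE_DOWNLOAD_DELAY = False  # disable the default 0.5x-1.5x jitter
```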

Done Data Solutions
warvariuc
  • If you put `DOWNLOAD_DELAY=1`, I don't think you can get 60 pages in one minute. It is also constrained by the download speed and all kinds of overhead. I would say it only gives you an upper limit on the scraping rate so you don't hit target sites too hard. – B.Mr.W. Aug 27 '14 at 18:46
20

You can also set the `download_delay` attribute on the spider if you don't want a global download delay. See http://doc.scrapy.org/en/latest/faq.html#what-does-the-response-status-code-999-means
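A minimal sketch of that per-spider attribute (the class name, spider name, and URL below are hypothetical; in a real project the class would subclass `scrapy.Spider`):

```python
# Sketch only -- shown as a plain class so it runs without Scrapy installed.
class OneSecondSpider:  # would subclass scrapy.Spider in a real project
    name = "one_second"                   # hypothetical spider name
    start_urls = ["https://example.com"]  # placeholder URL
    download_delay = 1  # per-spider override of the global DOWNLOAD_DELAY
```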

Mikhail Korobov
11
from scrapy import Spider

class S(Spider):
    name = "s"  # every spider needs a name
    rate = 1    # maximum pages per second

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.download_delay = 1 / float(self.rate)

`rate` sets the maximum number of pages that can be downloaded per second.
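The rate-to-delay arithmetic can be checked standalone (the class name and rate value below are assumed for illustration, not from the answer):

```python
# Illustrates the rate -> download_delay computation without needing Scrapy.
class RateLimited:
    rate = 4  # assume at most 4 pages per second

    def __init__(self):
        # Scrapy treats download_delay as seconds to wait between requests
        self.download_delay = 1 / float(self.rate)

print(RateLimited().download_delay)  # 0.25
```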

Anton Kolenkov
Yan.Zero
  • Can you please add a description or some explanation of what this does? As it stands I have to vote this answer for deletion. – Numeron Aug 20 '14 at 03:48
  • @AndréYuhai, I can't find an official doc for that, but it's in the source: the `_get_concurrency_delay` function in `scrapy/core/downloader/__init__.py`. – PaleNeutron Jun 13 '21 at 15:19
9

Besides DOWNLOAD_DELAY, you can also use Scrapy's AutoThrottle feature: https://doc.scrapy.org/en/latest/topics/autothrottle.html

It varies the delay between requests within bounds given in the settings file. If you set both the start and max delay to 1, it will wait 1 second on each request.

Its original purpose is to vary the delay so that your bot is harder to detect.

You just need to set it in settings.py as follows:

AUTOTHROTTLE_ENABLED = True
AUTOTHROTTLE_START_DELAY = 1
AUTOTHROTTLE_MAX_DELAY = 3
Mehmet Kurtipek
7

Delays can be set in two ways:

We can specify the delay while running the crawler, e.g. `scrapy crawl sample --set DOWNLOAD_DELAY=3` (which means a 3-second delay between two requests).

Or we can specify it globally in settings.py: `DOWNLOAD_DELAY = 3`

By default, Scrapy uses no delay between two requests (DOWNLOAD_DELAY defaults to 0, as quoted from the docs above).

Niranjan Sagar
6

If you want to keep a download delay of exactly one second, setting DOWNLOAD_DELAY=1 is the way to do it.

But Scrapy also has a feature to set download delays automatically, called AutoThrottle. It sets delays based on the load of both the Scrapy server and the website you are crawling. This works better than setting an arbitrary delay.

Read further about this at http://doc.scrapy.org/en/1.0/topics/autothrottle.html#autothrottle-extension. I've crawled more than 100 domains and have not been blocked with AutoThrottle turned on.

Jeff P Chacko