
Are there any general rules on when a website sends out a TCP reset, triggering the Connection reset by peer error?

Like

  • too many open connections
  • too high bandwidth use
  • connected for too long
  • …?

I'm pretty certain that there is no law governing this and that different websites/web developers have different tastes, but I would be interested if there are some general rule sets (from websites or textbooks on the subject or what you have been taught in school/at work) that are mostly followed.

The reason I'm asking, of course, is that I want to get around being blocked…

I'm downloading some government data that is freely available but lacks an API or anything similar. The two official ways to get it are either clicking around in some web GIS a few thousand times, or going down the Kafkaesque path of explaining to various levels of clerks the concepts of databases, CSV files and zip files, and that you can't (and wouldn't need to, if they just did what you are trying to explain to them) simply drive to their agency with a "giant" hard drive. So I'm trying to go the most resource-saving way for everyone involved…

JC_CL

1 Answer


A website is not "sending" a "Connection reset by peer" error. This error is generated by the OS kernel on the client side if it receives a TCP reset for an active connection. There are many reasons this TCP reset might be sent. A TCP reset might be sent by design as part of some kind of load limit, for example to limit the number of connections from the same IP address within a specific time as a form of DoS protection, to restrict data scraping, or to enforce some kind of fair use. There is no general rule or even law for these kinds of explicit limits.
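To make the point concrete that the error is raised locally by the client's kernel, here is a minimal sketch that reproduces it on the loopback interface. The helper names `reset_on_accept` and `saw_reset` are made up for illustration, and the `struct.pack("ii", ...)` layout for `SO_LINGER` is an assumption that holds on Linux and macOS (Windows uses a different layout). The "server" closes the connection abortively, which makes its kernel send an RST instead of a normal FIN.

```python
import socket
import struct
import threading


def reset_on_accept(listener: socket.socket) -> None:
    """Accept one connection and close it abortively (RST instead of FIN)."""
    conn, _ = listener.accept()
    # SO_LINGER with l_onoff=1 and l_linger=0 turns close() into an abortive
    # close, so the kernel sends a TCP RST to the peer.
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    conn.close()


def saw_reset() -> bool:
    """Return True if the client side observed 'Connection reset by peer'."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    server = threading.Thread(target=reset_on_accept, args=(listener,))
    server.start()
    client = socket.create_connection(listener.getsockname())
    try:
        server.join()      # the server side has now aborted the connection
        client.recv(1024)  # the client's kernel reports the RST on this call
        return False       # an orderly close would end up here instead
    except ConnectionResetError:
        return True
    finally:
        client.close()
        listener.close()


if __name__ == "__main__":
    print("reset observed:", saw_reset())
```

Nothing in the application protocol carries the error text. Python simply maps the `ECONNRESET` result of the system call to `ConnectionResetError`.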

A TCP reset might also be caused by the application being overloaded, the application crashing, the system running out of resources, and so on.

A TCP reset will also happen if the client writes to a connection which the server already considers closed. This can happen, for example, with HTTP keep-alive: the server might close the connection due to inactivity at any time after the HTTP response was sent. If the client sends a new request on the same connection at the same time the server closes it, the server will reject the new request (since the connection is already closed on the server end) and will send a TCP RST, causing a "Connection reset by peer" error at the client. The client needs to handle this situation properly by creating a new connection and sending the request again (provided that the request is idempotent, i.e. repeating it does not change state on the server a second time).
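The retry behaviour described above could be sketched as follows. This is only an illustration under assumptions: `retry_on_reset`, `attempts` and `delay` are hypothetical names, the caller is expected to pass a callable that opens a fresh connection on every call, and it should only be used for idempotent requests such as plain downloads. Higher-level HTTP libraries may wrap the reset in their own exception type, in which case that type would have to be caught instead.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_reset(send_request: Callable[[], T],
                   attempts: int = 3,
                   delay: float = 0.5) -> T:
    """Call send_request again if the connection was reset by the peer.

    send_request must open a new connection each time it is called and
    must be safe to repeat (idempotent).
    """
    for attempt in range(1, attempts + 1):
        try:
            return send_request()
        except ConnectionResetError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the reset
            # Back off a little longer after each failure, so that a server
            # which resets on purpose (rate limiting) is not hammered.
            time.sleep(delay * attempt)
    raise ValueError("attempts must be at least 1")
```

The growing pause between attempts is a design choice, not a requirement: if the reset came from a keep-alive race, an immediate retry on a new connection would usually succeed as well.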

Steffen Ullrich
  • Seems like I got the background wrong. I'll edit my question. I guess you already covered parts of the reasons, but I'd be interested if there's any hard guidelines for that. – JC_CL Jul 02 '18 at 14:15
  • Thanks, that's very helpful! But I'd love to have some resources to learn more on the subject, if those exist, so some more answers could be interesting. – JC_CL Jul 03 '18 at 06:42
  • @JC_CL: a [simple search](https://www.google.com/search?q=when+does+a+tcp+connection+reset+happen) shows for example the highly voted question [What causes a TCP/IP reset (RST) flag to be sent?](https://stackoverflow.com/questions/251243/what-causes-a-tcp-ip-reset-rst-flag-to-be-sent). – Steffen Ullrich Jul 03 '18 at 07:29