
Save the following HTML as a local file (something like /tmp/foo.html), then open it in Firefox (I'm on 49.0.2).

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
</head>
<body>
<script src="http://localhost:1234/a.js"></script>
<script src="http://localhost:1234/b.js"></script>
<script src="http://localhost:1234/c.js"></script>
<script src="http://localhost:1234/d.js"></script>
<script src="http://localhost:1234/e.js"></script>
</body>
</html>

I don't have a server running on port 1234, so the requests don't even successfully connect.

The behavior I'd expect here is for all the requests to fail, and be done with it.

What actually happens in Firefox is all 5 .js files are requested in parallel, they fail to connect, then the last 4 get re-requested in serial. Like so:

[Screenshot: Firefox network panel showing all five requests failing in parallel, then b.js through e.js re-requested one at a time]

Why?

If I boot a server on 1234 that always 404s, the behavior is the same.

This particular example doesn't reproduce the same behavior in Chrome, but other similar examples are how I originally stumbled upon this behavior.

EDIT: Here's how I tested that this happens with 404s as well.

$ cd /tmp
$ mkdir empty
$ cd empty
$ python -m SimpleHTTPServer 1234
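
(On Python 3, the equivalent command is python3 -m http.server 1234.)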

Then I reloaded the page in Firefox. It shows this:

[Screenshot: Firefox network panel showing the same request pattern against the 404ing server]

The server actually sees all those requests too (the first 5 arrive out of order because they're requested in parallel, but the last 4 are always b, c, d, e, since they get re-requested in serial).

127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /d.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /c.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /b.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /a.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /e.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /b.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /c.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /d.js HTTP/1.1" 404 -
127.0.0.1 - - [02/Nov/2016 13:25:40] code 404, message File not found
127.0.0.1 - - [02/Nov/2016 13:25:40] "GET /e.js HTTP/1.1" 404 -
Jamie Wong
  • It's trying to see if you REALLY don't have a server running on port 1234 or if there's an intermittent network problem. It really can't magically know that there really is NO server running. – slebetman Nov 02 '16 at 02:31
  • @slebetman It does that even if there is a server that responds with 404. Or with 500. – Jamie Wong Nov 02 '16 at 02:36
  • I've tried this in Firefox, Chrome, Edge and IE - not one of those browsers tries more than once if the server responds with 404 - so that comment is wrong – Jaromanda X Nov 02 '16 at 03:03
  • @JaromandaX Screenshot of proof + repro instructions for it 404'ing – Jamie Wong Nov 02 '16 at 20:24

1 Answer


This has to do with edge cases that can arise from parallel resource loading, where JavaScript is expected to block other resources from loading.

This behavior becomes clearer when you add a delay to the error responses. Here is a screenshot of the Firefox network panel with a 1-second delay added to each request.

[Screenshot: Firefox network panel with a 1-second delay on each failing request]
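
For anyone who wants to reproduce this, here is a minimal sketch of such a delayed-404 server. It is my own approximation using Python 3's http.server, not necessarily the exact setup behind the screenshot; port 1234 matches the question's HTML.

import time
from http.server import BaseHTTPRequestHandler, HTTPServer

class Delayed404Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hold every request for one second, then fail it, so the parallel
        # requests and the serial retries are easy to see in the network panel.
        time.sleep(1)
        self.send_error(404, "File not found")

HTTPServer(("127.0.0.1", 1234), Delayed404Handler).serve_forever()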

As we can see, all 5 scripts were requested in parallel, as modern browsers do to reduce loading times.

However, except for the first one, the scripts that returned a 404 were re-requested, not in parallel but in series. This is almost certainly done to maintain backwards compatibility with edge cases of legacy browser behavior.

Historically, a browser would load and execute one script at a time. Modern browsers load them in parallel while still maintaining execution order.

So why might this matter?

Imagine the first script's request changed the application state, perhaps setting a cookie or something similar to authenticate further requests. With parallel loading, the later scripts would be requested before that state was changed and, assuming the web application rejects such requests, they would fail with an error.

So the only way to ensure the other resources don't fail simply because the script had no chance to change the state before they were requested is to re-request those resources after the script has run.
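
To make that edge case concrete, here is a hypothetical Python 3 server where /a.js hands out a cookie that the later scripts require. The parallel requests for b.js through e.js go out before a.js's response (and its cookie) arrives, so they 404, while serial retries made after a.js has loaded would succeed. The cookie name and the delay are my own illustrative choices.

import time
from http.server import BaseHTTPRequestHandler, HTTPServer

SCRIPT = b"console.log('loaded ' + document.currentScript.src);"

class StatefulHandler(BaseHTTPRequestHandler):
    def send_script(self, extra_headers=()):
        self.send_response(200)
        self.send_header("Content-Type", "application/javascript")
        self.send_header("Content-Length", str(len(SCRIPT)))
        for name, value in extra_headers:
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(SCRIPT)

    def do_GET(self):
        if self.path == "/a.js":
            # The first script "changes application state": its response
            # sets a cookie that all later requests must present.
            time.sleep(1)
            self.send_script(extra_headers=[("Set-Cookie", "auth=1")])
        elif "auth=1" in self.headers.get("Cookie", ""):
            self.send_script()
        else:
            # Requests fired in parallel with a.js carry no cookie yet,
            # so they fail; retries made after a.js has loaded succeed.
            self.send_error(404, "File not found")

HTTPServer(("127.0.0.1", 1234), StatefulHandler).serve_forever()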

In fact, this re-requesting behavior is not limited to scripts; it can also affect images that error after a script tag that was loaded in parallel.

[Screenshot: network panel showing failed images being re-requested after a parallel script load]

Presumably, because those images may have failed only because a prior script had not yet executed, they are all re-requested, this time in parallel.
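
A hypothetical way to reproduce the image case, again as a Python 3 sketch (the file names and the one-second delay are my own choices): the page loads one slow but successful script followed by images that always 404.

import time
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"""<!DOCTYPE html>
<html>
<head><meta charset="utf-8"></head>
<body>
<script src="/slow.js"></script>
<img src="/one.png">
<img src="/two.png">
</body>
</html>
"""

class ImageCaseHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/":
            self.send_response(200)
            self.send_header("Content-Type", "text/html; charset=utf-8")
            self.send_header("Content-Length", str(len(PAGE)))
            self.end_headers()
            self.wfile.write(PAGE)
        elif self.path == "/slow.js":
            # A slow but successful script, so the image requests go out
            # while a script is still loading.
            time.sleep(1)
            body = b"console.log('slow.js loaded');"
            self.send_response(200)
            self.send_header("Content-Type", "application/javascript")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # The images always fail; watch the network panel for the
            # failed image requests being re-issued after the script runs.
            self.send_error(404, "File not found")

HTTPServer(("127.0.0.1", 1234), ImageCaseHandler).serve_forever()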

Interestingly, I can't find anything directly about this in the spec, but this section from the WHATWG HTML Living Standard suggests this behavior may actually violate it.

For classic scripts, if the async attribute is present, then the classic script will be fetched in parallel to parsing and evaluated as soon as it is available (potentially before parsing completes). If the async attribute is not present but the defer attribute is present, then the classic script will be fetched in parallel and evaluated when the page has finished parsing. If neither attribute is present, then the script is fetched and evaluated immediately, blocking parsing until these are both complete.

If parsing were actually blocked, then it would seem the subsequent script tags and images should not even have been read yet, and so should not have been able to start loading. I suspect the browsers reconcile this by not making the subsequent tags available in the DOM until after execution.

Note:

The exact behavior you will see in these cases may vary a bit. Only resources that were actually requested in parallel with a script will be reloaded; if an image errors later, but was not requested while a script was loading, there is no need to re-request it. Additionally, it appears Chrome only triggers this behavior if the potentially state-changing script does not itself error, whereas Firefox triggers it even if the script does error.

Alexander O'Mara
  • That's what I figured, but I was surprised that every browser I tried had *some* variety of this behavior, but not consistently. Is this actually in a spec somewhere, or is this just some agreed upon thing that browsers do for back-compat? – Jamie Wong Nov 09 '16 at 01:08
  • @JamieWong I haven't been able to find anything about it in the spec, just a section that seems to indicate the spec does not allow this. I suspect one vendor started doing it, and others followed suit with their own rules. – Alexander O'Mara Nov 09 '16 at 01:27
  • The last note is interesting -- to make sure I understand -- you're saying that Chrome only triggers if at least one of the scripts succeeds, in which case it'll refire all of the subsequently failing requests that were originally requested in parallel? – Jamie Wong Nov 09 '16 at 02:04
  • @JamieWong Yeah, or more-specifically, it will retry failed requests that were started while a script was loading, if that script succeeds. – Alexander O'Mara Nov 09 '16 at 02:08
  • Re "parsing being blocked", it's possible it's not an explicit violation of the spec because these requests might not come from the parser, they might come from the preload scanner, so you never build an invalid parse tree, but you still send the requests. Even when requested in parallel, the scripts are still executed sequentially (so execution of script 2 blocks on execution of script 1 even if script 2 finishes downloading before script 1). – Jamie Wong Nov 16 '16 at 18:22