
This question considers only IPv4. While looking for a way to block a certain website without a browser plugin, I found this question: Blocking Websites with /etc/hosts. According to the accepted answer, just adding

0.0.0.0  <domain>

to /etc/hosts blocks access to the domain <domain>. This worked like a charm, but why does it work?

Experiments

Assume a website X.com loads some scripts from Y.com, and I added 0.0.0.0 Y.com to /etc/hosts to block those scripts from loading.

  1. When I directly type Y.com or 0.0.0.0 in the browser's address bar, it is routable and leads me to my local website localhost:8080.

  2. However, when I access X.com, Y.com is totally blocked. By "totally" I mean that the failure is not a Timeout, a Not Found, or a Forbidden; as far as I can tell from the Network Monitor in Firefox, the browser doesn't even try to access Y.com in the first place†1. This suggests that 0.0.0.0 means something different here than in the first result above.

†1: Or perhaps it actually tries to access Y.com but returns instantly with no result. If so, I don't understand why it returns instantly instead of waiting (dozens of) seconds for a timeout.

Questions

  1. In this case, who interprets 0.0.0.0? A browser?

  2. Why does an indirect reference to 0.0.0.0 (Experiment 2) mean "this should be blocked" while a direct access to 0.0.0.0 (Experiment 1) means "this should be connected to the local website"?

0.0.0.0 - Wikipedia gives me a hint, but it doesn't explain in which context a certain meaning is chosen.


Environment:

Firefox 77.0.1 on Arch Linux


My Guess

After posting this question, I ran some tests and found the following:

Although many blog posts and answers on this website say 0.0.0.0 <domain> can be used to block <domain>, it does not actually block the domain; strictly speaking, it depends. Like any normal entry in /etc/hosts, 0.0.0.0 <domain> just turns an access to <domain> into an access to 0.0.0.0.
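The name-to-address step can be pictured with a small sketch. It is only an illustration of the hosts-file format, not the resolver's actual code: `parse_hosts`, `lookup`, and the sample text are made up for this example, and "first matching line wins" is a simplifying assumption.

```python
from typing import Dict, Optional


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each hostname or alias to the address on its first matching line."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Assumption for this sketch: the first entry for a name wins.
            table.setdefault(name.lower(), address)
    return table


def lookup(table: Dict[str, str], name: str) -> Optional[str]:
    """Return the address the hosts table gives for name, or None."""
    return table.get(name.lower())


SAMPLE = """\
127.0.0.1  localhost
0.0.0.0    Y.com   # the "blocking" entry
"""

table = parse_hosts(SAMPLE)
print(lookup(table, "Y.com"))  # 0.0.0.0
print(lookup(table, "X.com"))  # None (would fall through to DNS)
```

Nothing in this step says "block": the entry only changes which address the browser then tries to connect to.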

Because

  1. 0.0.0.0 is the same as localhost in this context†2, and

  2. an access to 0.0.0.0 is instantaneous†3,

0.0.0.0 <domain> effectively blocks access to <domain> as long as you are not running a webserver on the host.
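The "instantaneous" part can be checked for TCP with a sketch like the one below. It assumes Linux, where a connection to 0.0.0.0 is handled as a connection to the local host; on other systems the second attempt may report a different error. `free_port` and `try_connect` are helper names invented for this example.

```python
import socket
import time


def free_port() -> int:
    """Ask the OS for a currently unused TCP port, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def try_connect(host: str, port: int, timeout: float = 5.0):
    """Return (outcome, elapsed_seconds) for one TCP connection attempt."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except ConnectionRefusedError:
        outcome = "refused"  # nothing is listening on that port
    except OSError as exc:  # timeout, unreachable address, etc.
        outcome = f"error: {exc}"
    return outcome, time.monotonic() - start


if __name__ == "__main__":
    port = free_port()  # a port with no listener
    for host in ("127.0.0.1", "0.0.0.0"):
        outcome, elapsed = try_connect(host, port)
        print(f"{host}:{port} -> {outcome} in {elapsed * 1000:.1f} ms")
```

When nothing listens on the port, the local TCP stack answers right away with a refusal, so there is no timeout to wait for. That would be consistent with the browser showing the request as failed almost immediately.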

When you are running a webserver,

  1. An access to <domain>/<file> is effectively blocked if localhost/<file> doesn't exist. Note, however, since the webserver is accessed and returns 404, numerous accesses to <domain>/<file> may slow down your computer.

  2. An unexpected result is observed if localhost/<file> does exist. If you are lucky, it just breaks the layout of a website. But in general it can be very dangerous.

So my guess is that 0.0.0.0 <domain> is nothing more than a workaround; it works only in limited environments.

†2: I don't yet understand why. Suspected reason: What does Chrome/server do when I use 0.0.0.0 instead of localhost in browser?

†3: For example, ping -c 1 0.0.0.0 returns in a moment. I don't know why. (Perhaps just because an access to a local interface is very fast?)

ynn
  • Your guess is right: using 0.0.0.0 (=localhost) to block websites is just a trick that works for most *client* machines, as they most likely do not have a webserver listening on HTTP/HTTPS ports. You could use a nonexistent IP address as well, but using 0.0.0.0 is much faster because your computer already knows there is no web server listening, so the connection fails immediately (instead of having to wait for a timeout when using a nonexistent IP address). If you have a webserver on your computer, you can still use that trick by configuring it to reject requests for domains it does not recognize. – Tey' Mar 22 '21 at 16:11
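One way to picture the "reject requests for domains it does not recognize" suggestion is a server that checks the Host header before answering. This is a minimal sketch using Python's standard http.server, not a configuration for any particular webserver; `ALLOWED_HOSTS`, `is_allowed_host`, `HostCheckingHandler`, and `serve` are hypothetical names, and the choice of 403 as the rejection status is an assumption.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical allow-list: the only names this local server answers for.
ALLOWED_HOSTS = {"localhost", "127.0.0.1"}


def is_allowed_host(host_header, allowed=ALLOWED_HOSTS) -> bool:
    """True if the Host header (with any :port removed) is on the allow-list."""
    if not host_header:
        return False
    hostname = host_header.split(":", 1)[0]  # IPv4 names only, as in the question
    return hostname.strip().lower() in allowed


class HostCheckingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # A request for Y.com redirected here via /etc/hosts still says "Host: Y.com".
        if not is_allowed_host(self.headers.get("Host")):
            self.send_error(403, "Unknown host")
            return
        body = b"Hello World\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the sketch server on the loopback interface until interrupted."""
    HTTPServer(("127.0.0.1", port), HostCheckingHandler).serve_forever()
```

Calling `serve()` starts the server on localhost:8080. A request carrying `Host: Y.com` then gets a 403 instead of whatever localhost/<file> would have returned.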

1 Answer


I'm on Artix Linux here, and 0.0.0.0 is non-routable for me. The linked Wikipedia article says that 0.0.0.0 is a 'non-routable meta-address'. It sounds like something in your configuration is doing something (possibly) non-standard, so that a direct request for 0.0.0.0, or for a website bound to that IP in /etc/hosts, goes to localhost. That makes sense if it's a 'meta-address' even though it's 'non-routable': 'meta-address' implies some flexibility in what the address refers to, while 'non-routable' seems to be a very inflexible notion. But if you look at this SE post, it may actually be a little fuzzier than that:

https://networkengineering.stackexchange.com/questions/40328/what-is-a-routable-ip

I would guess that 'non-routable' in the general case just means you can't have a machine routing packets using that address. There's no reason why merely having it redirect to localhost should cause any problems. But as I said, on my Artix Linux machine it appears to just point to nothing, and this is probably the standard behavior.

EDIT: According to RFC 8190, 0.0.0.0 refers to "this network". The older RFC 6890 lists 0.0.0.0/8 as "This host on this network":

https://www.rfc-editor.org/rfc/rfc8190

https://www.rfc-editor.org/rfc/rfc6890

So it sounds like using 0.0.0.0 to refer to localhost is perfectly valid.
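How the address is classified can be checked with Python's standard ipaddress module. This is only a classification sketch: it shows which special-purpose block 0.0.0.0 falls into and says nothing about how a particular kernel handles a connection to it.

```python
import ipaddress

addr = ipaddress.ip_address("0.0.0.0")
this_network = ipaddress.ip_network("0.0.0.0/8")  # block listed in RFC 6890
loopback = ipaddress.ip_network("127.0.0.0/8")    # where localhost lives

print(addr.is_unspecified)   # True
print(addr.is_loopback)      # False
print(addr in this_network)  # True
print(addr in loopback)      # False
```

So 0.0.0.0 is not itself a loopback address. Treating it as the local host when it is used as a destination is a behavior of the operating system.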

JMW
  • Are you running a webserver on the host? – ynn Jun 21 '20 at 17:40
  • No, I'm not running a webserver. – JMW Jun 21 '20 at 19:13
  • If so, your environment doesn't match the one used in the OP. What happens if you execute `mkdir test && cd test && echo "Hello World" > index.html && sudo python -m http.server 80 --bind 127.0.0.1` and access `http://0.0.0.0/` in a browser (e.g. Firefox)? If it's not routable, then yes, some of my settings may be wrong. (If you don't know the meaning of the python command, please see [*http.server - Python 3 Documentation*](https://docs.python.org/3/library/http.server.html).) – ynn Jun 22 '20 at 04:07
  • Yep, it works. And I found some more information - 0.0.0.0 refers to "local network" according to this document: https://tools.ietf.org/html/rfc8190 – JMW Jun 22 '20 at 22:16