99

Can I use wget to check for a 404 and not actually download the resource? If so, how? Thanks

mike628
  • 45,873
  • 18
  • 40
  • 57
  • possible duplicate of [Only create file if http status 200 with wget?](http://stackoverflow.com/questions/649647/only-create-file-if-http-status-200-with-wget) – Joris Meys Jun 07 '11 at 12:10

6 Answers

145

There is a command-line parameter, `--spider`, exactly for this. In this mode, wget does not download the file, and its exit status is zero if the resource was found and non-zero if it was not. Try this (in your favorite shell):

wget -q --spider address
echo $?

Or, if you want the full output, leave the `-q` off and just run `wget --spider address`. `-nv` shows some output, but not as much as the default.
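
For example, a minimal sketch of acting on that exit status in a script (the URL is just a placeholder):

if wget -q --spider "http://example.com/some/page"; then
    echo "resource found"
else
    echo "resource not found (404) or some other error"
fi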

Shadikka
  • 4,116
  • 2
  • 16
  • 19
  • 36
    Note that `wget --spider` sends a HEAD request, not a GET. – hammar Jun 07 '11 at 12:18
  • 4
    @hammar, I'm not sure what version you might have been using, but with 1.14, `wget --spider` does a HEAD and, if successful, follows with a GET to the same URL. Thus, with the recursive option, it's useful for building the cache for a server-side website. – danorton Jun 27 '14 at 01:42
33

If you want to check quietly via `$?` without the hassle of grepping wget's output, you can use:

wget -q "http://blah.meh.com/my/path" -O /dev/null

This works even on URLs with just a path, but it has the disadvantage that the resource is actually downloaded, so it is not recommended when checking large files for existence.
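
If you want to use the result in a script rather than inspect `$?` by hand, a small sketch of the same idea would be:

if wget -q "http://blah.meh.com/my/path" -O /dev/null; then
    echo "exists"
else
    echo "missing or unreachable"
fi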

3ronco
  • 552
  • 9
  • 12
  • The `--spider` arg *does* set a return code. But maybe that's because after 4 years 3 months and 7 days, the spider has got smarter. – John Red Jan 06 '17 at 10:37
  • Haven't checked it recently, but it wouldn't surprise me if they fixed it in the meantime. – 3ronco Jan 16 '17 at 18:36
18

You can use the following option to check whether the file exists:

wget --delete-after URL
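
Note that `--delete-after` still downloads the file and then removes it, so in spirit it is like the `-O /dev/null` approach above; a minimal sketch (placeholder URL), with the exit status still telling you whether the resource was there:

wget -q --delete-after "http://example.com/some/file"
echo $?
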
Adiii
  • 54,482
  • 7
  • 145
  • 148
Parikshit
  • 543
  • 1
  • 7
  • 16
18

Yes, it's easy:

wget --spider www.bluespark.co.nz

That will give you

Resolving www.bluespark.co.nz... 210.48.79.121
Connecting to www.bluespark.co.nz[210.48.79.121]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
200 OK
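
If you want to act on that output from a script, one rough sketch is to grep it (wget writes this log output to stderr):

wget --spider www.bluespark.co.nz 2>&1 | grep -q "200 OK" && echo "got a 200" || echo "did not get a 200"
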
John Ballinger
  • 7,380
  • 5
  • 41
  • 51
1

Yes, to use wget to check, but not download, the target URL/file, just run:

wget --spider -S www.example.com
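
`-S` (`--server-response`) prints the HTTP response headers, so a sketch for pulling out just the status code (the grep/awk part is only an illustration) could be:

wget --spider -S www.example.com 2>&1 | grep "HTTP/" | awk '{print $2}'
# may print more than one status code if there are redirects, or a HEAD followed by a GET
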
O.Caliari
  • 313
  • 2
  • 5
-6

If you are in a directory where only root has write access, you can run wget www.example.com/wget-test directly as a standard user. It will hit the URL, but because there is no write permission the file won't be saved. This method works fine for me, as I use it in a cron job. Thanks.
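
A minimal sketch of that idea, assuming /path/only-root-can-write is a placeholder for a directory the current user cannot write to:

cd /path/only-root-can-write     # assumption: the current user has no write permission here
wget www.example.com/wget-test   # the URL is requested, but wget cannot save the file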


Admin Hack
  • 67
  • 2
  • 4
  • 1
    Shouldn't be used... Risky, because permissions set by the system admin can change and break your intention, and useless when there's a built-in flag like `--spider` which does exactly what the OP asks for. – LukeSavefrogs Jun 17 '19 at 09:35