
I have a script for testing the status of a site, to then run a command if it is offline. However, I've since realised that because the site is proxied through Cloudflare, it always shows a 200 status, even if the site is offline. So I need to come up with another approach. I tried testing the site using curl and HEAD. Both get the wrong response (from Cloudflare).

What I have found is that the HTTPie command gets the response I need, although only when I use the `-h` option (I have no idea why that makes a difference, since visually the output looks identical to when I don't use `-h`).

Assuming this is an okay way to go about reaching my aim ... I'd like to know how I can test if a certain string appears more than 0 times.

The string is `location: https:///` (with three forward slashes).

The command I use to get the header info from the actual site (and not simply from what Cloudflare is dishing up) is `http -h https://example.com/`.

I am able to test for the string using `http -h https://example.com | grep -c 'location: https:///'`. This will output 1 when the string exists.
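For illustration, here is roughly what that count test looks like, using a simulated copy of the headers standing in for the live output of `http -h`:

```shell
# Simulated headers standing in for the output of `http -h https://example.com/`
headers='HTTP/2 301
server: nginx
location: https:///'

# grep -c counts matching lines; it reads standard input when no file is given
count=$(printf '%s\n' "$headers" | grep -c 'location: https:///')
echo "matches: $count"    # prints "matches: 1"
```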

What I now want to do is run a command if the output is 1. But this is where I need help. My bash skills are minimal, and I am going about it the wrong way. What I came up with (which doesn't work) is:

#!/bin/bash
STR=$(http -h https://example.com/)

if (( $(grep -c 'location: https:///' $STR) != 1 )); then
        echo "Site is UP"
        exit
else
        echo "Site is DOWN"
        
        sudo wo clean --all && sudo wo stack reload --all
fi

Please explain to me why it's not working, and how to do this correctly.

Thank you.

ADDITIONS:

What the script is testing for is an odd situation in which the site suddenly starts redirecting to, literally, https:///. This obviously causes the site to be down. Safari, for instance, takes this as a redirection to localhost. Chrome simply spits the dummy with a redirect error, ERR_INVALID_REDIRECT.

When this is occurring, the headers from the site are:

HTTP/2 301
server: nginx
date: Thu, 12 May 2022 10:19:58 GMT
content-type: text/html; charset=UTF-8
content-length: 0
location: https:///
x-redirect-by: WordPress
x-powered-by: WordOps
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
x-content-type-options: nosniff
referrer-policy: no-referrer, strict-origin-when-cross-origin
x-download-options: noopen
x-srcache-fetch-status: HIT
x-srcache-store-status: BYPASS

I chose to test for the string `location: https:///` since that's the most specific (and unique) to this issue. I could also test for `HTTP/2 301`.

The intention of the script is to remedy the problem when it occurs, as a temporary solution whilst I figure out what's causing Wordpress to generate such an odd redirect. Also in case it happens whilst I am not at work, or sleeping. :-) I will have a cron job running the script every 5 mins, so at least the site is never down for longer than that.
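For reference, the cron entry for that could look something like the following (the script path and log file are hypothetical; `*/5` runs it every 5 minutes):

```
*/5 * * * * /usr/local/bin/check-site.sh >> /var/log/check-site.log 2>&1
```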

Donald Duck
inspirednz
  • `grep` takes filenames as arguments (i.e. the names of files to search), not raw data. See ["Pass the value of a variable to a command as if it were stored in a file"](https://stackoverflow.com/questions/68875336) for how to do this correctly. Also, don't bother using `grep -c` and then checking the number of matches, just use `grep -q` and check its exit status; see ["How to grep and then fail an if-statement on specific output from grep?"](https://stackoverflow.com/questions/5227428) – Gordon Davisson May 17 '22 at 09:34

1 Answer

`grep` takes file names as arguments (or reads standard input), not a string of data. Also, you need to quote strings, especially if they might contain whitespace or shell metacharacters.

More tangentially, `grep -q` is the usual way to check if a string occurs at least once. Perhaps see also Why is testing "$?" to see if a command succeeded or not, an anti-pattern?

I can see no reason to save the string in a variable which you only examine once; though if you want to (for debugging reasons etc) probably avoid upper case variables. See also Correct Bash and shell script variable capitalization

The parts which should happen unconditionally should be outside the condition, rather than repeated in both branches.

Nothing here is Bash-specific, so I changed the shebang to use sh instead, which is more portable and sometimes faster. Perhaps see also Difference between sh and bash

#!/bin/sh

if http -h https://example.com/ | grep -q 'location: https:///'
then
    echo "Site is DOWN"
else
    echo "Site is UP"
fi

sudo wo clean --all && sudo wo stack reload --all

For basic diagnostics, probably try http://shellcheck.net/ before asking for human assistance.
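As an aside on testing the `grep -q` pipeline interactively: it deliberately prints nothing, but its exit status (available in `$?` immediately afterwards) carries the result. A minimal sketch, using a simulated header line rather than the live site:

```shell
# grep -q prints nothing; its exit status carries the result.
printf 'location: https:///\n' | grep -q 'location: https:///'
echo "exit status: $?"    # 0 = found, 1 = not found
```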

tripleee
  • Thanks for this information. Regarding, `http -h https://example.com/ | grep -q 'location: https:///'` it's hard for me to test this on the command line, as it gives no output either way. Is there something I can add to it for testing purposes, so it has output telling me whether the string was found or not? – inspirednz May 18 '22 at 02:57
  • Also, the `wo ... ` command wasn't meant to be in both conditions. I've corrected the code I had in my example. RE: "The parts which should happen unconditionally should be outside the condition, rather than repeated in both branches." – inspirednz May 18 '22 at 03:45
  • I've tested the script, and it seems to go to "site is down" under all circumstances. Whether the string exists or not. – inspirednz May 18 '22 at 03:47
  • You can remove the `-q` flag to see what is being matched. The exit code from `grep` will always be available in `$?` for inspection (I have a `PS1` prompt which routinely includes the exit status of the latest executed command). – tripleee May 18 '22 at 04:00
  • Without access to the actual URL there is no way for us to test this either. I am somewhat skeptical that checking for three slashes in the Location: header is a reliable indicator of "upness" but we only have your word to go on. – tripleee May 18 '22 at 04:33
  • I would expect most web servers to put the Location: header in proper case; perhaps the problem is as simple as that. – tripleee May 18 '22 at 04:36
  • Also, your question says `redirect:` in one place but `location:` in another. If you meant `redirect:` then obviously put that instead. – tripleee May 18 '22 at 04:37
  • It should have said `location` not `redirect`. Corrected now. I've also added some additional explanation. I can confirm the string is in lowercase. Providing the actual URL won't be useful, because when the issue is occurring, the site goes down, and I need to get it back up ASAP. I hope this helps clarify the concerns you've raised. — I'll look into how I can get my prompt to display the exit codes. Thanks for that tip. – inspirednz May 18 '22 at 05:07
  • I created a fresh Ubuntu Docker instance and installed `httpie` and `netcat`. I'm getting "unknown protocol error" from `http -h` for the headers you provided. `http: error: ConnectionError: ('Connection aborted.', UnknownProtocol('HTTP/2')) while doing GET request to URL: http://localhost:8888/` Of course the `grep` (or anything else) on the result will not work then because it does not provide any output at all. – tripleee May 18 '22 at 05:32
  • Thanks for going to such lengths to test it out. The headers I provided are exactly what `http -h` output (when the site is down), and a different set of headers when the site is up. I simply copy / pasted them from my terminal after running that command. I should add, the reason I decided to assign the output to a string was mostly so I could output it in an `echo` for debug purposes. It for sure outputs the sample I added to my question. – inspirednz May 18 '22 at 05:37
  • Which platform are you on? The documentation I'm looking at says `http -h` should just display the help message. – tripleee May 18 '22 at 05:53
  • `http -h https://url/ | tee /dev/stderr | grep` would let you see what's being piped to `grep` – tripleee May 18 '22 at 05:53
  • `curl -i https://example.com` also produced the same output. But I ran into an issue with trying to work with the output (I forget what ... but it led me to using http (httpie) ... where I again ran into an issue ... but then `http -h` did the trick. I'm on Ubuntu 20.04. `-h` was to `Print only the response headers`. I have no idea why that made a difference in my ability to grep it, etc. From httpie help ... `--headers, -h Print only the response headers. Shortcut for --print=h.` – inspirednz May 18 '22 at 05:59
  • Okay. I see why I used `http -h`. Without it, I get more than just the headers ... so that simply kept the output limited to what I actually needed to work with. – inspirednz May 18 '22 at 06:04
  • My Docker image is Ubuntu 20.04 and the package I installed was named `httpie`. I can't repro with the details you have provided. (The `-h` option appears to work as you describe, as such.) – tripleee May 18 '22 at 06:08
  • Sounds like we're using the same environment then. If `-h` doesn't work on your instance, try `--headers`, which is what my system indicates it's an abbreviation of. Or even `--print=h` – inspirednz May 18 '22 at 06:10
  • It does work, but the headers you provided cause the client to print an error and no output. – tripleee May 18 '22 at 06:12