
I am using GeekTool (a desktop widget app for Mac that runs bash commands) to try to display text from a website. I have been trying to curl the site and then grep the text, but I am finding it more difficult than I imagined. Just looking for some help.

HTML:

     <div is>
    <div class="page-status status-none">
      <span class="status font-large">
        All Systems Operational
      </span>
      <span class="last-updated-stamp  font-small"></span>
    </div>

Above is the markup that is returned when I curl the site. I just need to display the text "All Systems Operational".

Thanks in advance for your assistance.

3 Answers

1

Getting in the habit of using regular expressions with HTML is a slippery slope; it's not the right tool for the job, as mentioned here. I'd suggest either of two command-line tools (one is hxselect, used in the example below), both of which let you use CSS3 selectors to target content in your input.

for example:

curl -s "$website_url" | hxnormalize -x | hxselect -c '.status.font-large'
All Systems Operational
Community
user3276552
0

You could pipe the output of curl into gawk. This gawk command seems to do the trick (I'm using Cygwin's gawk on Windows):

gawk "/status font-large/ {wantedLine=NR+1} {if (NR==wantedLine) {print $0}}"

BillS
  • And if you want leading/trailing spaces removed from the displayed line you can add $1=$1; before the if: "/status font-large/ {wantedLine = NR + 1} {$1=$1; if (NR == wantedLine) {print $0}}" – BillS Oct 06 '15 at 23:28
  • Thanks for your help. I think gawk will get me what I need. However, when I use it, it doesn't return anything. Not even an error. If I change it to awk it will display a 0. – rubberfishstudios Oct 07 '15 at 20:28
  • I see that wantedLine is a variable but I don't see how it is being defined. Do I need to define /status font-large/ as the variable? – rubberfishstudios Oct 07 '15 at 20:29
  • Yes, wantedLine is a variable. In awk there's no need to define/declare variables. If referenced before they exist, they return the value 0 or the empty string depending on their usage. Perhaps on a Mac you need to use ' instead of " around the program text? I'd try that first. Do any of those other characters have special meaning to a Mac command line? – BillS Oct 09 '15 at 02:30
  • Correct me if I am wrong, but the code appears to break down as follows: status font-large = NR+1, which means that the row number that status font-large is found on is equal to the NR, or entire text, plus 1 line. If NR == status font-large, which it won't be because it is on line 359 of 1400 lines, then print. – rubberfishstudios Oct 09 '15 at 21:05
  • Using "Control Statements in AWK" on this site as a reference (https://developer.apple.com/library/mac/documentation/OpenSource/Conceptual/ShellScripting/Howawk-ward/Howawk-ward.html), I have created this code: curl $website_url | awk {if ($0 ~ /status font-large/) {print $0;} else {print "NOMATCH: ";}} I have tried with and without quotes. It produces this error: -bash: syntax error near unexpected token `(' – rubberfishstudios Oct 09 '15 at 21:20
  • NR always represents the current record (line) number. As awk processes this script, for each line it: (1) searches for "status font-large". When it finds a line which contains that, it sets wantedLine to NR + 1. (2) It checks to see if the current line (NR) is equal to wantedLine. If it is, it prints the current line. – BillS Oct 09 '15 at 22:59
  • Try a test. Does the following print all lines from the file from the site? curl $website_url | awk '{print $0}' – BillS Oct 09 '15 at 23:04
  • Thank you for your continued support BillS. When I tried the test it repeated a 0 on every line. Since I know that the text is on line 593 I placed $593 in there and it repeated 93 on every line. – rubberfishstudios Oct 11 '15 at 15:08
  • Let's try putting the script into a file instead of specifying it on the command line. Put "{print $0}" -- excluding the quotes -- into a file. Let's call it test.awk. Then the awk portion of the command line would be "awk -f test.awk" (again, exclude the quotes). Does it still print many lines containing nothing but 0? – BillS Oct 12 '15 at 13:36
  • In the test, it produced the error `0curl: (23) Failed writing body (0 != 1448)`. It didn't produce anything in the bash widget. I also noticed that, along with the repeating zeros, there is a block of text halfway down: -0 -0 100 -0 56836 1-0 00 5-0 6836-0 0 -0 0 -0 4-0 03-0 k-0 -0 -0 -0 0 --:---0 :-- --:-0 --:-0 -- --:-0 --:---0 408-0 k -0 -0 -0 – rubberfishstudios Oct 12 '15 at 17:40
  • OK, I got the test to work: `curl $Website_URL | awk "{print $0}"`. It was slanting my quotation marks and I needed them to be straight. However, it is still repeating a 0 on each line. – rubberfishstudios Oct 12 '15 at 20:58
  • Something is wrong/broken. Not supposed to work that way. Sorry I can't be of more help. – BillS Oct 13 '15 at 00:26
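The symptoms reported in this thread (a 0 on every line, and 93 when $593 is used) are consistent with bash expanding $0 and $5 inside the double quotes before awk ever sees the program, so awk receives something like {print -bash} or {print 93} instead of {print $0}. Below is a minimal sketch of the same awk approach with single quotes, which bash leaves untouched. The sample HTML is copied from the question, and extract_status is a hypothetical helper name used only for illustration; this has not been run against the real site.

```shell
# Sample input mirroring the HTML in the question (the real page would come from curl).
html='<div class="page-status status-none">
  <span class="status font-large">
    All Systems Operational
  </span>
</div>'

# extract_status is a hypothetical helper name for this sketch.
# Single quotes stop the shell from expanding $0 and $1, so awk gets them intact.
extract_status() {
  awk '/status font-large/ { wanted = NR + 1 }     # remember the line after the match
       NR == wanted        { $1 = $1; print }'     # $1=$1 squeezes surrounding whitespace
}

printf '%s\n' "$html" | extract_status
```

Under the same assumptions, the live version would be curl -s "$website_url" | extract_status, where -s also keeps curl's progress meter out of the widget output.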
0

I was able to parse out the status that I was looking for by using Nokogiri.

curl -s $Website_URL | nokogiri -e 'puts $_.at_css("span.status").text'