
The problem:

I want to make sure that a certain string appears in a file and also that another string does not appear in the file. If both conditions are met, the command(s) should generate some output.

Here's what I started with:

I have a cronjob that periodically downloads a web page with curl. I wanted to be notified whenever a certain string ("inStock':'True") appears on one line of that web page. This part was easy and works well. Here's the cronjob I used:

curl --silent --cookie "myStore=true; storeSelected=131; ipp=25; SortBy=match; rearview=501552" http://www.microcenter.com/product/501552/AIY_VISION_KIT | grep "inStock':'True"

Because this runs as a cronjob, whenever "grep" produces any output (such as "'inStock':'True',"), I will receive an email.
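
For reference, the crontab entry looks roughly like this (the schedule and the MAILTO address below are placeholders, not my actual setup); cron mails whatever the command writes to stdout:

MAILTO=you@example.com
*/15 * * * * curl --silent --cookie "myStore=true; storeSelected=131; ipp=25; SortBy=match; rearview=501552" http://www.microcenter.com/product/501552/AIY_VISION_KIT | grep "inStock':'True"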

Now another issue came up: If the text ("This product is no longer available") appears on another line of the web page, I don't want to be notified after all.

Any good solutions? It doesn't have to be grep; awk or perl would also be fine.

Here is an example file example.txt that we can use instead of depending on the particular webpage and curl:

This product is no longer available
'inStock':'True',

So if I run

cat example.txt | grep "inStock':'True"

it will output

'inStock':'True',

no matter what other lines are in the file. What I want is a command (or multiple commands) that produce no output if another line in the file contains the text "This product is no longer available".

neuhaus
  • Please be more clear in your question, as it is not clear at all. – RavinderSingh13 Jan 29 '18 at 15:05
  • See [ask], then update your question to include the missing [mcve] so people can help you. – Ed Morton Jan 29 '18 at 15:05
  • I changed the question to contain my concrete problem. And I have a sample solution that isn't very good. – neuhaus Jan 29 '18 at 15:07
  • You still haven't provided the missing sample input and expected output though. You're asking for a tool whose input is the output from curl and whose output is **some output** to go in an email. So show us a concise, testable example with a sample of the output from curl (i.e. the input to the tool you want to write) and the output you'd want the tool to produce given that input. – Ed Morton Jan 29 '18 at 15:14
  • As soon as the oneliner used in the crontab creates *any* output, cron will send an email. That's what I want. But only if both conditions match (one string is present, the other one is not present). – neuhaus Jan 29 '18 at 15:19
  • Yes, you have said that so it's clear you want to read some input and generate some output, but you're asking us to help you parse text that you haven't shown us. Again - read [ask] and in particular the part about providing a [mcve]. – Ed Morton Jan 29 '18 at 15:23
  • You can run the provided example. It will detect string X and produce output. It does not currently deal with the second condition ("does not contain string Y"). It will produce output regardless. That's the part I needed help with. – neuhaus Jan 29 '18 at 15:25
  • Alternatively - **you** can run the provided example and include the relevant output in your question. Guess which assumed approach is more likely to result in you getting an answer :-). Also running one command won't produce the different combinations of inputs that you care about - with/without "inStock", with/without "no longer available", etc. Put a little effort into asking the question and you'll get far more people willing to help you - assume our time is very limited so the more time YOU spend on asking the question the better for you. Last time - see [ask]. – Ed Morton Jan 29 '18 at 15:30
  • @EdMorton OK, I have provided an example file instead of relying on curl. – neuhaus Jan 29 '18 at 17:02

1 Answer


I came up with this awk script; I pipe the web page into it with curl. It's kinda ugly, so I hope I get a better answer from someone else.

So I want the string "no longer available" NOT to be present; however, I do want the line containing "inStock" to be present. I don't know in what order they will appear in the file.

Here is the script:

awk '/no longer available/ { a=1 } /inStock/ { b=1} END{ if(!a && b) { print("conditions matched")} }'
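
Spelled out with comments it reads as follows (same logic as the one-liner; example.txt stands in for the curl output):

awk '
/no longer available/ { a = 1 }   # remember that the "unavailable" text was seen
/inStock/             { b = 1 }   # remember that the "inStock" text was seen
END {
    # print (and thereby trigger the cron mail) only if "inStock" was found
    # and "no longer available" was not
    if (!a && b) print "conditions matched"
}' example.txt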

I guess using a multiline grep match would also be an option. It might use a lot of memory, and it would also be complicated because I don't know the order in which the strings will appear in the web page.
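
For completeness, here is a sketch of a plain grep alternative (not a multiline match, just two passes over the file combined in the shell; example.txt again stands in for the saved curl output):

grep -q "inStock':'True" example.txt && ! grep -q 'no longer available' example.txt && echo "conditions matched"

Because this reads the input twice, the curl output would have to be saved to a file first rather than piped in directly.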

neuhaus
  • What you have is the right approach. You could add `; exit` after `a=1` for efficiency if that's an issue. – Ed Morton Jan 29 '18 at 17:06
  • Good idea. It will still execute the END block but it no longer has to process the rest of the input. – neuhaus Jan 29 '18 at 17:07
  • Right. You mentioned the possibility of doing a multiline grep match - there is no such thing and while you CAN do it with awk in various ways you're right it could use a lot of memory and there's really no point when what you already have is simple, efficient, portable, etc. – Ed Morton Jan 29 '18 at 17:11
  • [Here](https://stackoverflow.com/questions/3717772/regex-grep-for-multi-line-search-needed) is an example for a multiline grep. It won't work with all versions of grep but gnu grep (which is what I use) supports it and is widely available. – neuhaus Jan 29 '18 at 17:12
  • Ah, a GNU-ism. I don't know who is hacking away at GNU grep but I wish they'd get their act together - that's completely useless and then they throw in -P and just declare it to be "highly experimental" (seriously - see the man page) so when it core dumps they can just say "ah well" and then they add options to find files when there's a perfectly good tool named "find" for doing that. They've turned GNU grep into just a big, convoluted mush of nonsense completely contravening the UNIX approach of every tool doing 1 thing well. – Ed Morton Jan 29 '18 at 17:17
  • `-z` is the option that enables multiline. I consider it to be quite useful at times. YMMV. – neuhaus Jan 29 '18 at 17:18
  • Yeah, I get it but that assumes there's no NUL chars in the input file, forces your output to be NUL-terminated which makes it non-POSIX (and so trying to parse it with any subsequent tool is relying on undefined behavior) and it forces you to read the whole file at one time. Awk has much better ways of dealing with multi-line searches so introducing that to GNU grep is just giving people a bad way to do something when a good way to do it already exists. – Ed Morton Jan 29 '18 at 18:03