I have a JSON file and I want to grep the website URL (http://mywebsite.com) out of it. How do I do that in a shell script?

P.S.: I know there are tools like jq that would make this easier, but I want to do it using the sed/awk/grep utilities.

E.g. test.json:

{
  "name"       : "xyz", 
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com" 
}

So far I have tried:

cat test.json | grep -i website* | cut -d ':' -f2

Output:

"http

But when I run the command above, cut also splits on the colon (:) between http and the double slash (//), which I don't want. I want the whole URL to be stored in a variable.

skyrocker
  • You don't want to use sed/awk/grep to parse json. You *do* in fact want to use jq. While you may indeed find people willing to provide answers that help you do this the wrong way, it won't make it any less the wrong way to do this. – ghoti May 30 '17 at 02:01
  • Oddly, the question that this is a duplicate of has almost identical JSON. I wonder if they're part of the same course. @skyrocker, can you tell us where this came from? – ghoti May 30 '17 at 02:05
  • @ghoti You are correct. I was referring to the same post you mentioned in your comment, as it was the first result of my Google search. – skyrocker May 30 '17 at 02:21
  • @ghoti However, although the JSON may be similar, my question is different from https://stackoverflow.com/questions/38364261/parse-json-to-array-in-shell-script. – skyrocker May 30 '17 at 02:23

3 Answers

Well, if you are going to do it wrong (like not using jq), at least do it less wrong:

awk '/website/ {gsub("\"", "", $3); print $3}' test.json

Explanation

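awk splits each input line into whitespace-separated fields, so on the line matching website, $1 is "websiteurl", $2 is the lone colon, and $3 (fields are 1-based) is the quoted URL. gsub then removes the double quotes from $3 (if present), and the result is printed.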
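
To see which token lands in which field, and to store the result in a variable as the question asks, here is a minimal sketch. It assumes the exact layout from the question (whitespace on both sides of the colon, one key per line) and recreates test.json so it can be run on its own; the variable name url is only illustrative.

```shell
# Recreate the sample file from the question so the snippet is self-contained.
cat > test.json <<'EOF'
{
  "name"       : "xyz",
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}
EOF

# Print every whitespace-separated field of the matching line with its number.
awk '/website/ {for (i = 1; i <= NF; i++) printf "$%d = %s\n", i, $i}' test.json

# Strip the double quotes from the third field and capture it in a variable.
url=$(awk '/website/ {gsub("\"", "", $3); print $3}' test.json)
printf '%s\n' "$url"
```

If the file were written without spaces around the colon, the whole line would be a single field and $3 would be empty, which is one reason this approach is only "less wrong".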

Diego Torres Milano
  • Can you please explain how $3 works in the above example? – skyrocker May 30 '17 at 02:33
  • There are three whitespace-separated tokens in your example (the second is the lone `:`); this takes the third one, and discards double quotes around (and actually also within) it. – tripleee May 30 '17 at 04:01

If jq were an option, the solution would be as simple as:

$ jq .websiteurl < example.json
"http://mywebsite.com"

If jq cannot be made available in your environment, and you want a solution in bash alone, JSON.sh should do the trick:

$ curl -s -O https://raw.githubusercontent.com/dominictarr/JSON.sh/master/JSON.sh
$ declare -A result=()
$ while IFS=$'\t' read -r key value; do eval result$key="$value"; done < <(sh JSON.sh -n < example.json)
$ declare -p result
declare -A result=([websiteurl]="http://mywebsite.com" [name]="xyz" [age]="25" )
$ printf '%s\n' "${result["websiteurl"]}"
http://mywebsite.com

This isn't particularly good, but it worked in the test I just did. The usage above will fail if $value (the data in any part of your JSON input) contains a tab.

JSON.sh should work in any POSIX shell, including bash, and contains no external dependencies.

Also note that declare -A (associative arrays) requires bash version 4 or above.

ghoti

Why don't you use the quotation mark as the awk field separator?

Example:

cat test.json | grep -i website* | awk -F '"' '{print $4}'

This should work.
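
For comparison, the same idea can probably be written as a single awk call, without cat or grep. This is only a sketch: it assumes the key and its value sit on one line and that neither contains an escaped double quote, and it recreates test.json so it runs on its own. With " as the separator, $2 is the key and $4 is the value; tolower stands in for the case-insensitive match that grep -i provided. The variable name url is illustrative.

```shell
# Recreate the sample file from the question so the snippet is self-contained.
cat > test.json <<'EOF'
{
  "name"       : "xyz",
  "age"        : "25",
  "websiteurl" : "http://mywebsite.com"
}
EOF

# Split each line on double quotes: $2 is the key, $4 is the value.
# Compare the lowercased key exactly, then print the value.
url=$(awk -F '"' 'tolower($2) == "websiteurl" {print $4}' test.json)
printf '%s\n' "$url"
```

Matching the key exactly, rather than searching the line for a pattern, also avoids picking up a line where the word website only appears inside a value.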

Samy
  • Though you probably don't actually want to search for `websit`, which is what this regex effectively does; and you could inline the `grep` into the Awk script (though lowercasing in Awk is quite a bit more verbose). – tripleee May 30 '17 at 03:58
  • ... And of course, the [`cat` is useless.](http://www.iki.fi/era/unix/award.html) – tripleee May 30 '17 at 04:04
  • You may be correct if we want to optimize the one-liner as much as possible, but I think this way is more readable and traceable for beginners. Thanks for the explanation anyway. – Samy May 30 '17 at 04:21