How to extract a value from json file in unix?

Question

I have the below json content in my sample file:

{
    "listingRequest": {
        "id": "016a1050-82dc-1262-cc9b-4baf3e0b7123",
        "uri": "http://localhost:9090/nifi-api/flowfile-queues/016a104a-82dc-1262-7d78-d84a704abfbf/listing-requests/016a1050-82dc-1262-cc9b-4baf3e0b7123",
        "submissionTime": "04/28/2019 19:40:58.593 UTC",
        "lastUpdated": "19:40:58 UTC",
        "percentCompleted": 0,
        "finished": false,
        "maxResults": 100,
        "state": "Waiting for other queue requests to complete",
        "queueSize": {
            "byteCount": 480,
            "objectCount": 20
        },
        "sourceRunning": false,
        "destinationRunning": false
    }
}

I want to retrieve the value of the byte count i.e. byteCount. The result should be 480.

Installing other tools such as jq is not allowed in our organization due to restrictions.

How do I do it with sed/grep? I tried grep -Po '"byteCount":.*?[^\\]",' but did not get any output.

FYI: as we all know, `jq` is the correct tool for JSON processing, but since the OP says it is not allowed in this case, I have added the tags back to the post. — RavinderSingh13, Apr 29 '19 at 14:12
`sed -n 's/.*"byteCount": \([0-9]*\).*/\1/p' file` might work, but you should do this using a tool which is capable of parsing json — oguz ismail, Apr 29 '19 at 14:15
Just because `jq` can't be used doesn't make `sed` or `awk` any more appropriate. — chepner, Apr 29 '19 at 14:19
@chepner if she can't install `jq` then she's probably restricted to standard UNIX tools so then what would she use for this if not sed or awk? — Ed Morton, Apr 29 '19 at 16:53
You were close with grep, try `grep -Po '"byteCount": \K.+(?=,)'` — nbari, Apr 29 '19 at 17:01
You start making the case that you *need* access to something that correctly parses JSON. "Standard" doesn't mean "you'll never need anything else". It's by definition a lowest common denominator. — chepner, Apr 29 '19 at 17:02
Yeah, but for some of us that's all we have or will get to work with, so we just have to make do. Fortunately, when parsing "json" or "html" or "csv" or any other format, it's usually one specific layout of text we need to parse, often generated by some other software we also own. The workaround code we need to write then only has to be able to parse THAT and not the full language with all its possible twists and turns, so it's usually a pretty simple task. — Ed Morton, Apr 29 '19 at 17:12

Ed Morton · Accepted Answer · 2019-04-29T17:52:44.623

$ sed -n 's/.*"byteCount": *\([0-9]*\).*/\1/p' file
480
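In a script you would usually want the value in a variable rather than on the terminal. A minimal sketch of that, assuming a throwaway sample file created with mktemp (the variable names `sample` and `byte_count` are illustrative, not part of the original answer):

```shell
# Hypothetical sample file holding only the relevant fragment of the JSON.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
    "listingRequest": {
        "queueSize": {
            "byteCount": 480,
            "objectCount": 20
        }
    }
}
EOF

# Capture the digits that follow "byteCount": into a variable.
byte_count=$(sed -n 's/.*"byteCount": *\([0-9]*\).*/\1/p' "$sample")

# Report a missing key instead of silently carrying on with an empty value.
if [ -z "$byte_count" ]; then
    echo "byteCount not found in $sample" >&2
else
    echo "$byte_count"
fi

rm -f "$sample"
```

This relies on the key and its value sitting on the same line, as they do in the pretty-printed sample. If the file contained more than one "byteCount" key, sed would print one line per match.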

More generally you could use this (using any POSIX awk) to convert your specific format of JSON to a flat file and then print whatever you want by its tag hierarchy:

$ cat tst.awk
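# Trim leading and trailing whitespace from every line.
{ gsub(/^[[:space:]]+|[[:space:]]+$/,"") }

# If the line starts with a quoted key, save it (without the quotes)
# and remove it from the line.
match($0,/^"[^"]+"/) {
    subTag = substr($0,RSTART+1,RLENGTH-2)
    $0 = substr($0,RSTART+RLENGTH)
}

# Skip blank lines and lines that start with an opening brace.
!NF || /^{/ { next }

# ": {" opens a nested object: append the key to the dotted prefix.
/^:[[:space:]]*{/ {
    preTag = (preTag=="" ? "" : preTag ".") subTag
    next
}

# "}" closes an object: drop the last dotted component of the prefix, if any.
/^}/ {
    sub(/\.[^.]+$/,"",preTag)
    next
}

# Anything else is a key/value line: strip the leading colon and the
# trailing comma, then print it as prefix.key=value.
{
    gsub(/^[[:space:]]*:[[:space:]]*|[[:space:]]*,[[:space:]]*$/,"")
    tag = preTag "." subTag
    val = $0
    printf "%s=%s\n", tag, val
}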

$ awk -f tst.awk file
listingRequest.id="016a1050-82dc-1262-cc9b-4baf3e0b7123"
listingRequest.uri="http://localhost:9090/nifi-api/flowfile-queues/016a104a-82dc-1262-7d78-d84a704abfbf/listing-requests/016a1050-82dc-1262-cc9b-4baf3e0b7123"
listingRequest.submissionTime="04/28/2019 19:40:58.593 UTC"
listingRequest.lastUpdated="19:40:58 UTC"
listingRequest.percentCompleted=0
listingRequest.finished=false
listingRequest.maxResults=100
listingRequest.state="Waiting for other queue requests to complete"
listingRequest.queueSize.byteCount=480
listingRequest.queueSize.objectCount=20
listingRequest.sourceRunning=false
listingRequest.destinationRunning=false

$ awk -f tst.awk file | awk -F'=' '$1=="listingRequest.queueSize.byteCount"{print $2}'
480
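Splitting on "=" and printing $2 works for the sample, but it would cut off any value that itself contains "=". A possible variation, sketched here as a hypothetical helper `get_value` that splits only at the first "=" (the `example.query` line is made-up data added to show the difference):

```shell
# A few flattened lines in the tag=value format shown above.
# The example.query line is hypothetical; it is not in the original data.
flat='listingRequest.state="Waiting for other queue requests to complete"
listingRequest.queueSize.byteCount=480
listingRequest.queueSize.objectCount=20
example.query="a=b"'

# get_value TAG: read tag=value lines on stdin and print the value of TAG.
get_value() {
    awk -v tag="$1" '
        {
            pos = index($0, "=")              # position of the first "="
            if (pos && substr($0, 1, pos - 1) == tag) {
                print substr($0, pos + 1)     # everything after it, intact
                exit
            }
        }'
}

byte_count=$(printf '%s\n' "$flat" | get_value listingRequest.queueSize.byteCount)
echo "$byte_count"
```

With the real data you would pipe the output of `awk -f tst.awk file` into `get_value` instead of the `flat` variable.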

score 0 · Answer 2 · answered Apr 29 '19 at 14:21

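I think you could find the character offset of the key in bash and then cut the value out of the string:

a=$(cat file)                        # the whole JSON document as one string
b='"byteCount":'                     # the key, including its quotes and colon
strindex() {                         # print the offset of $2 within $1, or -1 if absent
  x="${1%%"$2"*}"
  [[ "$x" = "$1" ]] && echo -1 || echo "${#x}"
}
index=$(strindex "$a" "$b")
rest="${a:index+${#b}}"              # everything after the key
rest="${rest#"${rest%%[0-9]*}"}"     # skip ahead to the first digit
result="${rest%%[!0-9]*}"            # keep the leading digits: 480 for the sample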

Sources: https://www.tldp.org/LDP/abs/html/string-manipulation.html and "Position of a string within a string using Linux shell script?"
