
I have a JSON file with this structure. Every URL field has a RESULT field, which contains hundreds of LINKS. Is it possible to parse the file and produce a list (e.g. CSV) containing all the LINKS for every URL?

 [{
    "url": "https://example.org/yyy",
    "result": "{\"links\":[{\"link\":\"https://example.org/xxx/xxx\",\"text\":\"\"
},
{
    \"link\":\"https://example.org/xxx/xxx\",\"text\":\"\"
},
{
    \"link\":\"https://example.org/xxx/xxx\",\"text\":\"yyy\"}[.......]

Thanks in advance

flapane

1 Answer


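Here is a solution using jq. The result field is itself a JSON-encoded string, so it has to be decoded with fromjson before its links can be read. If data.json contains the sample data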

[{"url": "https://example.org/yyy", "result": "{\"links\":[{\"link\":\"https://example.org/xx1/xx1\",\"text\":\"\"},{\"link\":\"https://example.org/xx2/xx2\",\"text\":\"\"}]}"}]

then the command

$ jq -Mr '.[].result | fromjson | .links[].link' data.json

produces

https://example.org/xx1/xx1
https://example.org/xx2/xx2

If you would like both the url and the links, the command

$ jq -Mr '.[] | .url as $url | .result | fromjson | "\($url),\(.links[].link)"' data.json

produces

https://example.org/yyy,https://example.org/xx1/xx1
https://example.org/yyy,https://example.org/xx2/xx2
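
If jq is not available, roughly the same thing can be done with Python's standard library. This is only a minimal sketch, assuming the file has the shape shown above (a list of objects whose "result" value is a JSON-encoded string with a "links" array). The helper names url_link_rows and to_csv are made up for illustration, and the sample is built inline rather than read from data.json.

```python
import csv
import io
import json


def url_link_rows(records):
    """Yield (url, link) pairs from records whose "result" is a JSON string."""
    for record in records:
        # "result" holds JSON text, not an object, so decode it a second time
        # (this plays the role of jq's fromjson).
        inner = json.loads(record["result"])
        for item in inner.get("links", []):
            yield record["url"], item["link"]


def to_csv(records):
    """Return the (url, link) pairs as CSV text, one pair per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(url_link_rows(records))
    return buf.getvalue()


# Rebuild the sample data, including the doubly encoded "result" field.
sample_text = json.dumps([
    {
        "url": "https://example.org/yyy",
        "result": json.dumps({
            "links": [
                {"link": "https://example.org/xx1/xx1", "text": ""},
                {"link": "https://example.org/xx2/xx2", "text": ""},
            ]
        }),
    }
])

records = json.loads(sample_text)
print(to_csv(records), end="")
```

For a real file, records would instead come from json.load on an opened data.json. The csv module also takes care of quoting if a URL or link happens to contain a comma, which the plain string interpolation in the jq command does not.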
jq170727