
This question arises from the following post:

Elasticsearch Bulk JSON Data

jq -c -r ".[]" C:\setting-es.json | while read line; do echo '{"index":{}}'; echo $line; done > bulk.json

The above jq shell command throws the error "Missing statement body in do loop".

I have tried changing the syntax, but it still does not work. I am trying to write a shell script that transforms the following data for Elasticsearch's bulk API:

[{
    "codeId": "111",
    "association": [{
        "associationId": 123,
        "businessUnitsAssociationId": 1,
        "financialBusinessUnits": "DCS",
        "businessApprovalLimit": [{
            "businessApprovalLimitApprovalLimitId": 1,
            "itemMinAmount": "0.00",
            "itemMaxAmount": "0.00"
        }, {
            "businessApprovalLimitApprovalLimitId": 2,
            "itemMinAmount": "0.00",
            "itemMaxAmount": "0.00"
        }, {
            "businessApprovalLimitApprovalLimitId": 3,
            "itemMinAmount": "0.00",
            "itemMaxAmount": "0.00"
        }]
    }]
}]

I am trying to transform it to the following:

{"index":{}}
[{"codeId":"111","association":[{"associationId":123,"businessUnitsAssociationId":1,"financialBusinessUnits":"DCS","businessApprovalLimit":[{"businessApprovalLimitApprovalLimitId":1,"itemMinAmount":"0.00","itemMaxAmount":"0.00"},{"businessApprovalLimitApprovalLimitId":2,"itemMinAmount":"0.00","itemMaxAmount":"0.00"},{"businessApprovalLimitApprovalLimitId":3,"itemMinAmount":"0.00","itemMaxAmount":"0.00"}]}]}]


Dmitry
cluis92
  • values `True` and `False` spelled as invalid JSON values (in JSON those spelled lower case entirely) intentionally or mistakenly? – Dmitry Nov 17 '19 at 20:38
  • @Dmitry i have fixed it but regardless this is just dummy data – cluis92 Nov 17 '19 at 20:59
  • @Dmitry how does jtc know how each json data is broken up? In my dummy data, I am having five data points, how does it know how each is being separated? I ask this because my **actual** dataset has nested values with each data point separated.. I can share if needed – cluis92 Nov 17 '19 at 21:02
  • in your example it's a stream of JSONs, i.e., you have listed 5 standalone JSONs (and `jtc` processes a stream of JSONs with the option `-a`). If your actual data are nested, then the query would be different (does not matter if it's a jq or `jtc`), please share the correct snippet of the input data then. – Dmitry Nov 17 '19 at 21:05
  • I have changed my data @Dmitry to reflect the nested structure.. basically i am having fields nested twice, such as itemMinAmount, and every json data point begins with { "codeId" : "some value" .. – cluis92 Nov 17 '19 at 21:20
  • The JSON sample is now invalid -- there are two extraneous commas. Please fix. – peak Nov 17 '19 at 21:26
  • @cluis92, in such case, the same `jtc` would work (but `-a` option could be removed, it's redundant as now it's a single JSON). – Dmitry Nov 17 '19 at 21:27
  • I am actually having around 1000 lines of the same styled json but am not able to share on SO as it would be too long – cluis92 Nov 17 '19 at 21:38
  • @cluis92, right, but it's not needed, showing a snippet which would suffice explaining the input concept and allowing building a correct solution is enough. – Dmitry Nov 17 '19 at 21:41
  • using your jtc command ```jtc: – cluis92 Nov 17 '19 at 21:54

2 Answers


Here's an answer to the revised question, after correcting the invalid JSON (i.e., after removing two superfluous commas).

There is still no need for a shell loop.

At a bash or bash-like prompt:

jq -c '.[] | ({"index":{}}, [.])'  input.json
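As a minimal sketch of what this produces, assuming jq is installed, here is the same filter run against a cut-down version of the question's sample. The file names `sample.json` and `bulk-sample.json` are placeholders chosen for illustration only.

```shell
# Write a shortened version of the question's input to a scratch file.
cat > sample.json <<'EOF'
[{
    "codeId": "111",
    "association": [{
        "associationId": 123,
        "financialBusinessUnits": "DCS"
    }]
}]
EOF

# .[] emits each element of the top-level array. For each element, the
# comma operator emits the action line first, then the element wrapped
# in [ ]. The -c flag prints every result on a single line.
jq -c '.[] | ({"index":{}}, [.])' sample.json > bulk-sample.json
cat bulk-sample.json
```

The `[.]` mirrors the output requested in the question; replacing it with `.` would emit each element as a bare object instead.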

At a PowerShell prompt, it might be easier to place the jq program into a file, and invoke jq with the -f FILENAME option.
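A rough sketch of the program-file approach, written here in POSIX shell syntax and assuming jq is installed. The names `bulk.jq`, `input-demo.json` and `bulk-from-file.json` are hypothetical, and the two-element input is only a stand-in for real data.

```shell
# Tiny stand-in input: a top-level array with two documents.
printf '%s\n' '[{"codeId":"111"},{"codeId":"222"}]' > input-demo.json

# Save the jq program in its own file so that no shell has to parse
# the quotes and braces inside it.
cat > bulk.jq <<'EOF'
.[] | ({"index":{}}, [.])
EOF

# -f reads the filter from bulk.jq instead of from the command line.
jq -c -f bulk.jq input-demo.json > bulk-from-file.json
cat bulk-from-file.json
```

Because the `jq -c -f bulk.jq input-demo.json` part contains no quoted program text, it should not depend on the quoting rules of whichever shell invokes it.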

peak
  • there's more than superfluous commas in the sample json, there's missing curly and square brackets. I have fixed it, but it's pending a review. – Dmitry Nov 17 '19 at 21:35
  • @peak so would this jq command be run from a shell script, such as windows powershell? Here is what I tried ```jq -c '.[] | ({"index":{}}, [.])' jq -c '.[] | ({"index":{}}, [.])' @C:\Users\chris\dummy-json.json``` and it is saying 'Could not open file jq' in powershell – cluis92 Nov 17 '19 at 21:47
  • thank you @peak marked as accepted.. this was what worked for me `jq -c '.[] | ({"index":{}}, [.])' activity-es-jq.json > bulk-activity.json` – cluis92 Nov 18 '19 at 17:23

[This response was based on the original question.]

There's no need for any shell loop:

$ jq -c '{"index":{}},.' input.json
{"index":{}}
{"str field":"some string","int field":12345,"bool field":true}
{"index":{}}
{"str field":"another string","int field":42,"bool field":false}
{"index":{}}
{"str field":"random string","int field":3856452,"bool field":true}
{"index":{}}
{"str field":"string value","int field":11111,"bool field":false}
{"index":{}}
{"str field":"last string","int field":54321,"bool field":true}
peak
  • so the reason I am needing a loop is because my data is much larger than the above and it is not having {"index":{}} as headers between each data point (it's technically not valid json, but this is the only way elasticsearch will accept the data..) – cluis92 Nov 17 '19 at 21:26