
I'm trying to produce multiple output files from a large JSON file of many objects which can be identified by $var1.

My 'application' splits arrays within each JSON object into multiple rows of text, which I can then filter with awk to keep records within a time period.

From a long list of $var1, and fixed $unixtime1 and $unixtime2 for a single pass, what would a bash FOR loop look like that would pass in these variables?

cat data.json | grep $var1 | application | awk -F '"*,"*' '$10 > $unixtime1 && $10 < $unixtime2' > $var1.txt

I could use jq to obtain a list of var1, but cannot use jq for the entire operation because of some jq limitations with large integers.

jq -c .var1 data.json | sort | uniq -c
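For context on the large-integer concern: awk, like jq, stores numbers as C doubles, so integers are only exact up to 2^53 (about 9e15). The 13-digit millisecond timestamps and 15-digit IMEI values used here compare exactly, but a 20-digit value like the one in the linked question would not. A quick standalone illustration of the rounding (not part of the original pipeline):

```shell
# 2^53 = 9007199254740992; adding 1 is lost to double rounding,
# so awk sees the two literals as the same number
awk 'BEGIN { print (9007199254740992 == 9007199254740993) ? "collide" : "distinct" }'
# prints "collide"
```

Matching the IMEI with grep avoids the problem entirely, since grep compares strings, never numbers.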

As this is an occasional procedure, I've chosen not to modify my 'application' for the task (yet) - but I'd appreciate any expert advice.

The following for loop does the first part - but how do I introduce the timestamps as variables?

test.txt
351779050870307
351889052023932

#! /usr/bin/env bash
for i in `cat test.txt`
do
cat data.json | grep $i | application | awk -F "\"*,\"*" '$10 > 1417251600000 && $10 < 1417338000000' > $i.txt
done
user2314105
  • You would want either a `for i in $var1; do ... done` (unquoted to allow word splitting) or a trick with `printf` and an `array` to split `$var1` up into elements of an array (e.g. `myarray=( $(printf "%s\n" "$var1" ) )` ) then you could manipulate the array to get what you need. -- It's hard to say more without seeing what is in `$var1`. – David C. Rankin Dec 01 '14 at 07:20
  • Var1 is a 15 digit string - {"Imei":"351579052654744", – user2314105 Dec 01 '14 at 08:17
  • The value of `var1` starts with or contains a `{`? – Jdamian Dec 01 '14 at 08:53
  • Sorry - $var1 (or $i) is tested against source JSON object to identify the many JSON objects with this field value - The 'application' then splits the many arrays in the objects for filtering with awk. Only a quick introduction to Bash scripting required – user2314105 Dec 01 '14 at 09:13
  • introducing the timestamps as variables? Not sure what you mean... Are those two long numbers already the timestamps? Or are you trying to convert timestamps in some certain format to other formats (like 2014-01-01-11:30:01 to "seconds since 1970-01-01 00:00:00 UTC"? It's better to give an example of your input and the expected output. – Robin Hsu Dec 01 '14 at 09:50
  • After two and a half hours I still do not know what the main problem is -- the `bash` script? the `awk` code? `timestamps as variables`... but where? in `bash` code? in `awk` code? – Jdamian Dec 01 '14 at 09:51
  • I want to change $unixtime1 and $unixtime2 on the awk component. This could be by user input or other means - just looking for an example / ideas on how to declare these variables for use within a bash script once only for use in the loop – user2314105 Dec 01 '14 at 13:31
  • Using `grep` to extract content from a JSON file is horrid. There are tools actually built for the job, and good at it -- `jq` is one, `jsawk` another. `grep` is line-oriented, so it's very very sensitive to things that the JSON standard doesn't specify one way or another -- whether spaces or newlines are used in formatting, for instance. – Charles Duffy Dec 02 '14 at 04:34
  • jq cannot be used with large decimals - jq parses numbers into C doubles. It then formats them back as numbers. C doubles have finite precision. Eg echo '18302628978110292481' | jq - If you have an answer for this please answer http://stackoverflow.com/questions/27211870/how-can-i-avoid-jq-truncating-long-decimal – user2314105 Dec 02 '14 at 22:44

2 Answers


I THINK this is what you're looking for:

while IFS= read -r i
do
    grep "$i" data.json | application |
    awk -F '"*,"*' -v ut1="$unixtime1" -v ut2="$unixtime2" '$10 > ut1 && $10 < ut2' > "$i.txt"
done < test.txt
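The `-v` mechanism above can be checked in isolation. A minimal sketch with made-up comma-separated data, where field 2 stands in for the timestamp field `$10`:

```shell
# -v copies shell values into awk variables before the program runs,
# so the single-quoted awk script never needs shell expansion
printf 'a,5\nb,15\nc,25\n' |
awk -F ',' -v ut1=10 -v ut2=20 '$2 > ut1 && $2 < ut2'
# prints "b,15" - the only record whose field 2 falls inside the window
```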
Ed Morton
  • that's a nice improvement by using WHILE instead of FOR, and no need to concatenate a single file - but the problem remains. How can the unix time come from outside of the while loop into the awk statement as either user input or a file rather than a vi edit? I thought it might be an easy couple of stack overflow points for someone, but it seems I may not have formed the question very well? – user2314105 Dec 02 '14 at 02:32
  • Exactly - we just don't know what it is you're asking for. If you want to read the values from a file, then go ahead: `read var < file`. If you want to pass it as an arg, well, go ahead and assign it: `var=$1`. But that's all too obvious, so there must be SOMETHING else that's causing you a problem, but I for one have absolutely no idea what it is. – Ed Morton Dec 02 '14 at 03:13

For those with expertise in things other than bash (like me) - I hope this helps you with your own small script experiments.

Here is an example of inserting multiple variables into a bash script that contains a loop and mixes several piped commands. It includes a user prompt, a calculation, and a list (test.txt) as input.

The while loop (from the previous answer) is tighter than a for loop, but look up for-vs-while logic for the differences.

The interesting awk syntax also comes from the previous answer; it was given without explanation, but it works.

#!/usr/bin/env bash
read -e -p "start time? " Unixtime1
Unixtime2=$(echo "$Unixtime1 + 86400000" | bc)
while IFS= read -r i
do
    grep "$i" data.json | application |
    awk -F '"*,"*' -v ut1="$Unixtime1" -v ut2="$Unixtime2" '$10 > ut1 && $10 < ut2' > "$i.dat"
done < test.txt
user2314105