
Suppose an expression writes the following to stdout:

 Min 1st_Qu Median Mean 3rd_Qu Max NAs 
  1.000 1.000 2.000 1.875 2.250 3.000       1 

And the goal is:

{ "Min":1.0, "1st_Qu":1.0, ..., "NAs":1 }

Is there a trivial way to do this in bash? I'm also trying to limit dependencies, and would prefer to cherry-pick what I need with awk rather than add any non-native Linux dependencies.

peak
Chris
  • (1) Do you need the code to read the headers? (2) If so, how is it supposed to distinguish between the spaces internal to `1st Qu.` and the space *between* `1st Qu.` and `Median`? Or is that actually a tab? (You don't appear to have a hard guarantee that headers will always have two or more spaces between them). – Charles Duffy Sep 12 '19 at 19:07
  • @CharlesDuffy got the hang of the R package manager, but, for practical purposes, I think this is very useful. In this case, let's suppose we have tab delimitations between columns. Or, better yet, no spaces: is it possible to associate the header name as "key" with the columnar value as "value" in a json object with a trivial command line expression? – Chris Sep 12 '19 at 19:10
  • In your example, columns are 7 characters wide. Is it always the case? – mouviciel Sep 12 '19 at 19:12
  • @mouviciel let me restructure it for easier manipulation – Chris Sep 12 '19 at 19:12
  • Now that you've edited it, I'm ~80% certain we have an effective duplicate already in the knowledgebase. [Edit: Found it!] – Charles Duffy Sep 13 '19 at 00:33

1 Answer


The following might count as trivial, but it does not preserve the precision of the numbers as written. It assumes the columns are separated by runs of spaces (the regex " +"), but you could easily change that, e.g. to "\t" if the separator is a tab:

jq -n -R '[inputs | [splits(" +") | select(length>0)]]
  | transpose | map({(.[0]): .[1]|tonumber}) | add'
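Since the question prefers awk over extra dependencies, the same header/value pairing can also be sketched without jq. This is only a minimal sketch, not part of the jq approach above: the function name table_to_json is hypothetical, and it assumes whitespace-separated columns, header names that need no JSON escaping, and values that are already valid JSON numbers. Because the values are copied through as text, their digits are kept exactly as printed.

```shell
# Hypothetical helper (name chosen for illustration only).
# Reads a header line followed by one or more value lines on stdin and
# prints one compact JSON object per value line.
# Assumptions: whitespace-separated columns, header names that need no
# JSON escaping, and values that are already valid JSON numbers.
table_to_json() {
  awk '
    NF == 0 { next }                    # skip blank lines
    n == 0  {                           # first non-blank line: remember headers
      n = NF
      for (i = 1; i <= n; i++) key[i] = $i
      next
    }
    {                                   # every later line: pair headers with values
      out = "{"
      for (i = 1; i <= n; i++)
        out = out sprintf("%s\"%s\":%s", (i > 1 ? "," : ""), key[i], $i)
      print out "}"
    }
  '
}

printf ' Min 1st_Qu Median Mean 3rd_Qu Max NAs \n  1.000 1.000 2.000 1.875 2.250 3.000       1 \n' |
  table_to_json
```

For the sample in the question this prints {"Min":1.000,"1st_Qu":1.000,"Median":2.000,"Mean":1.875,"3rd_Qu":2.250,"Max":3.000,"NAs":1}. Non-numeric cells such as NA would produce invalid JSON and would need quoting, which this sketch does not attempt.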

Handling multiple data rows

The following assumes a jq -nR invocation:

def zip(headers):
  . as $in
  | reduce range(0; headers|length) as $i ({}; .[headers[$i]] = ($in[$i]) );

def s2a: [splits(" +") | select(length>0)];

(input | s2a) as $h
| inputs | s2a | map(tonumber) | zip($h)
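As a usage sketch, the program above can be wrapped in a shell function and fed a header line followed by several data rows. The function name rows_to_json and the sample values are made up for illustration, and the -c flag (compact output, one object per line) is an addition to the -nR invocation mentioned above.

```shell
# Hypothetical wrapper (name chosen for illustration only) around the
# multi-row jq program. -n and -R are the flags the program assumes;
# -c is added here so that each row becomes one compact line of JSON.
rows_to_json() {
  jq -nRc '
    def zip(headers):
      . as $in
      | reduce range(0; headers|length) as $i ({}; .[headers[$i]] = ($in[$i]) );

    def s2a: [splits(" +") | select(length>0)];

    (input | s2a) as $h                  # first line: headers
    | inputs | s2a | map(tonumber) | zip($h)   # remaining lines: one object each
  '
}

# Made-up sample data: one header line, two data rows.
printf 'Min Median Max\n1.5 2.25 3\n0.5 1.25 2\n' | rows_to_json
```

Each data row yields its own object, keyed by the header names from the first line.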
peak