In gnuplot, how do you plot flattened tables that have the form of aggregated key-value pairs? For example, given this tab-delimited file, how could I plot a bar graph for type=foo, with one bar per version?

type    version count
foo     a       1
foo     b       2
foo     c       3
bar     a       3
bar     b       2
bar     c       1
baz     a       0
baz     b       2
baz     c       2

Extra credit: How would I plot k subplots, one for each type (e.g. foo, bar, and baz)?

Nick Ruiz
  • have you looked at https://stackoverflow.com/questions/327576/how-do-you-plot-bar-charts-in-gnuplot? – gregory Aug 17 '18 at 21:33
  • Yes, I did, thanks. I see how you can do two columns. Not sure how to expand that to extra columns. – Nick Ruiz Aug 18 '18 at 02:23

2 Answers

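Use awk from inside gnuplot. First, filter the table for the 'foo' rows only, then generate an x coordinate for each kept row with a running counter (++n); $NF is the last field (the count), so it only works as an x coordinate when the counts happen to be 1, 2, 3 in order. Pass the filtered output to plot (note the quotes around the awk command) with the with boxes option:

 plot "< gawk '$1 ~ /foo/ {print ++n, $2, $3 }' table.dat " using 1:3:xtic(2) with boxes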

You can add multiple plots to a single diagram in gnuplot, so you could run the command above changing the filter to add "bar" and "baz" curves.
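For the "extra credit" part (one subplot per type), the same awk filter can be wrapped in a loop inside a multiplot. This is only a sketch: it assumes the file is called table.dat as above, that awk is on the PATH, and that the list of types is written out by hand in a variable named types (a name introduced here for illustration).

```gnuplot
# Sketch: one panel per type, each fed by its own awk filter.
# Assumes table.dat holds the tab-delimited table from the question.
types = "foo bar baz"          # hypothetical hand-written list of types

set style fill solid 0.3
set boxwidth 0.8
set yrange [0:*]

set multiplot layout 1,words(types)
do for [t in types] {
    set title t
    # ++n numbers the kept rows 1,2,3,... and restarts for every awk call
    cmd = sprintf("< awk '$1 == \"%s\" {print ++n, $2, $3}' table.dat", t)
    plot cmd using 1:3:xtic(2) with boxes notitle
}
unset multiplot
```

Comparing with == instead of a regular expression avoids accidentally matching types that merely contain the search string.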

gregory

So far, I haven't found (or have overlooked) a comprehensive example of plotting a histogram from a flattened list with multiple (hierarchical) columns and unordered entries. So, this might be a starting point for further adaptations and optimizations.

Of course, alternatively, the data could be modified with whatever external tools (or gnuplot itself) to match gnuplot's required input format for histogram styles. Check help histograms.
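As a rough illustration of that alternative, the question's data could be pivoted by hand into the wide layout that the histogram style expects (one row per version, one column per type). The datablock name $Wide is an assumption made for this sketch; the numbers are the ones from the question.

```gnuplot
# Sketch: the question's data, already pivoted into "wide" form.
$Wide <<EOD
version foo bar baz
a       1   3   0
b       2   2   2
c       3   1   2
EOD

set style data histograms
set style histogram clustered gap 1
set style fill solid 0.3
set key autotitle columnheader   # first row supplies the key titles
set yrange [0:*]

# one cluster per version, one bar per type within each cluster
plot for [col=2:4] $Wide using col:xtic(1)
```

This keeps the plotting command short, at the cost of reshaping the data beforehand.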

Some comments about the code:

  • the keywords in the input data don't have to be ordered.
  • the code creates lists of unique keywords for column 1 and column 2. The order in each unique list (and consequently in the graph) is the order of first occurrence.
  • the keywords are converted into numbers, i.e. their indices in the corresponding keyword lists
  • the input data may contain multiple entries of the same key and sub-key, e.g. foo a 2, foo a 4, foo a 1. Since the option smooth freq is used they will be summed up to foo a 7.
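The summing behaviour of smooth freq mentioned in the last point can be checked in isolation with a small sketch. The datablock names $Dup and $Summed are made up for this illustration, and printing a datablock assumes a reasonably recent gnuplot 5.x.

```gnuplot
# Sketch: smooth freq adds up all y values that share the same x.
$Dup <<EOD
1 2
1 4
1 1
2 5
EOD

set table $Summed
    plot $Dup using 1:2 smooth freq
unset table

# the row for x=1 should show the sum 2+4+1 = 7
print $Summed
```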

Code:

Requires gnuplot 5.2.0, because of the use of keyentry. Code could be adapted for older versions.

### different ordered histograms
reset session

$Data <<EOD
# type    version count
bar a   3
baz b   2
bar c   1
baz a   0
foo a   1
foo b   2
foo c   3
bar b   2
baz c   2
EOD

# create a unique list of strings from a column
addToList(list,col) = list.( strstrt(list,'"'.strcol(col).'"') > 0 ? '' : ' "'.strcol(col).'"')
set table $Dummy
    keysA = keysB = ''
    plot  $Data u (keysA=addToList(keysA,1), keysB=addToList(keysB,2), '') w table
unset table

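# getIndex:  1-based position of key in the list keys (NaN if not found)
# myFilter:  value of column colD if column colF equals valF, otherwise NaN
# myFilter2: same as myFilter, but two columns have to match
getIndex(keys,key)                      = (_idx=NaN, sum [_i=1:words(keys)] (word(keys,_i) eq key ? _idx=_i : 0), _idx )
myFilter(colD,colF,valF)                = strcol(colF) eq valF ? column(colD) : NaN
myFilter2(colD,colF1,valF1,colF2,valF2) = strcol(colF1) eq valF1 && strcol(colF2) eq valF2 ? column(colD) : NaN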

set style fill solid 0.3
set boxwidth 0.8 
set offsets 1,1,1,0
set datafile missing NaN
set key top center noautotitle
set grid x,y
gap = 1

set multiplot layout 2,2

    plot for [key in keysA] i=getIndex(keysA,key) $Data u (i):(myFilter(3,1,key)):xtic(key) \
         smooth freq w boxes lc i

    plot for [key in keysB] i=getIndex(keysB,key) $Data u (i):(myFilter(3,2,key)):xtic(key) \
         smooth freq w boxes lc i

    plot for [keyA in keysA] for [keyB in keysB] tmp=(i=getIndex(keysA,keyA), j=getIndex(keysB,keyB)) \
        $Data u ((i-1)*(words(keysB)+gap) + j):(myFilter2(3,1,keyA,2,keyB)): \
        xtic(keyB) smooth freq w boxes lc i, \
        for [keyA in keysA] keyentry w boxes lc getIndex(keysA,keyA) ti keyA

    plot for [keyB in keysB] for [keyA in keysA] tmp=(i=getIndex(keysB,keyB), j=getIndex(keysA,keyA)) \
        $Data u ((i-1)*(words(keysA)+gap)+ j):(myFilter2(3,1,keyA,2,keyB)): \
        xtic(keyA) smooth freq w boxes lc i, \
        for [keyB in keysB] keyentry w boxes lc getIndex(keysB,keyB) ti keyB

unset multiplot
### end of code

Result:

[Image: the four histograms in a 2x2 multiplot layout]

theozh