36

Assuming I have a file that looks like this (note the double newlines):

"p = 0.1"
1 1
3 3
4 1


"p = 0.2"
1 3
2 2
5 2

Is it possible to make Gnuplot plot these two datasets in one plot with the titles given on the first line of each dataset?

Johan Walles
gTcV

5 Answers

40

It's definitely possible, and your datafile is already in the correct format. The functionality you're looking for is built into columnheader(N), which reads the entry at the top of the Nth column and uses it as the plot title:

 plot 'test.dat' i 0 u 1:2 w lines title columnheader(1),\
      'test.dat' i 1 u 1:2 w lines title columnheader(1)

which can be condensed using iteration:

plot for [IDX=0:1] 'test.dat' i IDX u 1:2 w lines title columnheader(1)
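
As a rough illustration outside gnuplot, the following Python sketch mimics how this file layout is read: two consecutive blank lines start a new data set (what index selects), and the first line of each set supplies the title (what columnheader(1) picks up). The names split_blocks and SAMPLE are hypothetical, and the parsing is deliberately simpler than gnuplot's own reader.

```python
import shlex

# Hypothetical sample mirroring test.dat from the question.
SAMPLE = '"p = 0.1"\n1 1\n3 3\n4 1\n\n\n"p = 0.2"\n1 3\n2 2\n5 2\n'


def split_blocks(text):
    """Split data-file text into (title, rows) pairs, one per data set.

    Assumption: data sets are separated by at least two blank lines and
    the first non-blank line of each set holds the title in column 1.
    Single blank lines are simply skipped in this simplified sketch.
    """
    blocks, current, blank_run = [], [], 0
    for line in text.splitlines():
        if not line.strip():
            blank_run += 1
            continue
        if blank_run >= 2 and current:
            # Two or more blank lines: close the previous data set.
            blocks.append(current)
            current = []
        blank_run = 0
        current.append(line)
    if current:
        blocks.append(current)

    result = []
    for lines in blocks:
        # shlex keeps a quoted header such as "p = 0.1" as a single field.
        title = shlex.split(lines[0])[0]
        rows = [tuple(float(x) for x in ln.split()) for ln in lines[1:]]
        result.append((title, rows))
    return result


blocks = split_blocks(SAMPLE)
for idx, (title, rows) in enumerate(blocks):
    print(idx, title, rows)
```

Each position in the returned list corresponds to the value passed to index, and the stored title to what the plot key would show.
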
mgilson
  • I did try using `columnheader(1)` for *my* datafile format (with the titles as comments), but it did not work. You are right, though, for the original poster's format, and I'm glad that there's such an easy solution! – andyras Oct 10 '12 at 15:36
  • @andyras -- It's because you commented out the title "data" and so gnuplot ignored it. As far as I can tell `#` lines in datafile are ignored by everything whereas lines which don't parse (e.g. non-numeric) are happily ignored for plotting, but they are still available for special functions like `columnhead`. – mgilson Oct 10 '12 at 15:38
  • For some reason using the first method suggested causes problems with "fairly large" files. I am trying to plot a 2 GB file, I have 32 GB of ram, and I am running out of memory because my 'test.dat' contains 50 data series, hence I have a 'plot' command with 50 plotting instructions following it. It appears that GNUPLOT is loading the 2GB file into memory 50 times, causing me to run out of memory. !? – FreelanceConsultant Nov 02 '15 at 22:45
  • @user3728501 -- That might be possible. I suppose my question back to you would be what are you doing plotting 2gb of data? You'll probably end up with multiple datapoints on top of each other. You should probably downsample your dataset for visualization at this point (or pre-process the stuff you really want to see into a separate file)... – mgilson Nov 02 '15 at 23:07
  • @mgilson Unfortunately this leads to loss of information - I am plotting stochastic data and so it is convenient if all points are plotted as it shows the spread of data over small periods of time. – FreelanceConsultant Nov 02 '15 at 23:16
  • @user3728501 -- How many records would you estimate would end up on your plot? A really great monitor has ~2500 pixels across it. If you plot more than 2500 records, you'll have more records than you have pixels to display them. Of course you can zoom and all that fun stuff, but that's why it's good to downsample for initial exploration and quick display. Then you can re-plot the full data only on the range that you really care about exploring deeper. – mgilson Nov 02 '15 at 23:20
  • @mgilson The data is stochastic - there is no way to know which points can be got rid of when down sampling – FreelanceConsultant Nov 03 '15 at 13:16
8

This combines Bruce_Warrior's and Ciro Santilli's answers, but without the intermediate stats call:

# plot.gpi
datafile = ARG1
plot for [i=0:*] datafile index i using 1:2 \
    with lines title columnheader(1)

The for loop can iterate over all datasets in a file directly. It works in gnuplot 5.0.5 but I'm not sure when for acquired this capability. It is documented in the 5.0 manual but not the 4.6 manual.

Unless the line color should be determined by a third input column consumed by linecolor variable (per Bruce's answer), gnuplot will assign different colors and line styles automatically. In this specific case using 1:2 can also be omitted.

$ gnuplot --version
gnuplot 5.0 patchlevel 5
$ gnuplot --persist -c plot.gpi test.dat

test.dat is

"p = 0.1"
1 1
3 3
4 1


"p = 0.2"
1 3
2 2
5 2
mkjeldsen
  • Great to learn that `*` in `for` can achieve various useful things much easier. However, there are probably other situations where we need the number of blocks elsewhere and would therefore still need to use `stats`. For instance, right now I need `STATS_blocks` to calculate steps through the palette to generate a smooth range of colours. – underscore_d Jan 28 '18 at 13:08
4

A solution based on the answers given by andyras (answer 1 and answer 2) that automates the whole process:

datafile = 'test.dat'
stats datafile
plot for [IDX=1:STATS_blocks] datafile index (IDX-1) u 1:2 w lines \
    t columnheader(1) lc variable

With this, the script automatically detects the number of data blocks and plots each one in a different color, with the title taken from the first line of that data block.

Bruce_Warrior
4

gnuplot 5.1 (2016/08/28)

This is similar to https://stackoverflow.com/a/29495496/895245 but with some fixes for later versions.

https://stackoverflow.com/a/43819870/895245 taught me the for [i=0:*] syntax, which dispenses with stats and is therefore a bit nicer.

Script:

datafile = 'test.dat'
stats datafile nooutput
plot for [IDX=0:STATS_blocks-1] \
    datafile \
    index IDX \
    using 1:2 \
    with lines \
    title columnheader(1)

Test data:

a
1, 1
2, 2
3, 3


"b"
1, 1
2, 4
3, 9


"c, c"
1, 1
2, 8
3, 27

This works on the gnuplot build from 2016/08/28, which will later become gnuplot 5.1, but not on gnuplot 5.0.3 (Ubuntu 16.04): in 5.0.3 the stats command gives an error because the column headers are not valid data, whereas in the 2016/08/28 build this is just a warning.

I've asked how to remove the warning at: https://groups.google.com/forum/#!topic/comp.graphics.apps.gnuplot/Pi4aBE2PwZ8

Using comments like:

#a
1, 1
2, 2
3, 3

did not work in either version I've tested: the comment line is simply ignored.

Ciro Santilli OurBigBook.com
1

With a slight modification of your data set (so that the titles are given as comments):

#"p = 0.1"
1 1
3 3
4 1


#"p = 0.2"
1 3
2 2
5 2

You can plot these two data sets as separate lines like this:

plot 'data.dat' i 0 t "p = 0.1", '' i 1 t "p = 0.2"

The index (i for short) option to the plot command tells gnuplot to plot the ith data set, counting from zero. I can't find a way to make gnuplot take the titles from the header automatically, which is why I specified them manually with the title (t for short) option.
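
If typing the titles by hand becomes tedious, the plot command for this commented layout could be generated by a small helper script. The Python sketch below is only an illustration: plot_command and DATA are hypothetical names, and it assumes every data set starts with exactly one # line holding the title and that titles contain no double quotes.

```python
# Hypothetical sample mirroring the commented data file above.
DATA = '#"p = 0.1"\n1 1\n3 3\n4 1\n\n\n#"p = 0.2"\n1 3\n2 2\n5 2\n'


def plot_command(path, text):
    """Build a gnuplot plot command with one explicit title per data set.

    Assumption: each '#' line is the title of the data set that follows it.
    """
    titles = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Drop the comment marker and the surrounding double quotes.
            titles.append(stripped[1:].strip().strip('"'))

    parts = []
    for idx, title in enumerate(titles):
        # gnuplot reuses the previous file name when given an empty string.
        name = f"'{path}'" if idx == 0 else "''"
        parts.append(f'{name} i {idx} t "{title}"')
    return "plot " + ", ".join(parts)


command = plot_command("data.dat", DATA)
print(command)
```

The printed line can be pasted into gnuplot or written to a script file.
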

andyras
  • Thanks! Getting Gnuplot to use the titles given in the file would have been the whole point however. But if this is really not possible, then I will have to do it the way you proposed :-( – gTcV Oct 10 '12 at 13:24
  • @andyras -- You can definitely get the titles automatically :-). The trick is `columnheader` (see my answer). – mgilson Oct 10 '12 at 15:23