I'm new here and this is my first question; I hope my problem is described properly according to the rules here.

I've got a data file (datafile.dat) which is used to create several plots (see below):

temp name1  name2
10   1000   1200
22   800    750
50   250    200
100  80     82
107  5      3

What I want to do is create a plot with the values in the second and third columns plotted as boxes. The x-axis should show the names these values refer to. In addition, it should be possible to give each box a specific colour. It would be an additional advantage if the solution could also be used in a loop (because the original data file contains a lot more columns). In the end I want the graph to look something like this: Desired layout of the plot.

To get this I tried different things I found by searching the internet (see below). I am running gnuplot 5 on Windows with the following command file:

xticlabels: If I try this, e.g. for column 2, it doesn't work:

plot 'datafile.dat' u 2:xticlabels(columnhead(2))

Using an external utility: This didn't work at all; an error message was produced.

Stats: This looks like a pretty good solution if I store the output in a variable, but I can't get my code working (see below):

reset
set terminal postscript eps size 15 cm, 15 cm colour enhanced dashed "Times, 22"
set output "test.pdf"
stats 'datafile.dat' using 2
b = STATS_sum
plot 'datafile.dat' u 2:xticlabels(b) every ::1     
reset

What can I do to create the desired output from the data file above? I tried the points mentioned above in many different combinations. Suggestion 1, Suggestion 2 and Suggestion 3 are further related ideas for solving the problem, but I got none of them working. Can anyone please help me find a solution? Any hints will be highly appreciated!

Thanks in advance!!!

Michael

EDIT: I found out that this question was already asked by someone else three years ago: Axis label and column header. Is there maybe a solution today? Also: Question

  • I'm not sure how you are handling multiple points. If you are plotting the second column, you will be plotting five points based on your example, yet your desired output has only one box for that column. What do you want to happen with the other 4? – Matthew Apr 08 '16 at 19:00
  • Hello Matthew! Thank you for your comment. Indeed, you're right... that would cause a problem. In the plot command I will specify (using every) that only one data point is plotted. Do you understand what I mean? – Michael Apr 08 '16 at 21:07
  • Do you mean the `every ::1` command? That does not restrict to only one point - that means plot every line starting at line 1. If you only wanted line one, you could do `every ::1::1`. Is this what you will be using? Also, which version of gnuplot are you using? You might be able to accomplish this using a columnstacked histogram. – Matthew Apr 08 '16 at 21:38
  • Yeah, I mean the `every ::1::1` command... I am using gnuplot version 5.0 patchlevel 1. Thank you for your answer. I will immediately have a look at it! – Michael Apr 09 '16 at 10:03
  • Was my answer helpful? Did it solve your problem? If so, please consider accepting it. You don't have to do so, but it awards both of us some reputation and marks the question as answered. You may also upvote particularly helpful answers. – Matthew Apr 18 '16 at 19:40
  • Sorry Matthew, I had to finish my thesis first. I think your idea is a good one. I'll try it out and report afterwards. I wanted to upvote, as it's not solved yet, but I don't have enough reputation yet... Thanks in advance, Matthew! – Michael May 07 '16 at 13:28

1 Answer

I can see two methods for doing this. The first is more automatic, but has the disadvantage of not being able to do the colors.

Method 1

Using only one datapoint for each column (as your comment suggests you will be doing), we can almost accomplish this using the columnstacked histogram style. At this point, I'm not sure how to get different colors, as the columnstacked style applies colors to the sections of the stacks.

Using your example data and the first line of data, we can do

set style data histogram            # we could do w histograms in the plot command instead
set style histogram columnstacked
set boxwidth 0.9                    # so the boxes don't touch
set style fill solid
set key autotitle columnhead        # first row contains series name

plot for[i=2:3] "datafile.dat" every ::0::0 u i 

where every ::0::0 means use the 0th (first) line of data only.

This produces

[Image: resulting plot]

To plot columns 2 through 50, for example, just change the for[i=2:3] to for[i=2:50].

Method 2

We can do this by using the stats command to add the labels, and then do a standard plot command.

To set the tic marks, we can do

set xtics 1,1 format ""
do for[i=2:3] {
    stats "datafile.dat" every ::0::0 u (a=strcol(i),1) nooutput
    set xtics add (a i-1)
}

The first command here sets the xtics to occur every 1 unit starting at 1 but suppresses the labels (we will be setting our own labels).

We then loop over each column, reading the 0th line in the datafile with the stats command. When we read it, we store the column header in a variable a. The expression returns 1 so that the stats command has a value to analyze; we don't care about the results of this command, we only need it to read the column headers. Finally, we use set xtics add to add this label as an xtic.

Next, we can do some necessary set up commands

set style fill solid
set boxwidth 0.9      # so the boxes don't touch
unset key
set yrange[0:*]       # by default, the smallest boxes may be cut off

Finally, we can plot with

plot for[i=2:3] "datafile.dat" every ::1::1 u (i-1):i w boxes

The result is

[Image: resulting plot]

Again, the for loops can be changed to use any number of columns. X-ranges can be adjusted if desired, and linetype commands can be used in the plot command to set colors.
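
To illustrate the point about colors, here is a sketch of Method 2 collected into one script with an explicit color per box. The helper name `boxcolor`, the two colors, and the x-range are assumptions made up for this example; everything else repeats the commands from this answer and assumes the sample datafile.dat from the question.

```gnuplot
# Sketch only: boxcolor(), its colors and the xrange are illustrative assumptions.
boxcolor(n) = (n == 1) ? 0xFF0000 : 0x0000FF   # red for box 1, blue otherwise

# Tic marks every 1 unit, labels suppressed; we add our own below.
set xtics 1,1 format ""
do for [i=2:3] {
    # Read the header of column i (line 0 of the file) into the variable a.
    stats "datafile.dat" every ::0::0 u (a=strcol(i),1) nooutput
    set xtics add (a i-1)
}

set style fill solid
set boxwidth 0.9          # so the boxes don't touch
unset key
set yrange [0:*]          # make sure the boxes start at 0
set xrange [0.5:2.5]      # optional: leave half a unit on each side of the boxes

# One box per column, taken from the first data row (line 1 of the file).
plot for [i=2:3] "datafile.dat" every ::1::1 u (i-1):i w boxes lc rgb boxcolor(i-1)
```

For more columns, change both loop ranges, widen the x-range to match, and extend `boxcolor` as needed.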


Note on Method 1: we use every ::0::0 because the set key autotitle columnhead command causes the first line with the column headers to be ignored (it is processed before the plot command). Thus the 0th line is the first line of actual data.

Note on Method 2: we use every ::1::1 because the 0th line is the column header line. Without the set key autotitle columnhead command, the first line is not automatically ignored.

Matthew
  • Thanks Matthew. Since I wanted to have different colors, I tried method 2 and it worked well. Thanks, your solution solves my problem. Anyway, do you think it is possible to specify the color of the bars? – Michael May 07 '16 at 15:06
  • @Michael Yes, any standard method of specifying colors will work. By default, it uses linetypes 1 and 2 (because it is plotting two bars). If we want to change that, we can, for instance, use `plot for[i=2:3] "datafile.dat" every ::1::1 u (i-1):i w boxes lt (i+1)`, and it will use linetypes 3 (when i is 2) and 4. We can also define a function to determine color: `color(x) = (x==1)?(255<<16):255` and use it `plot for[i=2:3] "datafile.dat" every ::1::1 u (i-1):i w boxes lc rgb color(i-1)` where color returns red for 1 and blue otherwise. – Matthew May 07 '16 at 16:59
  • @Michael Or, as a last resort, we can even redefine the linetypes: `set lt 1 lc "yellow"` and `set lt 2 lc "orange"` and then the plot command in the answer will produce a yellow and orange box. – Matthew May 07 '16 at 17:01
  • Thanks. That worked fine! I have one more question (do I have to start a new thread?). Above you used `set yrange[0:*]`. If we do this we get the solution in the picture above. Is it possible to do something like `set yrange[0:ymax]` and read out ymax beforehand? Thanks for your help! – Michael May 09 '16 at 10:25
  • The reason I used the set command is that gnuplot only looks at the top of the boxes when it does its automatic range computations. I wanted to be sure that they started at 0. You can set it any way that you want otherwise. You can loop over the columns to find the maximum y value and set a range with that if you like. – Matthew May 09 '16 at 10:50
  • Thank you Matthew! Is there a command in order to find a maximum looping over all entließ? – Michael May 13 '16 at 07:45
  • @Michael You can write a loop to do that. For example, with the sample data, if I wanted to find the maximum over column 2 ("name1") and 3 ("name2"), I could do `max = 0; do for [i=2:3] {stats "datafile.dat" u i nooutput; max = STATS_max>max?STATS_max:max}` This will initialize **max** to 0 and then loop over the columns. For each column, we run stats, and if the max of that column is larger than **max** we store the new value. At the end **max** will have the largest value of those two columns (1200). – Matthew May 13 '16 at 09:19
  • Sorry, I meant entries... What you're suggesting works great! Thank you very much! – Michael May 13 '16 at 13:24
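
The maximum loop from the comments can feed straight into the y-range that was asked about. The following is a sketch under assumptions: the variable name `ymax` and the 10% headroom are illustrative choices, and the file is taken to be the sample datafile.dat from the question.

```gnuplot
# Sketch only: ymax and the 10% headroom are illustrative assumptions.
ymax = 0
do for [i=2:3] {
    # STATS_max holds the largest value found in column i.
    stats "datafile.dat" u i nooutput
    ymax = (STATS_max > ymax) ? STATS_max : ymax
}

# Boxes start at 0; the top of the range sits slightly above the largest value.
set yrange [0:ymax*1.1]
```

As written, the loop looks at every row of each column. To consider only the row that is actually plotted, the same `every ::1::1` selector used in the plot command can be added to the stats call.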