2

Is there a function in gnuplot which returns the number of columns in a csv file? I can't find anything in the docs, maybe someone can propose a custom made function for this?

sashkello
  • I don't think Gnuplot has any variable that tells you this. You could use the answer to [this question](http://stackoverflow.com/questions/9308602/gnuplot-columnstacked-histogram-line-row-count) as a workaround. – Thor Nov 14 '12 at 08:11

3 Answers

5

As of gnuplot 4.6, you can write a small helper script to do this. It is certainly not the most efficient approach, but it is pure gnuplot:

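# script col_counter.gp
# "$0" is the first argument passed via `call` (gnuplot 4.6 syntax; gnuplot 5 uses ARG1)
col_count=1
good_data=1
while (good_data){
   # STATS_max is 1 if at least one row has a valid number in column col_count
   stats "$0" u (valid(col_count)) nooutput
   if ( STATS_max ){
      col_count = col_count+1
   } else {
      col_count = col_count-1   # the last probed column had no valid data
      good_data = 0
   }
}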

Now in your main script,

call "col_counter.gp" "my_datafile_name"
print col_count   #number of columns is stored in col_count.

This has some limitations. For example, it stops counting too early if the datafile has a completely non-numeric column followed by more valid columns. Still, it should work for many typical use cases.
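To make the limitation concrete, here is a rough Python emulation of the probing loop. It is only a sketch: `is_number` and `probe_column_count` are hypothetical helper names, and treating "valid" as "parses as a float" is an assumption, not a description of what gnuplot does internally.

```python
def is_number(token):
    """Assumed stand-in for gnuplot's valid(): does the token parse as a float?"""
    try:
        float(token)
        return True
    except ValueError:
        return False


def probe_column_count(rows):
    """Probe columns 1, 2, ... until a column has no numeric entry in any row."""
    col = 1
    while any(len(row) >= col and is_number(row[col - 1]) for row in rows):
        col += 1
    return col - 1  # the last probed column had no valid data


numeric_rows = [["1", "2", "3"], ["4", "5", "6"]]
mixed_rows = [["1", "a", "3"], ["4", "b", "6"]]  # column 2 is entirely text

print(probe_column_count(numeric_rows))
print(probe_column_count(mixed_rows))
```

With the all-text second column, the probe stops after column 1 even though a valid third column follows, which is the failure mode described above.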

As a final note, if the directory containing col_counter.gp is listed in the environment variable GNUPLOT_LIB, the script does not even need to be in the current directory.

mgilson
  • I borrowed your idea for my answer here: [accessing the nth datapoint in a datafile using gnuplot](http://stackoverflow.com/questions/13986596/accessing-the-nth-datapoint-in-a-datafile-using-gnuplot/17416567#17416567) – syockit Jul 02 '13 at 02:39
4

Assuming this is related to this question, and that the content of infile.csv is:

n,John Smith stats,Sam Williams stats,Joe Jackson stats
1,23.4,44.1,35.1 
2,32.1,33.5,38.5 
3,42.0,42.1,42.1 

You could do it like this:

plot.gp

nc = "`awk -F, 'NR == 1 { print NF; exit }' infile.csv`"
set key autotitle columnhead
set datafile separator ','
plot for [i=2:nc] "< sed -r '1 s/,([^ ]+)[^,]+/,\\1/g' infile.csv" using 1:i with lines

Note that the backslash in \1 must itself be escaped (written as \\1) when it appears inside a double-quoted string in gnuplot.
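If you want to check what the awk and sed commands do to the header line without running gnuplot, the following Python sketch mirrors them. The helper names `count_fields` and `shorten_titles` are made up for illustration; the regular expression is the one from the sed command above.

```python
import re


def count_fields(header, sep=","):
    """Number of separator-delimited fields, like awk's NF with -F, on a non-empty line."""
    return len(header.split(sep))


def shorten_titles(header):
    """Keep only the first word of each title after the first field (same regex as the sed call)."""
    return re.sub(r",([^ ]+)[^,]+", r",\1", header)


header = "n,John Smith stats,Sam Williams stats,Joe Jackson stats"

print(count_fields(header))
print(shorten_titles(header))
```

For the sample header this gives 4 fields and the shortened line `n,John,Sam,Joe`, so the plot loop runs over columns 2 to 4 and the key shows the first names only.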

Output:

[Image: data file plot]

Thor
0

Here is an update, plus an extended retro-workaround as an alternative (gnuplot-only, of course).

Update: (gnuplot>=5.0.0, Jan 2015)

Since gnuplot 5.0.0, there is the variable STATS_columns, which tells you the number of columns in the first uncommented row.

stats FILE u 0 nooutput
print STATS_columns

Extended retro-workaround: (gnuplot>=4.6.0, March 2012)

Some time ago, I learnt that a correct CSV file should have the same number of columns (i.e. commas) in all rows. So it should be sufficient to "count" the commas in the first uncommented row. That's apparently what gnuplot>=5.0.0 is doing more or less.

However, if you have an "incorrect CSV" with a varying number of columns and you are interested in the minimum and maximum number of columns, you can use the following script. It assumes that there are no (double-quoted) strings containing a comma. Note that row indices are 0-based.

Data: SO13373206.dat

11, 12, 13, 14, 15, 16, 17
21, 22, 23, 24, 25, 26, 27, 28
31, 32, 33, 34, 35, 36, 37, 38, 39
41, 42, 43, 44, 45, 46

Script:

### count number of columns (gnuplot>=4.6.0)
reset

FILE = "SO13373206.dat"

countCommas(s) = sum[i=1:strlen(s)] ( s[i:i] eq ',' ? 1 : 0)
set datafile separator "\t"         # in order to read a row as one string
stats FILE u (colCount=countCommas(strcol(1))+1,0) every ::0::0 nooutput
print sprintf("number of columns in first row: %d", colCount)

colMin = colMax = rMin = rMax = NaN
stats FILE u (c=countCommas(strcol(1))+1, \
              c<colMin || colMin!=colMin ? (colMin=c,rMin=$0) : 0, \
              c>colMax || colMax!=colMax ? (colMax=c,rMax=$0) : 0 ) nooutput
print sprintf("minimum %d columns in row %d",colMin, rMin)
print sprintf("maximum %d columns in row %d",colMax, rMax)

set datafile separator ","    # restore separator
# ... plot something
### end of script

Result:

number of columns in first row: 7
minimum 6 columns in row 3
maximum 9 columns in row 2
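As a cross-check outside gnuplot, the same comma counting can be sketched in Python. The function name `column_counts` is hypothetical, and skipping blank lines and lines starting with `#` is an assumption about how rows are numbered; like the gnuplot script, it does not handle quoted strings containing commas.

```python
def column_counts(lines, sep=","):
    """Column count (separators + 1) for each non-blank, uncommented line."""
    rows = [ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")]
    return [ln.count(sep) + 1 for ln in rows]


data = """\
11, 12, 13, 14, 15, 16, 17
21, 22, 23, 24, 25, 26, 27, 28
31, 32, 33, 34, 35, 36, 37, 38, 39
41, 42, 43, 44, 45, 46
"""

counts = column_counts(data.splitlines())

# 0-based row indices of the first minimum and first maximum
r_min = min(range(len(counts)), key=counts.__getitem__)
r_max = max(range(len(counts)), key=counts.__getitem__)

print("number of columns in first row: %d" % counts[0])
print("minimum %d columns in row %d" % (counts[r_min], r_min))
print("maximum %d columns in row %d" % (counts[r_max], r_max))
```

On the sample data this agrees with the gnuplot result above.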
theozh