2

I have 2 datafiles with columnheads, similar to:

Datafile1:

A B  
1 8  
3 10 
5 9  
...

Datafile2:

A,C  
2,10  
4,15 
6,13  
...

Now I'd like to merge them into 1 table and then plot, like:

table:

A B C  
1 8  
3 10  
5 9  
2  10  
4  15  
6  13

I tried with the following code:

set table “C:/table.txt”
plot datafile1 u (stringcolumn(1)):(stringcolumn(2)) w table
plot datafile2 u (stringcolumn(1)):" ":(stringcolumn(2)) w table
unset table
plot “C:/table.txt” u 1:2 title columnhead, '' u 1:3 title columnhead

Remaining problem is that this code doesn't take the columnheads into the resulting table.

The reason I want to merge datafiles into 1 table is that they are logfiles with quite different formats. By combining several files in a table while making their date- and timecolumns uniform, I want to plot some or all columns from them in 1 graph.

Here are 2 samples from real datafiles:
(each of these logfiles may grow up to 1GB over time)

Datafile1:

Vacuum log  10/26/2019  6:20:07 PM
#   Date    Time    CCGp    IGPa    IGPcl   IGPco   IGPf    PIRbf   PIRll   PIRm    PIRpv   PPcl    HT
0   10/26/2019  6:20:07 PM  99.020  261.070 16.300  10.709  860.746 8.344   116.559e3   253.848 7.926   628.832e-3  0
1   10/26/2019  6:20:08 PM  101.305 261.613 16.908  10.012  862.001 8.399   116.535e3   256.132 7.726   677.712e-3  12
2   10/26/2019  6:20:09 PM  102.874 262.205 17.010  9.520   863.380 8.492   116.510e3   258.740 7.308   755.026e-3  102
...  

Datafile2:

log     10/26/2019 8:47:17 PM
PC name     52334810

Date, Time, V-Acc, I-Acc[uA], V-Sup, V-Ext, I-Ext[uA], V-Fil, I-Fil[A], V-L1A, I-L1A[uA], V-L2, I-L2[uA], CIVi, IGPPressure[Pa], S.F.Internal, IGP4[Pa], PIVi
26/10/19, 18:20:06.484, 1.60, 0.000000, -0.12, 1.83, 0.054932, 2.033, 0.000381, 0.61, 52.000485, 0.00, 0.001373, 12, 0.0000407, Disabled???, 6
26/10/19, 18:20:07.468, 1.77, 2.003000, -3.62, 1.83, 1.623844, 2.629, 0.000382, 0.71, 51.638814, 0.00, 0.641373, 9, 0.0000407, Disabled???, 6
26/10/19, 18:07:07.878, 1.84, 12.000800, -5.17, 1.83, 3.051953, 2.385, 0.000381, 0.67, 49.071190, 0.00, 1.903325, 11, 0.0000407, Disabled???, 5
...

Does anyone have an idea how to copy the data and columnheads as well into a table like in the example ?

Thanks

Tim
  • 21
  • 3
  • from your minimal example it is not apparent to me why you need to merge two files instead of simply plotting two files: `plot "datafile1" u 1:2 ti columnhead, "datafile2" u 1:2 ti columnhead`? – theozh Dec 27 '19 at 17:02
  • What is the column separator in datafile1? Fixed column width with several spaces or TAB (`\t`)? – theozh Dec 31 '19 at 09:40

1 Answers1

1

As I mentioned in the comments, from your minimal example it is not apparent why you need to merge two datafiles instead of plotting two datafiles. One reason I could think of would be in case you need to do some mathematical operations between columns. To my knowledge you can't do this easily with two columns in two different files.

Anyway, if you have to merge two datafiles you could do this also via external programs. But if you're asking for a gnuplot-only solution you could do it like below. Assumption is that your data is in datablocks already. See here: gnuplot: load datafile 1:1 into datablock. You simply "print" the datalines of the datablocks into a new datablock, but you have to remove the last character (i.e. newline). With this, you actually get 4 columns A,B,A,C. If you want to remove the extra A column this would be some extra effort.

Code:

### merge two datablocks (line by line)
reset session
set datafile separator ","

$Data1 <<EOD
A,B
1,3
2,6
3,5
EOD

$Data2 <<EOD
A,C
1,8
2,4
3,7
EOD

set print $Data3
    do for [i=1:|$Data1|] {
        print sprintf("%s,%s", $Data1[i][1:strlen($Data1[i])-1], $Data2[i][1:strlen($Data2[i])-1])
    }
set print

print $Data3
### end of code

Result:

A,B,A,C
1,3,1,8
2,6,2,4
3,5,3,7

Addition: (after OP giving more details)

OK, your task is actually a bit more complex. Of course, you could use externals software to rearrange the data, but you can also do something just with gnuplot.

What the code below basically does: you select column numbers myCol1 and myCol2 which will be written from $Data1 and $Data2, respectively, into a new datablock $Data3. The different date/time formats will be changed to a common format myTimeFmt3.

Some comments:

  1. the time format in datafile1 is probably the most stupid time format ever. Although in gnuplot there is the time specifier %p for AM and PM, but I haven't succeeded to convert the 12 hour format into 24 hour format. So, I defined my own formula. To illustrate that it works, I changed some of your original times to 12 am and 12 pm.

  2. datafile1 also contains the header in a line starting with #, which is the standard character for comment lines (which hence will be ignored). Change it to any character, e.g. @ which does not occur in your datafiles.

  3. plot the lines containing the headers into a dummy table and at the same time assign the desired header values to Header1 and Header2.

  4. since whitespace is assumed for datafile1, 0 10/26/2019 6:20:07 AM 99.020 are actually 5 columns. That's why strcol(myCol1+1) is used later in the plot command.

For your needs, skip the set $Data... EOD parts and exchange $Data1, $Data2, $Data3 with your filenames. Tested with gnuplot 5.2.6.

Code:

### merge columns from two files having different date/time formats
reset session

$Data1 <<EOD
Vacuum log  10/26/2019  6:20:07 PM
#   Date    Time    CCGp    IGPa    IGPcl   IGPco   IGPf    PIRbf   PIRll   PIRm    PIRpv   PPcl    HT
0   10/26/2019  6:20:07 AM  99.020  261.070 16.300  10.709  860.746 8.344   116.559e3   253.848 7.926   628.832e-3  0
1   10/26/2019 12:20:08 AM  101.305 261.613 16.908  10.012  862.001 8.399   116.535e3   256.132 7.726   677.712e-3  12
2   10/26/2019 12:20:09 PM  102.874 262.205 17.010  9.520   863.380 8.492   116.510e3   258.740 7.308   755.026e-3  102
EOD

$Data2 <<EOD
log     10/26/2019 8:47:17 PM
PC name     52334810

Date, Time, V-Acc, I-Acc[uA], V-Sup, V-Ext, I-Ext[uA], V-Fil, I-Fil[A], V-L1A, I-L1A[uA], V-L2, I-L2[uA], CIVi, IGPPressure[Pa], S.F.Internal, IGP4[Pa], PIVi
26/10/19, 18:20:06.484, 1.60, 0.000000, -0.12, 1.83, 0.054932, 2.033, 0.000381, 0.61, 52.000485, 0.00, 0.001373, 12, 0.0000407, Disabled???, 6
26/10/19, 18:20:07.468, 1.77, 2.003000, -3.62, 1.83, 1.623844, 2.629, 0.000382, 0.71, 51.638814, 0.00, 0.641373, 9, 0.0000407, Disabled???, 6
26/10/19, 18:07:07.878, 1.84, 12.000800, -5.17, 1.83, 3.051953, 2.385, 0.000381, 0.67, 49.071190, 0.00, 1.903325, 11, 0.0000407, Disabled???, 5
EOD


myTimeFmt1a = "%m/%d/%Y"               # 10/26/2019
myTimeFmt1b = "%H:%M:%S"               # 6:20:07   (12h am/pm)
myTimeFmt2  = "%d/%m/%y, %H:%M:%S"     # 26/10/19, 18:20:06.484
myTimeFmt3  = "%d.%m.%Y %H:%M:%S"      # 31.12.2019 23:59:59

# change 12h am/pm format to 24h format
myTime12to24(t,p) = t+12*3600*(floor(t/3600)<12 && p eq "PM" ? 1 : floor(t/3600)==12 && p eq "AM"  ? -1 : 0)

myCol1 = 4
myCol2 = 3
myColDate1 = 2
myColTime1 = 3
myColDate2 = 1
myColTime2 = 2

# extract Headers
set datafile commentschar "@"    # whatever character which does not occur in file
set table $Dummy
    set datafile separator whitespace
    plot $Data1 u (Header1=strcol(myCol1)) skip 1 every ::0::0 w table
    set datafile separator comma
    plot $Data2 u (Header2=strcol(myCol2)) skip 3 every ::0::0 w table
unset table

# write data to new table 
set table $Data3
    plot '+' u ('Date Time, '.Header1.', '.Header2) every ::0::0 w table

    set datafile separator whitespace
    plot $Data1 u (strftime(myTimeFmt3,(timecolumn(myColDate1,myTimeFmt1a)) + myTime12to24(timecolumn(myColTime1,myTimeFmt1b),strcol(myColTime1+1))).", ".strcol(myCol1+1).", NaN") skip 2 w table

    set datafile separator comma
    plot $Data2 u (strftime(myTimeFmt3,timecolumn(1,myTimeFmt2)).", NaN, ".strcol(myCol2)) skip 4 w table
unset table
print $Data3
### end of code

Result:

 Date Time, CCGp, V-Acc 
 26.10.2019 06:20:07, 99.020, NaN       
 26.10.2019 00:20:08, 101.305, NaN      
 26.10.2019 12:20:09, 102.874, NaN      
 26.10.2019 18:20:06, NaN, 1.60 
 26.10.2019 18:20:07, NaN, 1.77 
 26.10.2019 18:07:07, NaN, 1.84 
theozh
  • 22,244
  • 5
  • 28
  • 72
  • I'm testing your script for the different types of logfiles that I want to combine, and it works amazingly well with only small changes to be made. Thanks a lot ! One little question: In Datablock 2 the millisec precision is lost, even when I use the time specifier %H:%M:%.3S so: 18:20:07.001 18:20:07.010 18:20:07.100 all end up in: 18:20:07 Is there a solution to this (minor) problem ? – Tim Jan 03 '20 at 06:07
  • The column separator for Datafile1 is TAB. – Tim Jan 03 '20 at 06:15
  • you need to set the _output_ format accordingly, here: `myTimeFmt3 = "%d.%m.%Y %H:%M:%.3S"` and then it should be fine. – theozh Jan 05 '20 at 12:37