
Hi, I have three CSV files like the ones shown below.

datetime, forecast
2016-02-02 00:00:00, 23.34
2016-02-02 00:10:00, 29.23

timestamp, forecast, v1, v2
2016-02-02 00:00:00, 68.56, 012, .23
2016-02-02 00:10:00, 23.24, .25, .32

timestamp, forecast[ma], v1
2016-02-02 00:00:00, 56.32, 32
2016-02-02 00:10:00, 25.21, 56

I want my output to look like this:

Time, Forecast, forecast1, forecast2
2016-02-02 00:00:00, 23.34, 68.56, 56.32
2016-02-02 00:10:00, 29.23, 23.24, 25.21

I have written code to combine these files into an xlsx file with Python. Now that I am planning to process these files further in the shell, I want the combined output to be a CSV file.

I tried commands like this:

join -j 2 -o 1.1,1.2,2.2 <(sort -k2 $path_DMS/$file_name) <(sort -k2 $path_ISRO/$file_name)

thanks

bala

1 Answer


Could you please try the following (this should work in most awk implementations).

awk '
BEGIN{
  FS=OFS=", "
  print "Time, Forecast, forecast1, forecast2"
}
FNR==1{
  ++count
  next
}
count==1{
  a[$1]=$2
  next
}
count==2{
  a[$1]=a[$1] OFS $2
  next
}
count==3{
  print $1,a[$1],$2
}'  file1.csv file2.csv file3.csv

Output will be as follows.

Time, Forecast, forecast1, forecast2
2016-02-02 00:00:00, 23.34, 68.56, 56.32
2016-02-02 00:10:00, 29.23, 23.24, 25.21

Explanation: a detailed, line-by-line explanation of the above code.

awk '                                                ##Start the awk program.
BEGIN{                                               ##The BEGIN section runs before any input file is read.
  FS=OFS=", "                                        ##Set both the input (FS) and output (OFS) field separators to ", "; see man awk.
  print "Time, Forecast, forecast1, forecast2"       ##Print the header line for the output.
}                                                    ##Close the BEGIN section.
FNR==1{                                              ##True on the first (header) line of each input file.
  ++count                                            ##Increment count, which tracks which file is being read.
  next                                               ##Skip the remaining rules for this line.
}                                                    ##Close the FNR==1 block.
count==1{                                            ##For data lines of the first file (file1.csv):
  a[$1]=$2                                           ##Store $2 in array a, indexed by the timestamp in $1.
  next                                               ##Skip the remaining rules for this line.
}                                                    ##Close the count==1 block.
count==2{                                            ##For data lines of the second file (file2.csv):
  a[$1]=a[$1] OFS $2                                 ##Append OFS and $2 to the value already stored from file1.csv.
  next                                               ##Skip the remaining rules for this line.
}                                                    ##Close the count==2 block.
count==3{                                            ##For data lines of the third file (file3.csv):
  print $1,a[$1],$2                                  ##Print the timestamp, the values stored from file1.csv and file2.csv, and $2 of the current line.
}'  file1.csv file2.csv file3.csv                    ##The input files, in the order they must be read.
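
If the number of files can vary, or a timestamp might be missing from one of them, the same idea can be generalized. The following is only a sketch and has not been run against your real data. The function name merge_forecasts and the NA placeholder are my own choices, and it assumes every file has exactly one header line and that the timestamp and forecast are always the first two fields.

```shell
# Hypothetical helper (name is illustrative): merge the second column of any
# number of CSV files, keyed on the timestamp in the first column.
merge_forecasts() {
  awk '
  BEGIN { FS = ", *"; OFS = ", " }     # split on a comma plus optional spaces
  FNR == 1 { ++count; next }           # new file: bump the counter, skip the header
  {
    if (!($1 in seen)) {               # remember timestamps in first-seen order
      seen[$1] = 1
      order[++n] = $1
    }
    val[$1, count] = $2                # forecast of file number "count" for this timestamp
  }
  END {
    printf "Time, Forecast"            # header: Forecast, forecast1, forecast2, ...
    for (f = 2; f <= count; f++) printf "%sforecast%d", OFS, f - 1
    print ""
    for (i = 1; i <= n; i++) {
      line = order[i]
      for (f = 1; f <= count; f++) {
        key = order[i] SUBSEP f
        line = line OFS (key in val ? val[key] : "NA")   # NA marks a missing value (assumption)
      }
      print line
    }
  }' "$@"
}
```

It would be called as merge_forecasts file1.csv file2.csv file3.csv > combined.csv. Unlike the three-block version above, rows are printed in the END block, so a timestamp that is absent from the last file still appears in the output.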
RavinderSingh13