0

I have three csv files.

One with tittles:

names1=['Date','Conc','Flow','SZ','SB','RZ','RB','Fraction','Attenuation','Conc_less_-200_flag','Conc_greater_500_flag','Wind Speed','Wind Direction','Wind_direction_Flag','Wind_Speed_Less_than_4','Middle','Wind_Speed_Greater_than_10','Multiple conditions']

First rows:

Date,Conc,Flow,SZ,SB,RZ,RB,Fraction,Attenuation,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions
2004-02-27 00:00:00,,,,,,,,,,,6.524999999999999,177.75,0.0,0.0,1.0,0.0,0.0
2004-02-27 01:00:00,,,,,,,,,,,6.991666666666667,197.83333333333334,0.0,0.0,1.0,0.0,0.0

The other two:

names2=['Date','Chanel0','Chanel1','Chanel2','Chanel3','Chanel4','Chanel5','Chanel6','Chanel7','Conc_less_-200_flag','Conc_greater_500_flag','Wind Speed','Wind Direction','Wind_direction_Flag','Wind_Speed_Less_than_4','Middle','Wind_Speed_Greater_than_10','Multiple conditions']

First rows:

Date,Chanel0,Chanel1,Chanel2,Chanel3,Chanel4,Chanel5,Chanel6,Chanel7,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions
2012-01-23 08:00:00,-2402.3575757575754,-2418.8121212121237,-2423.983863636366,-2422.913745454546,-2423.983863636366,-2422.814151515151,-2423.242424242424,-2422.4842121212123,1.0,1.0,,,,,,,
2012-01-23 09:00:00,6.5666666666666655,6.8849999999999945,0.02130000000000001,1.4343266666666665,0.02130000000000001,1.5671516666666663,1.0,2.085166666666667,1.0,1.0,,,,,,,

I want the output to be a csv file with header:

['Date','Conc','Flow','SZ','SB','RZ','RB','Fraction','Attenuation','Chanel0','Chanel1','Chanel2','Chanel3','Chanel4','Chanel5','Chanel6','Chanel7','Conc_greater_500_flag','Wind Speed','Wind Direction','Wind_direction_Flag','Wind_Speed_Less_than_4','Middle','Wind_Speed_Greater_than_10','Multiple conditions']

And so obviously: the contributing part from the second two files will have blanks (or better Nan's/0's for the flow, sz columns etc. And the first file will have them at the channel0-7 columns

Note the Date is the index col.

I tried df_merged=pd.concat(df1,df2,df3) but that seems to overlap the headers.

Also tried:

df_merged=pd.concat([df1,df2,df3],axis=1)

But that turns the cs output: into

,Conc,Flow,SZ,SB,RZ,RB,Fraction,Attenuation,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions,Chanel0,Chanel1,Chanel2,Chanel3,Chanel4,Chanel5,Chanel6,Chanel7,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions,Chanel0,Chanel1,Chanel2,Chanel3,Chanel4,Chanel5,Chanel6,Chanel7,Conc_less_-200_flag,Conc_greater_500_flag,Wind Speed,Wind Direction,Wind_direction_Flag,Wind_Speed_Less_than_4,Middle,Wind_Speed_Greater_than_10,Multiple conditions
2004-02-27 00:00:00,,,,,,,,,,,6.524999999999999,177.75,0.0,0.0,1.0,0.0,0.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,

Which is close, but it has the extra columns on the end And I don't think overlaps the common colomns

Rich
  • 3,781
  • 5
  • 34
  • 56
SLE
  • 65
  • 1
  • 17
  • Try `df = pd.merge(df1,df2,df3); df.to_csv('New.csv')` – Serenity Jul 01 '16 at 10:43
  • df = pd.merge(df1,df2,df3) gives: ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all(). – SLE Jul 01 '16 at 10:44

1 Answers1

0

This is idea. First read two files into lists then set and union two list

You can checkout this answer How to get the union of two lists using list comprehension?

Community
  • 1
  • 1
Mithat Konuk
  • 447
  • 5
  • 19
  • I'm using this to load them in. df1 = pd.read_csv('Combined04.csv', index_col=0, names=names1, parse_dates=True) What do you mean by turn the file into a list? That answer works great for the headers... but when tried it is lost all the data – SLE Jul 01 '16 at 10:51