
I have a dataframe with some float columns and two date columns. Sample:

 I    J    P          V  
1.2   11  2011-12-03  2015-08-07 19:14:07
1.0   18  2011-12-03  2015-08-07 19:14:07
1.8   21  2011-12-03  2015-08-07 19:14:07

Now I want columns I and J formatted as .4f (four decimal places), so 1.2 should appear as 1.2000 and, in column J, 18 should appear as 18.0000. The date fields should be formatted as YYYY-MM-DD for P and YYYY-MM-DD HH:MM:SS for V.

It looks fine in the dataframe. I formatted the I and J columns with:

df['I'] = ['%.4f'%x for x in df['I']]
df['J'] = ['%.4f'%x for x in df['J']]
 ## 1.2000, 11.0000  # fine

Then I tried to write a CSV file, but there 1.2000 again appeared as 1.2 and 11.0000 as 11, and the dates became 03/12/2011 and 07/08/2015 19:14:07 respectively for the first row of the dataframe.

This does not fulfil the requirement (four decimal places and dates as YYYY-MM-DD).

I tried doing

df['P'] = [x.strftime('%Y-%m-%d') for x in df['P']]
df['V'] = [x.strftime('%Y-%m-%d %H:%M:%S') for x in df['V']]
# Then, following the docs and Google, I tried:
df.to_csv('D:/New.csv',date_format='%Y-%m-%d %H:%M:%S',index=False)  # did not work
df.to_csv('D:/changed_2.csv',float_format='string',date_format='%Y-%m-%d %H:%M:%S',index=False)  # also did not work

Please suggest how I can produce a CSV file that meets this requirement.

Satya

1 Answer


I think you can first convert column J to float with astype, then convert column P to string with strftime, and finally pass the float_format parameter to to_csv:

print(df)
     I   J          P                   V
0  1.2  11 2011-12-03 2015-08-07 19:14:07
1  1.0  18 2011-12-03 2015-08-07 19:14:07
2  1.8  21 2011-12-03 2015-08-07 19:14:07

print(df.dtypes)
I           float64
J             int64
P    datetime64[ns]
V    datetime64[ns]
dtype: object


df['J'] = df['J'].astype(float)
df['P'] = df['P'].dt.strftime('%Y-%m-%d')
print(df)
     I     J           P                   V
0  1.2  11.0  2011-12-03 2015-08-07 19:14:07
1  1.0  18.0  2011-12-03 2015-08-07 19:14:07
2  1.8  21.0  2011-12-03 2015-08-07 19:14:07

print(df.to_csv(index=False,float_format='%.4f'))
I,J,P,V
1.2000,11.0000,2011-12-03,2015-08-07 19:14:07
1.0000,18.0000,2011-12-03,2015-08-07 19:14:07
1.8000,21.0000,2011-12-03,2015-08-07 19:14:07
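
For readers on Python 3, the same steps can be sketched as one self-contained script. The sample frame is rebuilt by hand from the rows in the question, and `csv_text` is just an illustrative variable name. Passing `date_format` for column V is an addition here, not part of the steps above; it only makes the timestamp layout explicit.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption about the real frame).
df = pd.DataFrame({
    "I": [1.2, 1.0, 1.8],
    "J": [11, 18, 21],
    "P": pd.to_datetime(["2011-12-03"] * 3),
    "V": pd.to_datetime(["2015-08-07 19:14:07"] * 3),
})

# float_format only affects float columns, so the integer column J is cast first.
df["J"] = df["J"].astype(float)

# P becomes a plain string column, so it is written exactly as formatted here.
df["P"] = df["P"].dt.strftime("%Y-%m-%d")

# With no path given, to_csv returns the CSV text instead of writing a file.
# date_format applies to the remaining datetime column V.
csv_text = df.to_csv(
    index=False,
    float_format="%.4f",
    date_format="%Y-%m-%d %H:%M:%S",
)
print(csv_text)
```

Passing a file path as the first argument to `to_csv` writes the same text to disk.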
jezrael
  • @Jezrael - I have already tried that. In code (inside the Python terminal) it looks fine, but when I open the CSV I can't see this format; instead I get 1.2,11,03/12/2011,07/08/2015 19:14:07. – Satya Apr 19 '16 at 10:42
  • Interesting. For me the output in `csv` is the same as in the console. – jezrael Apr 19 '16 at 10:44
  • My concern is that if I have made I and J strings, and P and V strings, the CSV file should reflect them exactly as they are in the df, but that is not happening. – Satya Apr 19 '16 at 10:44
  • Yes, I think it is a good approach. If you try filtering only the first 10 rows, is the result the same? `df.head(10).to_csv('file.csv',index=False,float_format='%.4f')` ? – jezrael Apr 19 '16 at 10:45
  • Don't you suspect there might be some configuration in the CSV opener tool that is causing this? Just asking; I am also not sure. Hoping Google can answer it. – Satya Apr 19 '16 at 10:52
  • How did it go? Was the problem found? I am curious. Thank you. Btw, can you accept my answers ([link1](http://stackoverflow.com/a/36692970/2901002), [link2](http://stackoverflow.com/a/36740794/2901002), this answer), if they were helpful? Thank you. – jezrael Apr 20 '16 at 10:59
  • @jezrael - Yes, the problem was in my CSV opener tool (WS OFFICE); in Notepad++ it was formatted correctly. Thanks mate. – Satya Apr 20 '16 at 11:11
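
As the comments conclude, the file itself was correct and the spreadsheet tool was reformatting the values on display. One way to check this without a text editor is to read the written file back as raw text. This is a minimal sketch: the small frame, the temporary directory, and the file name `check.csv` are all illustrative assumptions.

```python
import os
import tempfile

import pandas as pd

# Small illustrative frame; P is already a string column, as in the answer.
df = pd.DataFrame({
    "I": [1.2, 1.0],
    "J": [11.0, 18.0],
    "P": ["2011-12-03", "2011-12-03"],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "check.csv")  # hypothetical file name
    df.to_csv(path, index=False, float_format="%.4f")

    # Read the bytes back as plain text, bypassing any spreadsheet viewer.
    with open(path, newline="") as handle:
        raw_lines = handle.read().splitlines()

for line in raw_lines:
    print(line)
```

If the raw lines show `1.2000` and `2011-12-03`, the formatting reached the file, and any difference seen in a spreadsheet application comes from how that application displays numbers and dates.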