Here is a screenshot of my error.

The Trans_Imp_Date values are failing to convert to pandas datetime format (yyyy-mm-dd). However, the conversion works fine for the other two columns.

I want it so that if the conversion to datetime fails for a specific row, pandas skips that row and outputs the input value unconverted (as per the errors = 'ignore' flag), and then continues converting the rest of the rows. How can I make that happen?

shadowtalker
Falc
  • As far as I know, you can't get pandas to ignore bad cells on failure. You need to check for them and remove or replace them. – shadowtalker Oct 03 '18 at 12:46
  • OK, thanks, I'll try to work around it then. I thought errors = ignore was supposed to ignore bad cells: errors : {‘ignore’, ‘raise’, ‘coerce’}, default ‘raise’. If ‘raise’, then invalid parsing will raise an exception. If ‘coerce’, then invalid parsing will be set as NaT. If ‘ignore’, then invalid parsing will return the input. – Falc Oct 03 '18 at 12:58

1 Answer

Use `copy()`.

If you later modify values in df_p with to_datetime without having taken a copy, you will find that the modifications do not propagate back to the original data (p_junjul_trans_orig), and pandas emits a warning (SettingWithCopyWarning).

cols = ['Imp_Trans_Date','Trans_Imp_Date','Imposition_Date_of_Hearing']
df_p = p_junjul_trans_orig[cols].copy()
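
As a minimal sketch of the copy-then-convert pattern: the DataFrame below is a hypothetical stand-in for p_junjul_trans_orig (the real data is not shown), and the day-first format string is an assumption that should be adjusted to the actual data.

```python
import pandas as pd

# Hypothetical stand-in for p_junjul_trans_orig; the real data is not shown.
p_junjul_trans_orig = pd.DataFrame({
    "Imp_Trans_Date": ["25/06/2018", "20/06/2018"],
    "Trans_Imp_Date": ["25/06/2018", "20/06/2018"],
    "Imposition_Date_of_Hearing": ["26/06/2018", "21/06/2018"],
    "Other": [1, 2],
})

cols = ["Imp_Trans_Date", "Trans_Imp_Date", "Imposition_Date_of_Hearing"]
df_p = p_junjul_trans_orig[cols].copy()  # independent copy of the selected columns

for col in cols:
    # Assumed day-first format (dd/mm/yyyy); adjust to match the real data.
    df_p[col] = pd.to_datetime(df_p[col], format="%d/%m/%Y")
```

Because df_p is an explicit copy, the assignments in the loop change only df_p and leave p_junjul_trans_orig as it was.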

EDIT:

There seems to be a problem with the data, e.g. trailing whitespace, which you can check with:

print (df_p['Trans_Imp_Date'].head().tolist())
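
For the row-by-row fallback asked about in the question, one possible workaround is to parse with `errors='coerce'` and then put the original value back wherever parsing failed. The sketch below is not a built-in pandas feature: the helper name `to_datetime_keep_failures`, the sample values, and the day-first format are all assumptions for illustration, and the column is assumed to hold strings. As far as I know, `errors='ignore'` hands back the whole input unchanged when any value fails, rather than skipping individual rows.

```python
import pandas as pd

def to_datetime_keep_failures(series, fmt="%d/%m/%Y"):
    """Hypothetical helper: parse what parses, keep the raw value elsewhere.

    Returns (result, failed): `result` is an object-dtype Series mixing
    Timestamps and untouched original values; `failed` is a boolean mask
    of the cells that could not be parsed.
    """
    cleaned = series.str.strip()  # drop leading/trailing whitespace first
    parsed = pd.to_datetime(cleaned, format=fmt, errors="coerce")
    # A cell failed if it had a value but parsing produced NaT.
    failed = parsed.isna() & series.notna()
    # Keep parsed Timestamps where parsing worked, original input otherwise.
    result = parsed.astype(object).where(~failed, series)
    return result, failed

# Made-up sample data: two valid dates, one impossible date, one non-date.
raw = pd.Series(["25/06/2018", " 20/06/2018 ", "31/02/2018", "unknown"])
converted, failed = to_datetime_keep_failures(raw)
print(raw[failed].tolist())  # the cells that need manual attention
```

The result column has object dtype because it mixes Timestamps and strings, so the `.dt` accessor will not work on it. Keeping the `failed` mask alongside a fully coerced column may be the more practical choice.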
jezrael
  • At this stage I'm not trying to modify the original data (p_junjul_trans_orig), I would like to keep that as it is. I want to modify the column 'Trans_Imp_Date' in 'df_p'. – Falc Oct 03 '18 at 12:24
  • @Falc - Hmm, so it fails for only one column? And if you comment that line out, there is no error? There could be a problem with [chained assignments](https://stackoverflow.com/a/20644369/2901002), but I would expect that to raise a warning for each `to_datetime` call. – jezrael Oct 03 '18 at 12:32
  • Yep, it's failing for that one column; all the output I have is included in the screenshot. – Falc Oct 03 '18 at 12:45
  • @Falc - I understand now, check the edited answer. By the way, why not use the `errors='coerce'` parameter to convert unparseable values to `NaT`? – jezrael Oct 03 '18 at 12:49
  • print (df_p['Trans_Imp_Date'].head().tolist()) gives the output: ['25/06/2018', '25/06/2018', '20/06/2018', '20/06/2018', '20/06/2018']. I don't want to use coerce because then I lose the value that was there before. – Falc Oct 03 '18 at 12:53
  • @Falc - That looks fine. Is the data confidential? If not, could you share the output of `df_p.to_csv(file)`, e.g. via gdocs, dropbox, or similar? Or send it to the email address in my profile? – jezrael Oct 03 '18 at 12:55
  • Sorry, it's confidential. I think coerce is a good option though; I can use it to pick up the cells that are broken. Thanks for your help. – Falc Oct 03 '18 at 13:03