
I have multiple tables to process. Each table has one date column, but the date format varies between tables, e.g. 5/26, 05/26, 05/26/2020, 05262020, 5262020. I have been using:

  df[date] = df[date].apply(dateutil.parser.parse, dayfirst=dayfirst,
                                                         yearfirst=yearfirst)

This used to work fine, but recently the date column in some tables contains strings such as "unknown" or "missing". Parsing those values raises an error and breaks the process:

 "ValueError: Unknown string format"

How can I exclude the rows that raise this ValueError?

Thanks.

Zesty Dragon
newleaf

1 Answer


I figured out a way to handle this: first use a regular expression to drop the rows whose date value contains no digits, then apply the parser to what is left.

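  # Keep only rows whose date value contains at least one digit;
  # na=False treats missing values as non-matching so they are dropped too.
  df = df[df[date].str.contains(r'\d+', na=False)]
  df[date] = df[date].apply(dateutil.parser.parse, dayfirst=dayfirst,
                                                   yearfirst=yearfirst)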
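
The digit filter still lets through strings that contain digits but are not dates. A possible alternative, shown only as a rough sketch, is to catch the ValueError per value and drop the rows that failed. The helper name `parse_or_nat` and the sample data are made up for illustration, and the column is assumed to be called "date":

```python
import pandas as pd
from dateutil import parser


def parse_or_nat(value, dayfirst=False, yearfirst=False):
    """Hypothetical helper: parsed datetime, or pd.NaT if dateutil rejects the value."""
    try:
        return parser.parse(str(value), dayfirst=dayfirst, yearfirst=yearfirst)
    except (ValueError, OverflowError):
        # "unknown", "missing" and similar strings end up here.
        return pd.NaT


# Made-up sample data standing in for one of the tables.
df = pd.DataFrame({"date": ["05/26/2020", "unknown", "5/26/2020", "missing"]})

# Parse every value; unparseable ones become NaT instead of raising.
df["date"] = df["date"].apply(parse_or_nat)

# Exclude the rows that could not be parsed.
df = df.dropna(subset=["date"]).reset_index(drop=True)
```

This way no pattern for "bad" values has to be maintained, since the parser itself decides which rows are kept.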
newleaf