1

I have a problem that has been discussed in different ways on other stackoverflow pages but I can't find any solution that can fix my strange problem. I have an excel file (.xlsx) that I read and store inside a dataframe. One column is called "Time" and has data in the following format "2018/10/13 14:29:00". In my python script, I need to convert this column from object type (that's the type I can see when querying python) to datetime and save this new format in a new column called "date_time_fmt".

This is literally my script where I tried all the possible solutions I found in stackoverflow:

try:
    df_accounts_followed['date_time_fmt'] = df_accounts_followed['time'].astype('datetime64[ns]')
except Exception as e:
    print("Error 1")
    exception_short_name = type(e).__name__
    print("Error Type:            {}".format(   exception_short_name))
    print("Error description:     {}".format(   e))            

    try:
        df_accounts_followed['time'] = df_accounts_followed['time'].astype('|S')
        name_dtype = df_accounts_followed['time'].dtype.name
        print("df_accounts_followed['time']   is now of type:  {}".format(name_dtype))
        df_accounts_followed['date_time_fmt'] = df_accounts_followed['time'].astype('datetime64[s]')
    except Exception as e:
        try:
            print("Error 2")
            exception_short_name = type(e).__name__
            print("Error Type:            {}".format(   exception_short_name))
            print("Error description:     {}".format(   e))   

            df_accounts_followed['date_time_fmt'] =   pd.to_datetime(df_accounts_followed['time'], format = '%Y/%m/%d %H:%M:%S')
        except Exception as e:                    
            print("Error 3")
            exception_short_name = type(e).__name__
            print("Error Type:            {}".format(   exception_short_name))
            print("Error description:     {}".format(   e))

            try:
                df_accounts_followed['date_time_fmt'] =  df_accounts_followed['time'].apply(lambda x: datetime.strptime(x,'%Y/%m/%d %H:%M:%S'))

            except Exception as e:                    
                print("Error 4")
                print("Error line num:      {}".format(sys.exc_info()[-1].tb_lineno))
                exception_short_name = type(e).__name__
                print("Error Type:            {}".format(   exception_short_name))
                print("Error description:     {}".format(   e))

                input("Enough !")

Now, when I run the script, this is what happens. The first try gives this error:

Error when filtering   df_excel_actions_report    dataframe.
Error Type:            ValueError
Error description:     Error parsing datetime string "2018/10/13 14:29:00" at position 4

This is the error I get after the second try:

Error Type:            ValueError
Error description:     Error parsing datetime string "2018/10/13 14:29:00" at position 4

This is the error I get after the third try:

Error Type:            TypeError
Error description:     <class 'bytes'> is not convertible to datetime

This is the error I get with the last try:

Error Type:            TypeError
Error description:     strptime() argument 1 must be str, not bytes

At this point my level of desperation is quite high. I hope somebody could help me. These are some of the pages I read: time.strptime() - argument 0 must be str, not bytes , Convert Pandas Column to DateTime II , pandas convert string columns to datetime, allowing missing but not invalid . Thank you

Angelo
  • 1,594
  • 5
  • 17
  • 50

1 Answers1

0

Not sure if this will help you, but I recently stumbled upon the <class 'bytes'> is not convertible to datetime error and this is how I resolved it.

My dataframe looked like this, notice that the index values are byte-strings:

Times   
b'2018-07-01_00:00:00'  0.006036
b'2018-07-01_01:00:00'  0.000000
b'2018-07-01_02:00:00'  0.000000
b'2018-07-01_03:00:00'  0.000000

The solution was to decode the Times index and then convert them to datetimes using a custom format.

pandas.to_datetime(df.index.astype('str'),
                   format="%Y-%m-%d_%H:%M:%S")

In your code above (try 3), this would look something like:

df_accounts_followed['date_time_fmt'] = pd.to_datetime(df_accounts_followed['time'].astype('str'),
                                                       format = '%Y/%m/%d %H:%M:%S')
Castrona
  • 493
  • 1
  • 3
  • 17