I have the following question.

I have Excel files as follows:

[image: screenshot of the Excel sheet]

When I read the file using `df = pd.read_excel(file, dtype=str)`, the first row turns into `2003-02-14 00:00:00`, while the rest are displayed as they are.

How do I prevent `pd.read_excel()` from converting that value into a datetime or anything else?

Thanks!

  • The `pd.read_excel()` function's `parse_dates` argument defaults to `False`. It's likely that Excel itself is interpreting that value as a datetime (as Excel loves to do). If you can, add `.0000` to the end of that cell so that Excel stops parsing it as a datetime. – ddejohn Nov 29 '22 at 05:14
  • Similar questions have already been asked here a few times. Check out this post, which suggests specifying converters explicitly as an ultimate solution: https://stackoverflow.com/a/32591786/1328439 – Dima Chubarov Nov 29 '22 at 05:18
  • Thanks @ddejohn. However, I am not able to modify the existing Excel data. Is there any workaround? – InetVMart Indonesia Nov 29 '22 at 09:43
  • Thanks @DmitriChubarov. I tried the solution and it still doesn't work. – InetVMart Indonesia Nov 29 '22 at 09:44

1 Answer

As @ddejohn correctly said in the comments, the behavior you are seeing comes from Excel, which automatically converted the value to a date. pandas therefore receives that cell as a date. Since you cannot modify the input Excel file, you have to convert the value back to the format you expect after reading it.
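To illustrate why the value shows up as `2003-02-14 00:00:00`, here is a minimal sketch using only the standard library. It assumes the cell reaches pandas as a `datetime` object, which is what typically happens when Excel has stored the cell as a date. The variable names are made up for illustration.

```python
from datetime import datetime

# Assumption: Excel stored the cell as a date, so the reader hands pandas
# a datetime object instead of the text that was originally typed in.
cell_value = datetime(2003, 2, 14)

# Converting that object to a string gives the ISO-like form with a time part,
# which is the value that ends up in the dataframe.
as_text = str(cell_value)
print(as_text)  # 2003-02-14 00:00:00
```

The original text (for example `14.02.03`) is no longer available at that point, which is why it has to be rebuilt from the date.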

Here is a short script to make it work as you expect:

import pandas as pd

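def rev(x: str) -> str:
    '''
    Converts '2003-02-14 00:00:00' back to '14.02.03'.
    Values that do not contain a midnight time are returned unchanged.
    '''
    hours = '00:00:00'
    # Skip missing values (NaN) and anything Excel did not turn into a date.
    if not isinstance(x, str) or hours not in x:
        return x
    date_part = x.split()[0]  # '2003-02-14'
    year, month, day = date_part.split('-')
    # Rebuild as day.month.year with a two-digit year.
    return '.'.join([day, month, year[-2:]])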

file = r'C:\your\folder\path\Classeur1.xlsx'
df = pd.read_excel(file, dtype=str)

df['column'] = df['column'].apply(rev)

Replace `'column'` in `df['column']` with your actual column name. You then get the desired format in your dataframe.
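As a possible variant, the same conversion can be sketched with `datetime.strptime` instead of string splitting, so that only values matching the full timestamp pattern are touched. This is an illustration rather than part of the original script: the function name `restore_dotted` and the sample values are hypothetical, and it assumes the original cells used a day.month.year layout with a two-digit year.

```python
from datetime import datetime

def restore_dotted(value):
    """Turn '2003-02-14 00:00:00' back into '14.02.03'; leave anything else alone."""
    if not isinstance(value, str):
        return value  # e.g. NaN for empty cells
    try:
        parsed = datetime.strptime(value, '%Y-%m-%d %H:%M:%S')
    except ValueError:
        return value  # not a timestamp string, keep as-is
    # Assumption: the original text was day.month.year with a two-digit year.
    return parsed.strftime('%d.%m.%y')

# Hypothetical sample values, as they might appear after reading with dtype=str.
samples = ['2003-02-14 00:00:00', '15.03.04', 'abc']
restored = [restore_dotted(v) for v in samples]
print(restored)  # ['14.02.03', '15.03.04', 'abc']
```

It could be applied to a column with `df['column'] = df['column'].map(restore_dotted)`. Unlike `rev`, this variant also converts timestamps whose time part is not midnight, which may or may not be what you want.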