0

I am loading the table into pandas as follows:

import pandas as pd
df=pd.read_excel(file.xlsx,dtype=object)
A   B         C        D
1   2.0   1-02-1997   sam
Nan 2.3   3-08-1997   ram
2   nan   2-03-1997   pam

then I am replacing NaN Value

df=df.fillna(method='ffill')

then I am converting columns to int and float using

for c in df.columns:
   df[c]=pd.to_numeric(df[c],error='ignore')

but this will convert datetime columns also into the int/float. Is there any way to restrict datetime conversion to int/float?

k33da_the_bug
  • 812
  • 8
  • 16
Naruto
  • 139
  • 1
  • 10

3 Answers3

0

If the datetime columns are already of type datetime, you could only try to convert object columns.

Edit: first convert columns to datetime

# first convert objects to datetime if possible
df = df.apply(pd.to_datetime, errors='ignore')

# then convert remaining objects to numeric
columns = df.columns[df.dtypes=='object']
df[columns] = df[columns].apply(pd.to_numeric, errors='ignore')
Jason Cook
  • 1,236
  • 9
  • 12
0

You can use astype() for setting types. For specific columns (if you have dealt with Null values already and have less no of columns present) try

df['col'] = df['col'].astype('int')

or

df['col'] = df['col'].astype(int)
k33da_the_bug
  • 812
  • 8
  • 16
0

first here is a sample of your dataframe :

df = pd.read_csv(StringIO('''A   B         C        D
1   2.0   1-02-1997   sam
Nan 2.3   3-08-1997   ram
2   nan   2-03-1997   pam'''),sep=r'\s+')
df=df.fillna(method='ffill')
df
>>> A   B   C   D
0   1   2.0 1-02-1997   sam
1   Nan 2.3 3-08-1997   ram
2   2   2.3 2-03-1997   pam

you can just indicate the error with a try .. except statment:

def to_numeric(col):
    try:
        return pd.to_numeric(col,error='ignore')
    except:
        return col
    
for c in df.columns:
   df[c]= to_numeric(df[c])

list(map(str,df.dtypes))
>>>['object', 'float64', 'object', 'object']

note that column A cant turn into int and it stays 'object' so my suggestion is to turn evetithing to float if fails...

my implementation:

def to_numeric(col):
    try:
        return pd.to_numeric(col,error='ignore')
    except:
        try:
            return col.astype(float)
        except:
            return col
    
for c in df.columns:
   df[c]= to_numeric(df[c])

list(map(str,df.dtypes))
>>>['float64', 'float64', 'object', 'object']
df = df.ffill()
>>> A   B   C   D
0   1.0 2.0 1-02-1997   sam
1   1.0 2.3 3-08-1997   ram
2   2.0 2.3 2-03-1997   pam
 
adir abargil
  • 5,495
  • 3
  • 19
  • 29
  • you are not converting columns D into Datetime and also converting A into float instead of int. – Naruto Jan 01 '21 at 12:26
  • I am converting a to float because it cannot be int (it has nans... look at this https://stackoverflow.com/a/12716271/8893827 ) it seems that you are requiring more then what you asked for... so please specify what exacly your expected output and it will be easier to fit your needs.. although i think you should figure it out by yourself from all this various of awesome answers people gave to you – adir abargil Jan 01 '21 at 12:49