In a pandas DataFrame, a column with dtype=object can in fact contain items of mixed types, e.g. integers and strings.
In this example, column a has dtype object, but the first item is a string while all the others are ints:
import numpy as np
import pandas as pd

df = pd.DataFrame()
df['a'] = np.arange(0, 9)   # column 'a' starts out as int64
df.iloc[0, 0] = 'test'      # assigning a string upcasts the column to object

print(df.dtypes)            # a    object
print(type(df.iloc[0, 0]))  # <class 'str'>
print(type(df.iloc[1, 0]))  # <class 'int'>
My question is: is there a quick way to identify which dtype=object columns actually contain mixed types like this? Since pandas does not have a dtype=str, it is not immediately apparent.
However, I have had situations where, when importing a large CSV file into pandas, I would get a warning like:
sys:1: DtypeWarning: Columns (15,16) have mixed types. Specify dtype option on import or set low_memory=False
Is there an easy way to replicate that check and explicitly list the columns with mixed types? Or do I have to go through them manually, one by one, and see whether I can convert them to string, etc.?
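The closest I have come to a check myself is something like the sketch below. pandas.api.types.infer_dtype reports labels like 'mixed-integer' for a column holding both ints and strings (df and column a are from the toy example above):

from pandas.api.types import infer_dtype

# List object columns whose inferred dtype is some flavour of 'mixed'
mixed_cols = [
    col for col in df.select_dtypes(include='object')
    if infer_dtype(df[col]).startswith('mixed')
]
print(mixed_cols)  # ['a'] for the example above

# Alternatively, inspect the concrete Python types element by element
for col in df.select_dtypes(include='object'):
    types = df[col].map(type).unique()
    if len(types) > 1:
        print(col, types)

The element-wise variant is slower on a large frame, but it also tells you which concrete types are mixed together.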
The background is that I am trying to export a DataFrame to Microsoft SQL Server using DataFrame.to_sql and SQLAlchemy. I get an
OverflowError: int too big to convert
but my DataFrame does not contain any columns with dtype int, only object and float64. My guess is that one of the object columns contains both strings and integers.
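For context, the export looks roughly like this; the connection string and table name here are placeholders, not my real ones:

from sqlalchemy import create_engine

# Placeholder DSN and table name, for illustration only
engine = create_engine('mssql+pyodbc://user:password@my_dsn')
df.to_sql('my_table', engine, if_exists='replace', index=False)
# -> OverflowError: int too big to convert

If a mixed column really is the culprit, casting it first with something like df['a'] = df['a'].astype(str) would presumably avoid the overflow, but I would rather detect such columns systematically than guess one by one.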