
I want to see the datatype of all columns in my dataframe without iterating over them. What is the way to do this?

Rusty

2 Answers


The 10 minutes to pandas guide has a nice example for DataFrame.dtypes:

import pandas as pd
import numpy as np

df2 = pd.DataFrame({
    'A' : 1.,
    'B' : pd.Timestamp('20130102'),
    'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
    'D' : np.array([3] * 4,dtype='int32'),
    'E' : pd.Categorical(["test","train","test","train"]),
    'F' : 'foo' })

print (df2)
     A          B    C  D      E    F
0  1.0 2013-01-02  1.0  3   test  foo
1  1.0 2013-01-02  1.0  3  train  foo
2  1.0 2013-01-02  1.0  3   test  foo
3  1.0 2013-01-02  1.0  3  train  foo

print (df2.dtypes)
A           float64
B    datetime64[ns]
C           float32
D             int32
E          category
F            object
dtype: object
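A related trick, not part of the original answer: once you know the dtypes, `select_dtypes` filters columns by dtype without any loop. A minimal sketch using the same `df2`:

```python
import pandas as pd
import numpy as np

df2 = pd.DataFrame({
    'A': 1.,
    'B': pd.Timestamp('20130102'),
    'C': pd.Series(1, index=list(range(4)), dtype='float32'),
    'D': np.array([3] * 4, dtype='int32'),
    'E': pd.Categorical(["test", "train", "test", "train"]),
    'F': 'foo'})

# 'number' matches all numeric dtypes (float64, float32, int32, ...)
numeric_cols = df2.select_dtypes(include='number').columns.tolist()
print(numeric_cols)  # ['A', 'C', 'D']
```

`exclude=` works the same way, e.g. `df2.select_dtypes(exclude='object')` to drop the string-like columns.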

But with dtype object it is a bit more complicated, because the values are not necessarily strings:

Sample:

df = pd.DataFrame({'strings':['a','d','f'],
                   'dicts':[{'a':4}, {'c':8}, {'e':9}],
                   'lists':[[4,8],[7,8],[3]],
                   'tuples':[(4,8),(7,8),(3,)],
                   'sets':[set([1,8]), set([7,3]), set([0,1])] })

print (df)
      dicts   lists    sets strings  tuples
0  {'a': 4}  [4, 8]  {8, 1}       a  (4, 8)
1  {'c': 8}  [7, 8]  {3, 7}       d  (7, 8)
2  {'e': 9}     [3]  {0, 1}       f    (3,)

All columns have the same dtype:

print (df.dtypes)
dicts      object
lists      object
sets       object
strings    object
tuples     object
dtype: object

But the underlying Python types are different; if you need to check them, loop over the columns:

for col in df:
    print (df[col].apply(type))

0    <class 'dict'>
1    <class 'dict'>
2    <class 'dict'>
Name: dicts, dtype: object
0    <class 'list'>
1    <class 'list'>
2    <class 'list'>
Name: lists, dtype: object
0    <class 'set'>
1    <class 'set'>
2    <class 'set'>
Name: sets, dtype: object
0    <class 'str'>
1    <class 'str'>
2    <class 'str'>
Name: strings, dtype: object
0    <class 'tuple'>
1    <class 'tuple'>
2    <class 'tuple'>
Name: tuples, dtype: object
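If a one-line summary per column is enough, the loop above can be collapsed into a dict comprehension (a sketch, not part of the original answer):

```python
import pandas as pd

# the same mixed-object frame as in the answer above
df = pd.DataFrame({'strings': ['a', 'd', 'f'],
                   'dicts': [{'a': 4}, {'c': 8}, {'e': 9}],
                   'lists': [[4, 8], [7, 8], [3]],
                   'tuples': [(4, 8), (7, 8), (3,)],
                   'sets': [{1, 8}, {7, 3}, {0, 1}]})

# one entry per column: the set of Python types found in it
col_types = {col: set(df[col].map(type)) for col in df.columns}
print(col_types)
```

Using a set per column also makes mixed-type columns visible at a glance, since those sets will contain more than one type.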

Or check the type of the first value of each column with iat:

print (type(df['strings'].iat[0]))
<class 'str'>

print (type(df['dicts'].iat[0]))
<class 'dict'>

print (type(df['lists'].iat[0]))
<class 'list'>

print (type(df['tuples'].iat[0]))
<class 'tuple'>

print (type(df['sets'].iat[0]))
<class 'set'>
jezrael

Use the DataFrame.info() method

>>> df.info()
RangeIndex: 5 entries, 0 to 4
Data columns (total 3 columns):
 #   Column     Non-Null Count  Dtype
---  ------     --------------  -----
 0   int_col    5 non-null      int64
 1   text_col   5 non-null      object
 2   float_col  5 non-null      float64
dtypes: float64(1), int64(1), object(1)
memory usage: 248.0+ bytes

Docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.info.html
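The `df` in the snippet above is not constructed in the answer; a minimal frame that produces the same kind of `info()` output (column names and dtypes assumed from the printout) could be:

```python
import pandas as pd

# hypothetical frame matching the info() output shown above
df = pd.DataFrame({'int_col': [1, 2, 3, 4, 5],
                   'text_col': list('abcde'),
                   'float_col': [0.1, 0.2, 0.3, 0.4, 0.5]})

# prints the index type, per-column non-null counts and dtypes,
# a dtype summary line, and an estimate of memory usage
df.info()
```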

Fuji
  • This df.info() answer is better than the df.dtypes answer, as the dtypes attribute does not show the index dtype – Fuji Oct 20 '20 at 03:23
  • My issue with this answer is that if your df has a large number of columns, say >150, then it doesn't actually display in a notebook, whereas df.dtypes does more readily using pd.set_option('display.max_info_rows', None) – Josh Knechtel May 11 '22 at 12:11