
I have a pandas.DataFrame of size [200 rows x 200 columns]. Although all columns are of type int64, the dtype reported by d.dtypes seems to be object. Why is that?

d.shape
Out[70]: (200, 200)

d.dtypes
Out[66]: 
      variable
edge  0           int64
      1           int64
      2           int64
      3           int64
      4           int64
                  ...
      195         int64
      196         int64
      197         int64
      198         int64
      199         int64
Length: 200, dtype: object

(d.dtypes == "int64").all()
Out[68]: True

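The `object` here describes the dtypes Series itself, not the columns. A minimal sketch of that distinction, using a zeros DataFrame as a stand-in for `d`:

```python
import numpy as np
import pandas as pd

# Stand-in for the 200x200 DataFrame in the question.
d = pd.DataFrame(np.zeros((200, 200), dtype="int64"))

# d.dtypes is itself a pandas Series...
assert isinstance(d.dtypes, pd.Series)

# ...whose *elements* are numpy.dtype objects, not integers:
assert isinstance(d.dtypes.iloc[0], np.dtype)

# A Series holding arbitrary Python objects gets dtype object,
# so the "dtype: object" at the bottom refers to this container:
assert d.dtypes.dtype == object

# Meanwhile the columns themselves really are int64:
assert (d.dtypes == "int64").all()
```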
The same situation can be observed in the toy example below:

a = pd.DataFrame(np.random.randint(0, 2, size=(4,4)))

a
Out[91]: 
   0  1  2  3
0  0  1  0  0
1  0  0  1  0
2  0  1  0  1
3  1  1  1  1

a.dtypes
Out[92]: 
0    int64
1    int64
2    int64
3    int64
dtype: object

The answer to Strings in a DataFrame, but dtype is object says:

Every element in an ndarray must have the same size in bytes. For int64 and float64, they are 8 bytes. But for strings, the length of the string is not fixed. So instead of saving the bytes of the strings in the ndarray directly, pandas uses an object ndarray, which saves pointers to the objects; because of this, the dtype of this kind of ndarray is object.
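The mechanism the quote describes can be checked directly; a small sketch (string dtype is pinned to `object` explicitly here, since very recent pandas versions may infer a dedicated string dtype):

```python
import numpy as np
import pandas as pd

# Fixed-width data (8 bytes per element) is stored in the ndarray directly:
ints = pd.Series([1, 2, 3], dtype="int64")
assert ints.dtype == np.dtype("int64")

# Variable-length strings cannot occupy fixed-size slots, so an object
# ndarray stores pointers to the Python str objects instead:
strs = pd.Series(["a", "bb", "ccc"], dtype=object)
assert strs.dtype == object
assert isinstance(strs.iloc[0], str)
```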

But here, all columns are int64. I also couldn't find a way to convert that object to int64, or should I even need to?
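As for converting: there is likely nothing to convert. A sketch with the toy frame (the dtype is passed explicitly so the integer width does not depend on the platform):

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.random.randint(0, 2, size=(4, 4), dtype="int64"))

# Every column is already int64:
assert (a.dtypes == "int64").all()

# The underlying 2-D block is a plain int64 ndarray:
assert a.to_numpy().dtype == np.dtype("int64")

# object only describes the dtypes Series, not the data,
# so there is no object-to-int64 conversion to do:
assert a.dtypes.dtype == object
```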

ibilgen
  • If that truly represents the type of the DataFrame, that does make sense. The DataFrame itself is an object, but it's definitely a little confusing. – acushner Jan 11 '21 at 13:53
  • As I understand it, dtype always becomes `object` for a pandas.DataFrame, even if all columns are of the same type. It gives the true dtype only for a single column, which is a pandas.Series. – ibilgen Jan 11 '21 at 15:11

0 Answers