I have a pandas DataFrame like the one below:


import pandas as pd
import numpy as np

data = {'A': ['A', 'B', 'C', 'D', 'E'], 'B': [1, 2, 3, 4, 5], 'C': [3, 4, 5, 6, 7],
'D': ['N', 'G', 'S', 'P', 'Q']}
df = pd.DataFrame(data)
print(df)

   A  B  C  D
0  A  1  3  N
1  B  2  4  G
2  C  3  5  S
3  D  4  6  P
4  E  5  7  Q

When I check the data types of the DataFrame, two columns are int64 and the other two are object:


print(df.dtypes)

A    object
B     int64
C     int64
D    object
dtype: object

When I convert the entire DataFrame to a NumPy array with the code below and then check the data type of each column, every column has the object data type:

X = np.array(df)

print(X.dtype)
object

# Column indices are 0-based, so the four columns are 0 through 3
print(X[:, 0].dtype)
object

print(X[:, 1].dtype)
object

print(X[:, 2].dtype)
object

print(X[:, 3].dtype)
object

Is there a way to keep the per-column data types of the original pandas DataFrame when converting it to NumPy?

user3046211
  • If one NumPy array contains different data types, the whole array gets the object data type. The only way would be to split them. – Shaig Hamzaliyev Dec 20 '22 at 14:22
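
The splitting suggested in the comment can be sketched roughly as follows. This is only an illustration, not an accepted answer: the variable names `columns`, `records`, and `numeric` are made up for this example, and it assumes the same `df` as in the question. It shows three possible approaches: one array per column, a structured (record) array via `DataFrame.to_records`, and converting only the numeric columns.

```python
import numpy as np
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    'A': ['A', 'B', 'C', 'D', 'E'],
    'B': [1, 2, 3, 4, 5],
    'C': [3, 4, 5, 6, 7],
    'D': ['N', 'G', 'S', 'P', 'Q'],
})

# Option 1: split into one array per column; each array keeps its own dtype
columns = {name: df[name].to_numpy() for name in df.columns}
print(columns['B'].dtype)

# Option 2: a structured (record) array with one named, typed field per column
records = df.to_records(index=False)
print(records.dtype)
print(records['C'].dtype)

# Option 3: convert only the numeric columns, giving a homogeneous 2-D array
numeric = df.select_dtypes(include='number').to_numpy()
print(numeric.dtype, numeric.shape)
```

Note that a structured array is one-dimensional and is indexed by field name (`records['B']`) rather than by column position (`X[:, 1]`), so code written for a plain 2-D array may need adjusting.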
