I have a pandas DataFrame like the one below:
import pandas as pd
import numpy as np
data = {'A': ['A', 'B', 'C', 'D', 'E'], 'B': [1, 2, 3, 4, 5], 'C': [3, 4, 5, 6, 7],
'D': ['N', 'G', 'S', 'P', 'Q']}
df = pd.DataFrame(data)
print(df)
A B C D
0 A 1 3 N
1 B 2 4 G
2 C 3 5 S
3 D 4 6 P
4 E 5 7 Q
Now when I check the data types of the DataFrame, I see that two columns are of int data type and the other two columns are of object data type:
print(df.dtypes)
A object
B int64
C int64
D object
dtype: object
When I convert the entire DataFrame to a NumPy array using the code below and then check the data type of each column, every column is of object data type:
X = np.array(df)
print(X.dtype)
object
print(X[:, 0].dtype)
object
print(X[:, 1].dtype)
object
print(X[:, 2].dtype)
object
print(X[:, 3].dtype)
object
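For context, here is a small sketch of why I think this happens: as far as I understand, a 2-D NumPy array holds a single dtype for all of its elements, so mixing string and integer columns forces the whole array to object. The name `X_num` is only illustrative and is not part of my original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['A', 'B', 'C', 'D', 'E'],
    'B': [1, 2, 3, 4, 5],
    'C': [3, 4, 5, 6, 7],
    'D': ['N', 'G', 'S', 'P', 'Q'],
})

# Converting all four columns at once: strings and integers must share
# one dtype, so NumPy falls back to object.
X = np.array(df)
print(X.dtype)

# Converting only the integer columns (illustrative subset): they share
# an integer dtype, so no fallback to object is needed.
X_num = df[['B', 'C']].to_numpy()
print(X_num.dtype)
```

So the integer information is only lost when the string columns are included in the same array.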
My question: is there a way to keep the data types in the NumPy array the same as those of the original pandas DataFrame?
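One direction that might work, shown here only as a minimal sketch and assuming that per-column dtypes are what matters: `DataFrame.to_records` returns a NumPy record array with one field per column, and converting each column separately with `Series.to_numpy` also keeps each column's own dtype. The names `rec` and `cols` are illustrative and not from my original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['A', 'B', 'C', 'D', 'E'],
    'B': [1, 2, 3, 4, 5],
    'C': [3, 4, 5, 6, 7],
    'D': ['N', 'G', 'S', 'P', 'Q'],
})

# Option 1: a record (structured) array, one named field per column.
# index=False leaves the DataFrame index out of the result.
rec = df.to_records(index=False)
print(rec.dtype)       # per-field dtypes
print(rec['B'].dtype)  # dtype of the field built from column B

# Option 2: one independent 1-D array per column, keyed by column name.
cols = {name: df[name].to_numpy() for name in df.columns}
print({name: arr.dtype for name, arr in cols.items()})
```

Note that the record array is 1-D with named fields rather than a 2-D array, so it is accessed as `rec['B']` instead of `X[:, 1]`.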