I'm assuming that in the following code, iris is a Bunch object, a type made specifically for sklearn's datasets.
# import load_iris function from datasets module
from sklearn.datasets import load_iris
# save "bunch" object containing iris dataset and its attributes
iris = load_iris()
When I check the type of the object, it reports a Bunch object.
type(iris)
Out[4]:
sklearn.utils.Bunch
Now, I want to use the corr() method to compute the standard correlation between every pair of attributes, but that method works on a DataFrame, not on a Bunch object.
How do I do that? Can I run it on iris.data? I know it is an array, not a DataFrame.
# check the types of the features
print(type(iris.data))
Out[5]:
<class 'numpy.ndarray'>
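As a point of comparison, the array can be correlated without pandas at all. The following is a minimal sketch, assuming NumPy's np.corrcoef with rowvar=False (so that each column is treated as a variable); the names features and corr are only illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris

# Load the Bunch object and take its feature array (150 rows, 4 columns).
features = load_iris().data

# rowvar=False tells NumPy that columns, not rows, are the variables.
corr = np.corrcoef(features, rowvar=False)
print(corr)
```

This gives a plain 4x4 ndarray, so the row and column labels that a DataFrame would show are lost.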
Now, if I had used seaborn's built-in dataset, or loaded the data from the actual source, I would not have this issue. Here iris.corr() works perfectly, because iris is a DataFrame.
import seaborn as sns
iris = sns.load_dataset("iris")
type(iris)
Out[7]:
pandas.core.frame.DataFrame
iris.corr()
Out[8]:
              sepal_length  sepal_width  petal_length  petal_width
sepal_length      1.000000    -0.117570      0.871754     0.817941
sepal_width      -0.117570     1.000000     -0.428440    -0.366126
petal_length      0.871754    -0.428440      1.000000     0.962865
petal_width       0.817941    -0.366126      0.962865     1.000000
How do I run corr() in the first example, using the sklearn Bunch object? How do I convert the sklearn Bunch object to a DataFrame, or convert the iris.data ndarray to a DataFrame?
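The kind of conversion being asked about might look like the sketch below. It assumes pandas is installed and that the Bunch exposes feature_names alongside data; the names bunch, iris_df and corr_matrix are only illustrative.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Load the Bunch object, as in the first example.
bunch = load_iris()

# Wrap the ndarray in a DataFrame, using the Bunch's own
# feature names as the column labels.
iris_df = pd.DataFrame(bunch.data, columns=bunch.feature_names)

# Pairwise Pearson correlation between the four numeric features.
corr_matrix = iris_df.corr()
print(corr_matrix)
```

Recent scikit-learn versions (0.23 and later) also appear to accept load_iris(as_frame=True), which should make the data available as a DataFrame directly.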