Following the question "Correlation heatmap", I tried the following:

import pandas
import seaborn as sns 
dataframe = pandas.read_csv("training.csv", header=0,index_col=0)
for a in list(['output']):
    for b in list(dataframe.columns.values):
        corr.loc[a, b] = dataframe.corr().loc[a, b]
        print(b)
print(corr)
sns.heatmap(corr['output'])

I got the following error:

IndexError: Inconsistent shape between the condition and the input (got (8, 1) and (8,))

I do not want a heatmap of the correlations between all pairs of columns. I only want the correlation of one column with respect to the others.

Please let me know what I am missing.

Jaffer Wilson
  • Often when you get something like `(8, 1) and (8,)`, you just need to reshape the `(8,)` to also be `(8, 1)`. Let's say array `X` has shape `(n,)`. To get it to `(n, 1)`, use either `X.reshape(-1, 1)` or `X[:, np.newaxis]`. – Dan Jan 11 '19 at 12:02
  • But have you tried simply replicating the examples from seaborn's docs? https://seaborn.pydata.org/generated/seaborn.heatmap.html – Dan Jan 11 '19 at 12:03
  • Thank you for your reply. I need to check the example you have suggested. – Jaffer Wilson Jan 11 '19 at 12:17
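The reshape suggestion in the comments above can be sketched as follows. This is only a minimal illustration on a made-up array (the name `x` and its contents are assumptions, not the asker's data); it shows that both idioms turn a 1D array into a single-column 2D array.

```python
import numpy as np

# Hypothetical 1D array standing in for the (8,) input from the error message.
x = np.arange(8)

# Two equivalent ways to add a trailing axis and get shape (8, 1).
col_a = x.reshape(-1, 1)
col_b = x[:, np.newaxis]

print(x.shape, col_a.shape, col_b.shape)
```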

2 Answers

You are trying to build a heatmap from a pd.Series, which does not work. A pd.Series is a 1D object, while seaborn.heatmap() expects a 2D data structure.

Selecting the column with double brackets returns a one-column DataFrame instead, so sns.heatmap(corr[['output']]) will do the job. For example:

import pandas as pd
import seaborn as sns
df = pd.DataFrame(data=[[1,2,3],[5,4,3],[5,4,12]],index=[0,1,2],columns=['A','B','C'])
df.corr().loc['A',:]

Out[13]:
A    1.0
B    1.0
C    0.5
Name: A, dtype: float64

sns.heatmap(df.corr().loc[['A'],:])
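Applied to the situation in the question, the same idea might look like the sketch below. The data here is synthetic and the column names `f1`, `f2`, `f3` are assumptions standing in for the contents of training.csv; only `output` comes from the question. It contrasts single-bracket selection (a 1D Series) with double-bracket selection (a 2D one-column DataFrame) and plots the latter.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for training.csv; the feature names are hypothetical.
rng = np.random.default_rng(0)
dataframe = pd.DataFrame(
    rng.normal(size=(50, 4)), columns=["f1", "f2", "f3", "output"]
)

corr = dataframe.corr()

as_series = corr["output"]    # single brackets: 1D Series
as_frame = corr[["output"]]   # double brackets: 2D DataFrame with one column

print(as_series.ndim, as_frame.shape)

# Plot the one-column DataFrame; fixing vmin/vmax keeps the color scale
# symmetric around zero, which is a presentation choice, not a requirement.
ax = sns.heatmap(as_frame, annot=True, vmin=-1, vmax=1, cmap="coolwarm")
```

In a script, a call to matplotlib.pyplot.show() would be needed afterwards to display the figure.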

Sokolokki

In the line

sns.heatmap(corr['output'])

corr['output'] is a pd.Series. The documentation states

data : rectangular dataset

2D dataset that can be coerced into an ndarray. If a Pandas DataFrame is provided, the index/column information will be used to label the columns and rows.

You write

I do not want a heatmap of the correlations between all pairs of columns. I only want the correlation of one column with respect to the others.

In that case, why use a heatmap? Your data is one-dimensional. You might want a bar chart instead, for example using pd.DataFrame.corrwith:

dataframe.corrwith(dataframe['some_specific_column']).plot(kind='barh')
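A slightly fuller sketch of the one-liner above, on synthetic data (the column names `f1`, `f2`, `f3` are assumptions; `output` is taken from the question). It also checks that corrwith gives the same numbers as the matching column of the full correlation matrix, and drops the column's trivial correlation with itself before plotting.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the asker's data; feature names are hypothetical.
rng = np.random.default_rng(0)
dataframe = pd.DataFrame(
    rng.normal(size=(50, 4)), columns=["f1", "f2", "f3", "output"]
)

# Correlation of every column with 'output', computed two ways.
by_corrwith = dataframe.corrwith(dataframe["output"])
by_corr = dataframe.corr()["output"]

# Drop the self-correlation (always 1) and sort so the bars are ordered.
to_plot = by_corrwith.drop("output").sort_values()
ax = to_plot.plot(kind="barh")
ax.set_xlabel("correlation with 'output'")
```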
Ami Tavory
  • Thank you for your reply. Right now I am using just one column, but in the future I may use 2 or 3, in which case I might need a heatmap to judge the correlation. – Jaffer Wilson Jan 11 '19 at 12:20
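For the two-or-three-column case mentioned in the comment above, one possible approach is to slice the full correlation matrix down to a features-by-targets block, which is already 2D and can go straight into a heatmap. This is a sketch on synthetic data; the names `out1`, `out2`, `f1`, `f2` and `f3` are all hypothetical.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data with two hypothetical target columns.
rng = np.random.default_rng(1)
df = pd.DataFrame(
    rng.normal(size=(60, 5)), columns=["f1", "f2", "f3", "out1", "out2"]
)

targets = ["out1", "out2"]
features = [c for c in df.columns if c not in targets]

# Rows: feature columns, columns: target columns.
sub = df.corr().loc[features, targets]
print(sub.shape)

ax = sns.heatmap(sub, annot=True, vmin=-1, vmax=1, cmap="coolwarm")
```

Because the list-based selection always yields a DataFrame, the same code also covers the single-target case when `targets` has one element.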