There are two tables with some values.

Table 1

   A  B  C
1           
2           
3           

Table 2

   D  E  F
1           
2           
3           

where 1,2,3 are rows

I want to find the correlation between the columns of these two tables in Python.

Resultant Correlation Table

   D  E  F
A           
B           
C
Can you give an example of your data? What is the meaning of the correlation between `A` and `F` with respect to the column data? – shahaf Oct 14 '19 at 11:15

1 Answer

Generating a correlation matrix between two sets of columns

You could use the pandas DataFrame.corrwith method.

For example:

import pandas as pd
df_1 = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]],columns=['A','B','C'])
df_2 = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]],columns=['D','E','F'])
# corrwith aligns both frames on their labels first, so only rows/columns with matching labels are paired
corr_matrix = df_1.corrwith(other=df_2, axis=1)

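EDIT: I'm sorry, I misread the documentation, which is why the solution above isn't working for you. corrwith aligns the two DataFrames on their labels and only correlates rows or columns whose labels match, so it does not produce the full A/B/C by D/E/F matrix. I looked into it a bit more and the following solution will work (despite being slightly costly). In fact, this post is a duplicate.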
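To make the label-alignment behaviour concrete, here is a small sketch. The numbers are made up for illustration, and `renamed` is a hypothetical helper frame that is not part of the original question.

```python
import pandas as pd

# Made-up sample values, chosen so the columns are not perfectly collinear.
df_1 = pd.DataFrame([[1, 2, 3], [4, 5, 7], [7, 9, 8]], columns=["A", "B", "C"])
df_2 = pd.DataFrame([[2, 1, 3], [3, 5, 6], [9, 8, 7]], columns=["D", "E", "F"])

# No column label is shared, so nothing gets paired up and every entry is NaN.
no_shared_labels = df_1.corrwith(df_2)

# After renaming D, E, F to A, B, C the labels match, but the result is still
# just one value per label (A with D, B with E, C with F), not a 3x3 matrix.
renamed = df_2.rename(columns={"D": "A", "E": "B", "F": "C"})
shared_labels = df_1.corrwith(renamed)

print(no_shared_labels)
print(shared_labels)
```

So corrwith is the right tool when the two frames share labels and you want the matching pairs only; it is not a way to get every column of one frame against every column of the other.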

Applied to the above problem though, you get:

corr_matrix = pd.concat([df_1, df_2], axis=1, keys=['df_1', 'df_2']).corr().loc['df_1', 'df_2']

Note that you're creating a DataFrame with multi-level columns, then computing correlations between every pair of columns (df_1 with itself, df_2 with itself, and df_1 with df_2), and finally keeping only the block that has the columns of df_1 as rows and the columns of df_2 as columns (which is what you wanted originally). This is costly and probably won't scale very well, but if you have two small DataFrames this should work.
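If the full concatenated matrix turns out to be too expensive, one possible alternative is to compute only the cross block directly. The sketch below is not from the linked duplicate; `cross_corr` is a hypothetical helper name, and it assumes both frames share the same row index, contain no missing values, and have no constant columns (a constant column would give a division by zero).

```python
import numpy as np
import pandas as pd


def cross_corr(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation of every column of `left` with every column of `right`."""
    # Centre each column on its mean.
    left_c = left - left.mean()
    right_c = right - right.mean()
    # Scale each column to unit length, so a dot product of two columns
    # equals their Pearson correlation.
    left_u = left_c / np.sqrt((left_c ** 2).sum())
    right_u = right_c / np.sqrt((right_c ** 2).sum())
    # Rows of the result are left's columns, columns are right's columns.
    return left_u.T.dot(right_u)


# Made-up random data, for illustration only.
rng = np.random.default_rng(0)
df_1 = pd.DataFrame(rng.normal(size=(50, 3)), columns=["A", "B", "C"])
df_2 = pd.DataFrame(rng.normal(size=(50, 3)), columns=["D", "E", "F"])

result = cross_corr(df_1, df_2)
print(result)
```

This only does the work for the 3x3 block you care about, instead of the 6x6 matrix that the concat approach computes and then mostly throws away.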

Generating a heatmap from a correlation matrix

There are plenty of Medium posts with simple snippets showing how to do this. One of the easier ways would be to use seaborn:

import seaborn as sns
sns.heatmap(corr_matrix)
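A slightly fuller sketch, in case it helps: it builds the cross-correlation matrix from made-up random data and draws it with a fixed colour scale. The output filename and the axis labels are placeholders I chose, and the Agg backend line is only there so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a plot window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up random data, for illustration only.
rng = np.random.default_rng(0)
df_1 = pd.DataFrame(rng.normal(size=(50, 3)), columns=["A", "B", "C"])
df_2 = pd.DataFrame(rng.normal(size=(50, 3)), columns=["D", "E", "F"])

# Rows are the columns of df_1, columns are the columns of df_2.
corr_matrix = (
    pd.concat([df_1, df_2], axis=1, keys=["df_1", "df_2"])
    .corr()
    .loc["df_1", "df_2"]
)

# Pin the colour scale to [-1, 1] so colours are comparable between plots,
# and print each coefficient inside its cell.
ax = sns.heatmap(corr_matrix, vmin=-1, vmax=1, center=0,
                 cmap="coolwarm", annot=True, fmt=".2f")
ax.set_xlabel("df_2 columns")
ax.set_ylabel("df_1 columns")
plt.tight_layout()
plt.savefig("correlation_heatmap.png")  # placeholder filename
```

Fixing vmin and vmax is a design choice rather than a requirement; without them seaborn scales the colours to the range of the data, which can make weak correlations look strong.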

If you need a more specific example try here

ec2604