Create a dictionary of all unique keys in a column and store correlation coefficients of other columns as associated values

Question

There is a dataset with three columns:

Col 1: Name_of_Village
Col 2: Average_monthly_savings
Col 3: networth_in_dollars

I want to create a dictionary "Vill_corr" where the keys are the village names and the associated values are the correlation coefficients between Col 2 and Col 3, using Pandas.

I know how to calculate the correlation coefficient, but I am not sure how to store it against each village name key:

corr = df["Col2"].corr(df["Col3"])

Please help.

What do you need help with? Do you know how to get the correlation coefficients? Do you know how to convert Pandas data structures to dict? Please [edit] to clarify. For more tips, see [ask], [mre], and [reproducible pandas examples](/q/20109391/4518341). — wjandrea, Jan 26 '23 at 04:07
Oops, I thought you mentioned Pandas but apparently not. It'd help to mention what library(s) you're using. — wjandrea, Jan 26 '23 at 04:08
Sorry, was in a hurry to post. Added some additional details. Please see @wjandrea — Pragyaditya Das, Jan 26 '23 at 05:00

score 1 · Accepted Answer · answered Jan 26 '23 at 05:36

Use groupby.apply and Series.corr:

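import numpy as np
import pandas as pd

# Reproducible example data: 100 rows spread over four villages
np.random.seed(0)

df = pd.DataFrame({'Name_of_Village': np.random.choice(list('ABCD'), size=100),
                   'Average_monthly_savings': np.random.randint(0, 1000, size=100),
                   'networth_in_dollars': np.random.randint(0, 1000, size=100),
                  })

# For each village, correlate the two numeric columns within that group
out = (df.groupby('Name_of_Village')
         .apply(lambda g: g['Average_monthly_savings'].corr(g['networth_in_dollars']))
      )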

Output:

Name_of_Village
A   -0.081200
B   -0.020895
C    0.208151
D   -0.010569
dtype: float64

As a dictionary:

out.to_dict()

Output:

{'A': -0.08120016678846673,
 'B': -0.020894973553868202,
 'C': 0.20815112481676484,
 'D': -0.010569152488799725}
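
The two steps can also be combined. Below is a minimal sketch, assuming the same column names as in the example data above; the variable names `vill_corr` and `vill_corr_alt` are only illustrative. The first form chains `to_dict()` directly onto the groupby result, and the second builds the same mapping with a dict comprehension over the groups. Recent pandas versions may emit a deprecation warning for `groupby.apply` when the grouping column is part of each group; the dict comprehension does not go through `apply`.

```python
import numpy as np
import pandas as pd

# Same style of example data as above (random, so the values carry no meaning)
np.random.seed(0)
df = pd.DataFrame({'Name_of_Village': np.random.choice(list('ABCD'), size=100),
                   'Average_monthly_savings': np.random.randint(0, 1000, size=100),
                   'networth_in_dollars': np.random.randint(0, 1000, size=100)})

# Option 1: chain to_dict() onto the per-village correlation Series
vill_corr = (df.groupby('Name_of_Village')
               .apply(lambda g: g['Average_monthly_savings'].corr(g['networth_in_dollars']))
               .to_dict())

# Option 2: iterate over the groups and build the dictionary directly
vill_corr_alt = {
    village: g['Average_monthly_savings'].corr(g['networth_in_dollars'])
    for village, g in df.groupby('Name_of_Village')
}

print(vill_corr)
```

Both forms give one entry per unique village name, so either can be used depending on whether the intermediate Series is still needed.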

Thank you. Just a question, can I add `to_dict()` to the first line itself? — Pragyaditya Das, Jan 26 '23 at 06:30
