I'm trying to calculate the correlation matrix of a DataFrame, but I'm not sure whether I should replace all NA values with 0 or just remove them. In other words, will NA values affect the calculation of the correlation?

Vicki

1 Answer

If you're only using the correlation values to classify data sets by whether or not they correlate, then sure, you could treat a NaN as 0. A better way to interpret a NaN output, though, is "not interpretable" or "not meaningful", whereas a significant correlation of 0 means "no correlation".
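
To make the difference concrete, here is a minimal sketch assuming pandas (the question only says "DataFrame") with made-up column names and values. `DataFrame.corr` excludes NA values pairwise by default, so removing them is effectively what already happens, while filling with 0 feeds invented data points into the calculation and can change the result. The constant column `c` shows a case where the output itself is NaN, because a correlation with a zero-variance column is undefined.

```python
import numpy as np
import pandas as pd

# Illustrative data only: the column names and values are invented.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, np.nan],
    "b": [2.0, 4.0, np.nan, 8.0, 10.0],
    "c": [5.0, 5.0, 5.0, 5.0, 5.0],  # constant column, zero variance
})

# Default behaviour: for each pair of columns, rows with an NA in either
# column are skipped, and the remaining rows are used.
pairwise = df.corr()

# Listwise deletion: drop every row that has an NA in any column first.
dropped = df.dropna().corr()

# Filling with 0 treats the missing values as real observations of 0.
filled = df.fillna(0).corr()

print(pairwise)
print(dropped)
print(filled)
```

In this example `a` and `b` are perfectly correlated on the rows where both are present, but after `fillna(0)` their correlation turns negative, so the filled zeros are not neutral. Every correlation involving `c` comes out as NaN in all three matrices, which is the "not meaningful" case rather than "no correlation".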

Also refer to this:

Ailurophile