
If a dataset has missing values in both categorical and continuous variables, how can I handle them by replacing missing values with the mode for categorical variables and the mean for continuous variables?

L. Bakker

2 Answers


When the missing data are missing at random, you could impute the missing values using multiple imputation.

For more information about multiple imputation, I would recommend the book Applied Missing Data Analysis by C.K. Enders (2010). It also has a great companion website.

For multiple imputation in R, you could use the mice package. It is available on CRAN, comes with its own documentation, and is described in an article in the Journal of Statistical Software.

There are other packages for multiple imputation.

L. Bakker

You can try either fillna() or interpolate() from pandas.

For more details about these two methods, please refer to my answer to this question on Stack Overflow: Missing values in Time Series in python
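As a rough illustration of what the question asks for, here is a minimal sketch that fills numeric columns with their mean and all other columns with their mode using fillna(). The DataFrame and its column names ("color", "height") are made up for the example, and treating every non-numeric column as categorical is an assumption that may not hold for your data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: "color" is categorical, "height" is continuous.
df = pd.DataFrame({
    "color": ["red", "blue", None, "red", None],
    "height": [1.5, np.nan, 1.7, 1.6, np.nan],
})

filled = df.copy()
for col in filled.columns:
    if pd.api.types.is_numeric_dtype(filled[col]):
        # Continuous variable: replace missing values with the column mean.
        filled[col] = filled[col].fillna(filled[col].mean())
    else:
        # Categorical variable: replace missing values with the mode.
        # mode() returns a Series because ties are possible; take the first.
        # Assumes the column has at least one non-missing value.
        filled[col] = filled[col].fillna(filled[col].mode().iloc[0])

# For ordered data such as a time series, interpolate() fills a gap
# from its neighbouring values instead of a single summary statistic.
interpolated = df["height"].interpolate()
```

Keep in mind that mean and mode imputation shrink the variance of the filled columns, which is one reason the other answer suggests multiple imputation.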

Yogesh Awdhut Gadade