This question can be viewed as an extension of the thread "How can I get the average (mean) of selected columns". How do we impute the missing values, i.e. the `NA`s, using the mean of the selected columns?
- You posted a [similar question](http://stackoverflow.com/questions/38844402/mean-imputation-of-selected-columns-using-rowmeans/38844456#38844456) one hour back and when you got the solution, suddenly deleted the post. – akrun Aug 09 '16 at 09:20
- I didn't understand the solution. I was not trying to create a separate data frame. – Bindukumar Jampana Aug 09 '16 at 09:23
- In the solution, it was not creating any separate dataset. It just replaces the NAs in each column with the mean value. Also, posting similar questions (after deleting the previous post) is somewhat abusing the system. – akrun Aug 09 '16 at 09:24
- Could you give the solution again, please? My fault; I will not delete threads from here on. – Bindukumar Jampana Aug 09 '16 at 09:26
- Here is the solution `library(zoo); df1[4:8] <- lapply(df1[4:8], na.aggregate)` – akrun Aug 09 '16 at 09:29
- As an extension to the above question: if I want to impute the NAs with median values instead, how can that be achieved? And what are some good resources to learn more about the zoo package? – Bindukumar Jampana Aug 09 '16 at 09:54
- The linked question is undeleted now. If @akrun's answer provides the requested solution I suggest that you accept that answer and delete this question here. – RHertel Aug 09 '16 at 09:55
- I did not find any link on my dashboard stating that it was undeleted. – Bindukumar Jampana Aug 09 '16 at 10:05
- Okay, I posted the solution – akrun Aug 09 '16 at 10:10
- It's quite unclear what you are doing, but possibly you should use more advanced imputation techniques. You should have a look at the Amelia or mice package. – Roland Aug 09 '16 at 10:28
1 Answer
One option is `na.aggregate` from `zoo` to impute the missing values (`NA`) with the `mean` value of that column. We loop through the selected columns of the dataset (`lapply(df1[4:8], ...)`), apply the function, and then update the columns on the left-hand side of `<-`.
library(zoo)
df1[4:8] <- lapply(df1[4:8], na.aggregate)
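To see the effect on concrete values, here is a small sketch. The data frame below is made up for illustration (the question does not show the real `df1`); it only assumes that columns 4 to 8 are numeric and contain some `NA`s.

```r
library(zoo)

# Hypothetical toy data: three identifier-like columns, then five numeric
# columns (positions 4:8) that contain missing values.
df1 <- data.frame(
  id   = 1:4,
  grp  = c("a", "a", "b", "b"),
  site = c("x", "y", "x", "y"),
  v1   = c(1, NA, 3, 5),
  v2   = c(NA, 2, 4, 6),
  v3   = c(10, 20, NA, 30),
  v4   = c(1, 1, 1, NA),
  v5   = c(NA, NA, 4, 8)
)

# Replace each NA with the mean of the non-missing values in its own column.
df1[4:8] <- lapply(df1[4:8], na.aggregate)

df1$v1  # the NA becomes mean(c(1, 3, 5)) = 3
df1$v5  # both NAs become mean(c(4, 8)) = 6
```

Because the result is assigned back to `df1[4:8]`, the first three columns are left untouched and no separate data frame is created.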
If we need the `median`, use the `FUN` as `median` (by default it is `mean`)
df1[4:8] <- lapply(df1[4:8], na.aggregate, FUN = median)
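If installing `zoo` is not an option, roughly the same per-column imputation can be sketched in base R. The helper `impute_col` and the data frame `df2` below are hypothetical names introduced only for this illustration; this is not how `na.aggregate` is implemented, just a minimal equivalent for plain numeric columns.

```r
# Hypothetical helper (not part of zoo): fill the NAs of a numeric vector
# with a summary statistic computed from its non-missing values.
impute_col <- function(x, FUN = mean) {
  x[is.na(x)] <- FUN(x[!is.na(x)])
  x
}

# Made-up example data with one outlier in column `a`.
df2 <- data.frame(a = c(1, NA, 3, 100), b = c(NA, 2, 4, 6))

# Median imputation, column by column; `df2[]` keeps the data frame structure.
df2[] <- lapply(df2, impute_col, FUN = median)

df2$a  # the NA becomes median(c(1, 3, 100)) = 3
df2$b  # the NA becomes median(c(2, 4, 6)) = 4
```

With an outlier such as `100` in column `a`, the median fill (3) stays close to the bulk of the data, whereas the mean fill would be about 34.7, which is one reason to prefer `FUN = median` for skewed columns.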

akrun