
I'm trying to replace the missing values in one column of my dataset with the column mean, using the impute function from the Hmisc package.

I have tried several variations of the code. I have used this same code on the same dataset before, but now it no longer runs.

impute(crime$average.ed,mean)
crime$average.ed<-as.numeric(impute(crime$average.ed, mean))
summary(crime)

The missing values in the variable average.ed should be replaced with the mean, but I keep getting this error:

Error in match.arg(what) : 'arg' must be NULL or a character vector

(BTW, the mean is 10.51.)

Ronak Shah
Rohan

2 Answers


I finally found the solution myself. The e1071 package conflicts with Hmisc here: both packages define a function called impute, so when both are attached, a plain call to impute can reach the e1071 version instead of the Hmisc one, and the call fails. Moral of the story: attach only one of the two packages.
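The masking behaviour can be reproduced without installing either package. The sketch below is only an illustration: pkg_a and pkg_b are hypothetical stand-ins for two packages that both export impute, attached with base R's attach().

```r
# Two hypothetical "packages", each providing a function called impute.
pkg_a <- list(impute = function(x) "impute from pkg_a")
pkg_b <- list(impute = function(x) "impute from pkg_b")

attach(pkg_a, name = "pkg_a")
attach(pkg_b, name = "pkg_b")  # attached last, so it is searched first

# An unqualified call resolves to the most recently attached definition.
first_hit <- impute(1)
print(first_hit)

# find() lists every attached environment that defines the name,
# in search-path order.
found <- find("impute")
print(found)

# Clean up the search path.
detach("pkg_b")
detach("pkg_a")
```

With real packages the same rule applies: whichever of library(Hmisc) and library(e1071) ran last determines what a bare impute refers to.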

Rohan

The answer provided by @Rohan is 100% spot on. Yet, I want to make a few additions for people stuck with the same problem.

TL;DR:
In your code, replace the calls to impute with Hmisc::impute(vector_to_operate_on, function_or_replacement_value)

The impute arguments list can be found in this doc, https://www.rdocumentation.org/packages/Hmisc/versions/4.4-1/topics/impute.

For geeks:
The unwanted impute function is the one defined in the e1071 package, https://www.rdocumentation.org/packages/e1071/versions/1.7-4/topics/impute.

How to diagnose the problem?
R has an introspection function called body(), https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/body. It lets you view the source of the function you are actually invoking.

In the problematic case this would look like:

> body(impute)
{
    what <- match.arg(what)
    if (what == "median") {
        retval <- apply(x, 2, function(z) {
            z[is.na(z)] <- median(z, na.rm = TRUE)
            z
        })
    }
    else if (what == "mean") {
        retval <- apply(x, 2, function(z) {
            z[is.na(z)] <- mean(z, na.rm = TRUE)
            z
        })
    }
    retval
}

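After inspecting the body of this impute function, you can see where the error comes from. Its first statement is match.arg(what), which expects what to be a character string such as "median" or "mean". The original call passes the function mean instead, hence: Error in match.arg(what) : 'arg' must be NULL or a character vector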
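To see the failure in isolation, here is a minimal sketch in base R. impute_like is a hypothetical stand-in that copies the argument handling from the body shown above; it is not the e1071 function itself, and the matrix m is made-up data.

```r
# Hypothetical stand-in with the same match.arg() handling as the body above.
impute_like <- function(x, what = c("median", "mean")) {
  what <- match.arg(what)
  apply(x, 2, function(z) {
    fill <- if (what == "median") median(z, na.rm = TRUE) else mean(z, na.rm = TRUE)
    z[is.na(z)] <- fill
    z
  })
}

# Made-up data with one missing value per column.
m <- cbind(a = c(1, NA, 3), b = c(4, 6, NA))

# Passing the function `mean` (the Hmisc calling style) triggers the error.
err <- tryCatch(impute_like(m, mean), error = function(e) conditionMessage(e))
print(err)

# Passing the string "mean" is what this style of function expects.
filled <- impute_like(m, "mean")
print(filled)
```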

Now that you know the problem, just reference the right function by prepending the package prefix Hmisc:: to the impute call.
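A sketch of the qualified call, assuming a toy vector in place of crime$average.ed (the values are made up). The Hmisc branch runs only if the package is installed; the else branch is a plain base R replacement, not Hmisc, included so the example still runs without it.

```r
# Toy column standing in for crime$average.ed (made-up values).
average_ed <- c(10, NA, 11, 12, NA)

if (requireNamespace("Hmisc", quietly = TRUE)) {
  # Namespace-qualified call: unaffected by e1071 being attached.
  imputed <- as.numeric(Hmisc::impute(average_ed, mean))
} else {
  # Base R stand-in (not Hmisc): fill NAs with the mean of the observed values.
  imputed <- average_ed
  imputed[is.na(imputed)] <- mean(average_ed, na.rm = TRUE)
}

print(imputed)
```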

Good practices:
To avoid this kind of sticky situation in the future, always prefix a function with the package it comes from. You never know whether another attached package defines an "evil" twin with the same name. Related question: From [package] import [function] in R
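As a small illustration of this habit using only packages that ship with R: stats::median stands in for any function you want to pin to a package, and stats_median is a hypothetical local alias, which is one common way to get an import-style binding for a single function.

```r
x <- c(10, NA, 12)

# Explicit namespace: always the stats version, whatever gets attached later.
explicit <- stats::median(x, na.rm = TRUE)
print(explicit)

# Import-style alias for a single function (hypothetical local name).
stats_median <- stats::median
aliased <- stats_median(x, na.rm = TRUE)
print(aliased)

# Names that are defined in more than one attached environment.
clashes <- conflicts(detail = TRUE)
print(names(clashes))
```

Running conflicts() after loading your packages is a quick way to spot names such as impute that are defined more than once.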

Konstantin Grigorov