
I have the following problem. I have a data.frame consisting of a country "identifier" (letters and numbers), "year" (numbers), "unique identifier" (identifier + year), statistics on "labour market1" (numbers), and statistics on "labour market2" (numbers). Some data for labour market2 is missing and needs to be interpolated. When I load the package with library(imputeTS), I get the following message:

library(imputeTS) Warning message: Unknown or uninitialised column: 'x'.

After running

data <- na.interpolation(data)

I get the following warnings:

Warning messages:
1: Unknown or uninitialised column: 'x'.
2: imputeTS: No imputation performed for column 1 because of this Error in na.interpolation(data[, i], option): Input x is not numeric
3: imputeTS: No imputation performed for column 2 because of this Error in na.interpolation(data[, i], option): Input x is not numeric
4: imputeTS: No imputation performed for column 3 because of this Error in na.interpolation(data[, i], option): Input x is not numeric
5: imputeTS: No imputation performed for column 5 because of this Error in na.interpolation(data[, i], option): Input x is not numeric

Interestingly, na.interpolation(data) stopped working after I updated R from version 3.2.3 to the latest version, 3.5.1 (2018-07-02) -- "Feather Spray".

Is there a way to get rid of the warnings and perform the interpolation without reverting to the older version of R?

Thank you in advance!

Ines22
  • Hard to tell what is going wrong. I would assume it is a problem with your data processing rather than an error in the package. – Steffen Moritz Aug 24 '18 at 23:12
  • You have to give a reproducible example so that we can help you ( https://stackoverflow.com/help/mcve ). That means you have to add some example data plus your preprocessing to the question, because we need to be able to run your code and see whether the error also occurs for us. You don't need to give the full 'data' data.frame; 2 rows are enough if the errors already appear with those. – Steffen Moritz Aug 24 '18 at 23:17
  • As I was writing the reproducible example, I found my error: I was applying the function to the entire data set rather than to a single column. By calling data$columnID <- na.interpolation(data$columnID) I get the desired output. – Ines22 Aug 28 '18 at 11:36

1 Answer


I guess there is a problem with some of the columns of your dataset. The message clearly says that you have non-numeric columns. Have you tried skipping the non-numeric columns? I came across the same issue recently while working on element-wise imputation with the imputeTS package. My workaround was to skip the character columns. In my case, I had a list of data frames representing countries, in which the first two columns (country and year) were character columns. Some of the data frames had only those two columns.

list_imputed_values <- lapply(list_of_dataframes, function(x) {
  if (ncol(x) == 3) {
    # apply imputation to the third column only
    name <- names(x)[3]
    fixed <- x[, 1:2]
    imputable <- x[[3]]  # extract as a plain vector
    imputed <- as.data.frame(imputeTS::na.interpolation(imputable))
    names(imputed) <- name
    x <- cbind(fixed, imputed)
  } else if (ncol(x) == 2) {
    # do not apply imputation because both columns are non-numeric
    x <- x[, 1:2]
  } else {
    # apply imputation to all columns after the two character columns
    fixed <- x[, 1:2]
    imputable <- x[, 3:ncol(x)]
    imputed <- imputeTS::na.interpolation(imputable)
    x <- cbind(fixed, imputed)
  }
  x  # return the (possibly imputed) data frame explicitly
})
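
If the column positions are not fixed, the same idea can be expressed by picking columns by type instead of by index. The following is only a minimal sketch, not the imputeTS implementation: the data frame df, its column names, and the helper interpolate_na are made up for illustration, and base R's approx() is used as a stand-in for the imputeTS call so that the snippet runs without any package installed.

```r
# Hypothetical example data mirroring the layout described in the question
df <- data.frame(
  identifier = c("AT1", "AT1", "AT1", "AT1"),
  year       = c(2001, 2002, 2003, 2004),
  uid        = c("AT12001", "AT12002", "AT12003", "AT12004"),
  labour1    = c(10, 11, 12, 13),
  labour2    = c(1, NA, 3, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: linear interpolation of the NAs in one numeric vector.
# approx() is a base R stand-in here; rule = 2 carries the nearest observed
# value outward for NAs at the start or end of the series.
interpolate_na <- function(v) {
  ok <- !is.na(v)
  if (sum(ok) < 2) return(v)  # too few observed points to interpolate
  idx <- seq_along(v)
  approx(x = idx[ok], y = v[ok], xout = idx, rule = 2)$y
}

# Only touch columns that are numeric and actually contain missing values;
# character columns such as the identifiers are left alone.
to_impute <- names(df)[sapply(df, function(col) is.numeric(col) && anyNA(col))]
df[to_impute] <- lapply(df[to_impute], interpolate_na)
```

With this example data only labour2 is selected, and it becomes 1, 2, 3, 3. To use imputeTS instead, the body of interpolate_na could be swapped for the package's interpolation function applied to the single vector v.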
LisboaFJG