y is expected to be a linear function of the predictors x1, x2, ..., xn, so I use glm to fit a regression. However, some values of one of the predictors (x1, for example) are missing (NA in the input data). They are defined, just unknown. What would be the correct way to use x1 in the regression?

Zheyuan Li
  • I believe that the [default glm behavior is to omit NAs](http://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html). – learner Sep 05 '12 at 01:12

2 Answers

It depends on the context of the problem. Some options are:

  • omit the NA values, i.e. exclude the affected observations
  • substitute a default value (for example 0, or the mean of the other observed values)
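
As a rough illustration of the two options above, here is a minimal sketch in R. The data frame `df`, its columns and the simulated values are hypothetical and only serve to make the example runnable; they are not taken from the question.

```r
# Hypothetical data: y depends linearly on x1 and x2; some x1 values are unknown (NA).
set.seed(1)
df <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
df$y <- 1 + 2 * df$x1 - df$x2 + rnorm(20, sd = 0.1)
df$x1[c(3, 8, 15)] <- NA

# Option 1: drop the incomplete rows. Passing na.action = na.omit makes the
# choice explicit instead of relying on the session's options("na.action").
fit_omit <- glm(y ~ x1 + x2, data = df, na.action = na.omit)

# Option 2: substitute the mean of the observed x1 values, then fit on all rows.
df_imp <- df
df_imp$x1[is.na(df_imp$x1)] <- mean(df$x1, na.rm = TRUE)
fit_imp <- glm(y ~ x1 + x2, data = df_imp)

nobs(fit_omit)  # number of rows used after dropping the NAs
nobs(fit_imp)   # number of rows used after imputation
```

Which option is appropriate depends on why the values are missing; mean substitution keeps every row but understates the variability of x1.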
Augusto

You can replace missing values with zero using the following code

myData[is.na(myData)] <- 0

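You can also replace them with the row mean or the column mean (assuming every column is numeric) using the following code

for (i in seq_len(nrow(myData))) {
  row_vals <- unlist(myData[i, ])  # values of row i as a plain vector
  if (anyNA(row_vals)) {
    myData[i, is.na(row_vals)] <- mean(row_vals, na.rm = TRUE)
  }
}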

or

for(i in 1:ncol(myData)){
myData[is.na(myData[,i]), i] <- mean(myData[,i], na.rm = TRUE)
}
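
The column-mean replacement can also be written without an explicit loop. The following is only a sketch on a hypothetical toy data frame (the columns `a` and `b` are made up for illustration) and it assumes every column is numeric.

```r
# Hypothetical toy data frame with numeric columns only.
myData <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 6))

# Replace each NA with the mean of the observed values in its column.
# Assigning into myData[] keeps the result a data frame.
myData[] <- lapply(myData, function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})

myData
```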

If 0 is already used to mark missing values and you want to replace it with NA, use the following code:

myData[myData == 0] <- NA

as discussed in the question "Replace all 0 values to NA".

Sarwan Ali