I'm working with a high-dimensional data set that contains about 200 variables. Many of these variables use the value -99 to mark a missing value. I want to convert all of these to NA instead of -99.

I know that you can do something like

df$var1[df$var1 == -99] <- NA

but with this many variables that gets tedious and time-consuming. I'm importing my data as a data frame and working with that. Is there some clever for-loop construction I could use, or a package/command that handles this? I'm still a bit new to programming in R. Thanks!

  • You might specify that when you import the data. At least from `data.table::fread` I know that you can supply a vector of values to be interpreted as `NA`. The argument is called `na.strings` – markus Jul 17 '20 at 14:37
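
To illustrate the comment above with base R rather than `data.table::fread`: `read.csv` also takes an `na.strings` argument. This is only a minimal sketch; the temp file and its contents are made up for the example, and it assumes the missing code appears in the file as exactly the text `-99`.

```r
# Hypothetical CSV written to a temp file, just for illustration
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,var1,var2",
             "1,-99,3.5",
             "2,7,-99"), tmp)

# na.strings lists the raw field values that should be read as NA.
# Matching is done on the text of the field, so a value written as
# "-99.0" would likely need its own entry here.
df <- read.csv(tmp, na.strings = c("NA", "-99"))

print(df)
```

This way the -99 codes never make it into the data frame, so no clean-up step is needed afterwards.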

1 Answer

Try this to solve your issue with df:

df[df == -99] <- NA
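
Here is a small worked example of the same idea, restricted to numeric columns. The data frame and the column names (`var1`, `var2`, `label`) are made up for illustration. The restriction is an optional variation, not part of the one-liner above: that one-liner applies the comparison to every column, including character ones, which may or may not be what you want.

```r
# Hypothetical toy data; -99 marks a missing value in the numeric columns
df <- data.frame(
  var1  = c(1, -99, 3),
  var2  = c(-99, 5, 6),
  label = c("a", "b", "-99"),
  stringsAsFactors = FALSE
)

# Flag which columns are numeric
num_cols <- vapply(df, is.numeric, logical(1))

# Replace -99 with NA in the numeric columns only
df[num_cols][df[num_cols] == -99] <- NA

print(df)
```

If every column in your data is numeric, the one-liner alone should be enough.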
Duck