I am trying to create a new column (variable) based on the values in an existing column: where the existing column is NA, the new column should be 0 (zero); otherwise it should be 1 (one). Example data are given below:

aid=c(1,2,3,4,5,6,7,8,9,10)
age=c(2,14,NA,0,NA,1,6,9,NA,15)
data=data.frame(aid,age)

My new data frame should look like this:

aid=c(1,2,3,4,5,6,7,8,9,10)
age=c(2,14,NA,0,NA,1,6,9,NA,15)
surv=c(1,1,0,1,0,1,1,1,0,1)
data<-data.frame(aid,age,surv)
data

I hope that my question is clear enough.

The R community's help is highly appreciated!

Baz

Gavin Simpson
baz
  • the 19 rather than 10 in the second version of `aid` is a bit confusing, presumably that's meant to be 10 as well? – mdsumner Mar 02 '11 at 23:58

3 Answers

data$surv <- 1 - is.na(data$age)


> data
   aid age surv
1    1   2    1
2    2  14    1
3    3  NA    0
4    4   0    1
5    5  NA    0
6    6   1    1
7    7   6    1
8    8   9    1
9    9  NA    0
10  10  15    1
> 
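
The one-liner works because `is.na()` returns a logical vector, and R treats `TRUE` as 1 and `FALSE` as 0 in arithmetic. A minimal sketch of the intermediate steps, using the `age` vector from the question (the name `is_missing` is only illustrative):

```r
# Same ages as in the question
age <- c(2, 14, NA, 0, NA, 1, 6, 9, NA, 15)

# is.na() gives TRUE where a value is missing, FALSE otherwise
is_missing <- is.na(age)

# In arithmetic, TRUE counts as 1 and FALSE as 0,
# so subtracting from 1 flips the flag: missing -> 0, present -> 1
surv <- 1 - is_missing
print(surv)
```

Note that `age` equal to 0 is not missing, so it still maps to 1.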
mob
If I'm understanding correctly:

data$surv <- 1
data$surv[is.na(data$age)] <- 0

or

data$surv <- ifelse(is.na(data$age), 0, 1)
Noah
An alternative to @mob's 1 - is.na(foo) solution is to invert the TRUE/FALSE values with ! and then call as.numeric(). This involves more typing, but both the intention and the explicit coercion to numeric are apparent.

> as.numeric(!is.na(c(2,14,NA,0,NA,1,6,9,NA,15)))
 [1] 1 1 0 1 0 1 1 1 0 1
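
As a rough check, the approaches suggested in this thread can be compared side by side on the question's data. This is only a sketch: the names `by_arithmetic`, `by_ifelse` and `by_coercion` are made up for illustration, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Rebuild the example data from the question
data <- data.frame(
  aid = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  age = c(2, 14, NA, 0, NA, 1, 6, 9, NA, 15)
)

# Three ways of turning "is age missing?" into a 0/1 indicator
by_arithmetic <- 1 - is.na(data$age)
by_ifelse     <- ifelse(is.na(data$age), 0, 1)
by_coercion   <- as.numeric(!is.na(data$age))

# All three give the same numeric vector
stopifnot(identical(by_arithmetic, by_ifelse),
          identical(by_ifelse, by_coercion))

# Store the result as a new column
data$surv <- by_coercion
print(data)
```

Which form to use is mostly a matter of taste: the arithmetic version is shortest, while `ifelse()` and `as.numeric()` spell out the intent.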
Gavin Simpson
  • Hi Gavin, it's good to know the various ways of approaching this scenario, and this would definitely do well for small data sets. I am working with a large one and this might not go well with me... that is, the typing part... hahaha! Anyway, thank you for your contribution; it is very much appreciated! – baz Mar 04 '11 at 00:57
  • @Poasa the amount of typing would be the same; I could have written `surv <- as.numeric(!is.na(age))` so I used a vector. The extra typing I mentioned was in typing `as.numeric()` as opposed to `1 - ` in @mob's answer. – Gavin Simpson Mar 04 '11 at 10:16
  • Yeah, you are right... sorry that I misread your response. Anyway, many thanks for the great help! – baz Mar 07 '11 at 01:58