I added a new variable (all zeros) to my old data frame. In this new data frame, I need to change the value from 0 to 1 for observations that meet a condition. The condition is on another variable.

For example, I have variables x, y, z in this new data frame. z is the new variable I just added, and all of its values are zero. If y equals some number a, I want z = 1.

I tried to use a simple for loop to accomplish this, but I have no idea what I did wrong.

for (i==999 in data$y) {
    {data$z==1} 
}
user2935184

2 Answers

It seems like you're trying to set data$z to be 1 when data$y is 999, and set it to be 0 otherwise. This can be accomplished with:

data$z = as.numeric(data$y == 999)
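To see what that line does, here is a minimal sketch on a made-up data frame (the values are invented for illustration; only the column names come from the question). The comparison is vectorized, so it gives one TRUE/FALSE per row, and as.numeric turns those into 1/0.

```r
# Hypothetical data; only the column names x, y, z come from the question
data <- data.frame(x = 1:4, y = c(999, 3, 999, 7), z = 0)

# The comparison is vectorized: one TRUE/FALSE per row
data$y == 999

# TRUE becomes 1, FALSE becomes 0
data$z <- as.numeric(data$y == 999)
data$z
```

Note that if y contains NA, the corresponding z will be NA as well, since the comparison itself returns NA for that row.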
josliber

It would have helped if you had given us a reproducible example. I'll create one instead:

df = data.frame(x = sample.int(5, 5),
                y = sample.int(5, 5),
                z = rep(0, 5))

df
  x y z
1 3 3 0
2 4 5 0 
3 2 1 0
4 5 4 0
5 1 2 0

Your question is about changing values of df$z when some condition on y is met. In R, the usual way to do this is with subscripts (indexing). I highly recommend John Cook's blog post "5 Kinds of Subscripts in R" to help understand this; subscripting in R works differently than in most other languages, but once you get the hang of it, it is very handy.
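As a quick illustration, here are a few common subscript styles on a small named vector. This is my own sketch with an invented vector v, and not necessarily the same breakdown the blog post uses.

```r
# Hypothetical named vector, for illustration only
v <- c(a = 10, b = 20, c = 30, d = 40)

v[c(1, 3)]                      # positive integers: keep elements 1 and 3
v[-2]                           # negative integers: drop element 2
v[c(TRUE, FALSE, TRUE, FALSE)]  # logical: keep the positions that are TRUE
v[v > 25]                       # logical vector produced by a comparison
v[c("a", "d")]                  # names
```

The logical form is the one that matters for this question, because a condition such as y == 1 produces exactly that kind of vector.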

So in this case:

# where is y==1?
df$y == 1
[1] FALSE FALSE  TRUE FALSE FALSE

We can feed the resulting logical vector into the row index of an expression like df[row, column]:

df[df$y == 1, ]
  x y z
3 2 1 0

And if we want to set the value of the "z" column in that row to something, we just assign to it:

df[df$y == 1, "z"] = 999
df
  x y   z
1 3 3   0
2 4 5   0
3 2 1 999
4 5 4   0
5 1 2   0
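Applied to the original question (set z to 1 where y equals some number a), the same idea might look like the sketch below. The data and the value of a are made up. I've wrapped the condition in which() on the assumption that y might contain NA: a logical index with NA in it is, as far as I know, not accepted when assigning into a data frame, while which() returns only the row numbers where the test is TRUE.

```r
# Hypothetical data, including a missing y to show the NA case
df <- data.frame(x = 1:5, y = c(3, 999, NA, 999, 2), z = 0)
a <- 999  # stand-in for "some number a" from the question

# which() gives the row numbers where the test is TRUE and skips NA
rows <- which(df$y == a)

# Set z to 1 in just those rows; every other row keeps its 0
df[rows, "z"] <- 1
df$z
```

If y has no missing values, df[df$y == a, "z"] <- 1 does the same thing without which().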
Ari