I want to change the values of multiple variables based on a condition on another variable.

Something like:

df <- iris
df$index <- row.names(df)

if(df$index == 10){
  df$Species <- "test";
  df$Sepal.Length <- 100
}

So if the index value is 10, then I want to change Species to "test" AND change Sepal.Length to 100.

Instead I get a warning:

Warning message:
In if (df$index == 10) { :
  the condition has length > 1 and only the first element will be used

And the variables remain unchanged.

B_Real

3 Answers

Currently, each of your expressions has operands of different lengths on either side of the equality operator, ==, or the assignment operator, <-. Specifically:

  • Here, if(df$index == 10) compares ALL values of the vector df$index with the single value 10, which returns a logical vector whose only TRUE is the 10th element: [FALSE, FALSE, FALSE, ..., TRUE, FALSE, FALSE, FALSE ...]. Check with print(df$index == 10).

    Hence the warning that only the first value, FALSE, is used. Consequently, NO values are updated, since the if condition evaluates to FALSE.

  • Here, df$Species <- "test" would overwrite ALL values (i.e., all rows) of df$Species with the single value "test". But it never runs, since the if condition evaluates to FALSE.

  • Here, df$Sepal.Length <- 100 would overwrite ALL values (i.e., all rows) of df$Sepal.Length with the single value 100. But it never runs, since the if condition evaluates to FALSE.
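To see the length mismatch directly, here is a small sketch using the question's own setup. The name cond is introduced only for illustration. Note also that from R 4.2.0 onward, a condition of length greater than one in if() is an error rather than a warning.

```r
df <- iris
df$index <- row.names(df)

# row names are character, so 10 is coerced to "10" for the comparison
cond <- df$index == 10

length(cond)  # 150: one logical value per row, not a single TRUE/FALSE
which(cond)   # 10: the only position where the condition holds
sum(cond)     # 1: number of matching rows
```

Because if() expects a single TRUE or FALSE, a 150-element vector like cond cannot drive it row by row.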

Likely, you meant to update the values of a single row by index. This needs neither if logic nor a new column built from row.names: simply index the vectors and assign the single values:

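# Species is a factor in iris; convert it first, otherwise "test" is not a valid level and becomes NA
df$Species <- as.character(df$Species)
df$Species[10] <- "test"
df$Sepal.Length[10] <- 100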
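If the rows are to be chosen by a condition rather than a fixed position, the same indexing idea works with a logical vector. The following is a minimal sketch, not the only way to do it; the name rows is hypothetical, and the conversion of Species to character is an assumption made so that a new label can be stored.

```r
df <- iris
df$index <- row.names(df)
df$Species <- as.character(df$Species)  # factor -> character so new labels are allowed

# logical mask; any vectorized condition could go here
rows <- df$index == "10"

# each assignment touches only the rows where the mask is TRUE
df$Species[rows] <- "test"
df$Sepal.Length[rows] <- 100

df[rows, c("index", "Species", "Sepal.Length")]
```

Computing the mask once and reusing it for every column is roughly what a block-style conditional does: one condition, several updates.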
Parfait
  • Does the job. I was kinda looking for something a bit more like the 'DO IF' statements in SPSS which go like this: DO IF (gender EQ 'm'). + COMPUTE score1 = (2*q1)+q2. + COMPUTE score2 = (3*q1)+q2. – B_Real Apr 28 '19 at 22:17
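  • Use the vectorized `ifelse`: `df$Species <- ifelse(df$index == 10, "test", as.character(df$Species))`, where only the *TRUE* in the 10th position is converted and everything else stays the same. (The `as.character` is needed because `Species` is a factor; without it, `ifelse` returns the underlying integer codes.) – Parfait Apr 29 '19 at 02:24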

The if statement you're using looks like it would work inside a for loop over rows. df$index == 10 returns a vector, so the warning says the if statement will only proceed with the first element of that vector. The solution below should work: subset holds the rows for which the filter is true, and those rows are then modified. Next, the matching rows are removed from the original data frame and the modified subset is attached to the bottom. This makes sure that all of your observations remain in your dataset after being altered, but it does not guarantee that the observations stay in the same order.

library(tidyverse)
df <- iris
df$index <- row.names(df)


subset <- df[df$index == 10, ]
subset$Species <- "test"
subset$Sepal.Length <- 100

df <- df[df$index != 10, ] %>%
  rbind(subset)
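If the original row order matters, one option is to sort by the index column afterwards. This is a base-R sketch of the same split, modify, and recombine idea; the name matched is hypothetical (chosen to avoid masking the base function subset), and it assumes index holds the numeric row names as text.

```r
df <- iris
df$index <- row.names(df)

# rows that satisfy the condition, modified separately
matched <- df[df$index == "10", ]
matched$Species <- "test"
matched$Sepal.Length <- 100

# recombine, then restore the original order using the numeric value of index
df <- rbind(df[df$index != "10", ], matched)
df <- df[order(as.integer(df$index)), ]

head(df$index, 12)
```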
mcz

I think this answer might be more flexible for you going forward. It uses the tidyverse, which you can learn more about here: https://r4ds.had.co.nz/introduction.html

library(tidyverse)

df <- iris
df$index <- row.names(df)

# store the condition so it can be reused for several columns
y <- df$index == 10

df <- df %>% # the pipe passes df into the next function, mutate()
  # mutate() modifies columns of the data frame
  mutate(
    # case_when() can handle many conditions, though we have just one here
    Species = case_when(
      y ~ "test",
      # TRUE is the catch-all: rows not matched above keep their original value
      TRUE ~ as.character(Species)
    ),
    # each column is handled separately
    Sepal.Length = case_when(
      y ~ 100,
      TRUE ~ as.double(Sepal.Length)
    )
  )
Nick