What am I doing wrong below? I'm trying to create a new column that takes the value from the matching row of Col_2 if Col_1 matches a certain string, and otherwise keeps the value of Col_1.

Many examples exist (mainly this and this), but they seem to operate on the first column of a data frame, for which the code below works fine. I have deliberately slotted in a first column to reflect the needs of my data:

What I have tried

library(dplyr)
df <- data.frame(cbind('num' = c(1,2,3,4), 
                 'tenure'   = c('Unknown', 'a', 'Unknown', 'b'),
                 'tenure_2' = c('t', 's', 'u', 'v')))

df['new_col'] <- df %>%
  mutate(tenure = ifelse(tenure == "Unknown", tenure_2, tenure))

Why does it select the 1st col? I'm more than happy to hear better solutions I may have overlooked.

Ndharwood
    You supply two names (`new_col` and `tenure`) for the new column. Try `df <- df %>% mutate(new_col = ifelse(tenure == "Unknown", tenure_2, tenure))` – one Mar 15 '23 at 14:41
    FYI, `data.frame(cbind(...))` is an anti-pattern. `cbind()` creates a matrix, which can only have one data type, so your numerics are coerced to `character` class. Your code will work better if you delete `cbind()`, `data.frame(num = c(1,2,3,4), tenure = ...)` – Gregor Thomas Mar 15 '23 at 14:42
    `data.frame(cbind(a, b))` is a wrong way to create a dataframe. If `a` and `b` are variables with different types, `cbind()` will coerce them into identical types. Use `data.frame(a, b)` instead. – Darren Tsai Mar 15 '23 at 14:44
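
To make the coercion described in the comments above concrete, here is a minimal sketch in base R. The object names `bad` and `good` are only for illustration, and `stringsAsFactors = FALSE` is set explicitly so the result does not depend on the R version.

```r
# cbind() builds a character matrix first, so `num` loses its numeric type
bad <- data.frame(cbind(num    = c(1, 2, 3, 4),
                        tenure = c("Unknown", "a", "Unknown", "b")),
                  stringsAsFactors = FALSE)

# Passing the columns straight to data.frame() keeps each column's own type
good <- data.frame(num    = c(1, 2, 3, 4),
                   tenure = c("Unknown", "a", "Unknown", "b"),
                   stringsAsFactors = FALSE)

class(bad$num)   # "character"
class(good$num)  # "numeric"
```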

2 Answers

You're trying to use both base and dplyr syntax at the same time. ifelse() returns a vector, which can be assigned to a single column with df['new_col'] <- ifelse(...). mutate(), on the other hand, returns a whole data frame, which you would want to assign as df <- df %>% mutate(...). Either of these will work:

## assigning the column to the data frame
df['new_col'] <- ifelse(df$tenure == "Unknown", df$tenure_2, df$tenure)
# (same thing using `with()`)
df['new_col'] <- with(df, ifelse(tenure == "Unknown", tenure_2, tenure))

## using `mutate` to assign the data frame
df <- df %>%
  mutate(new_col = ifelse(tenure == "Unknown", tenure_2, tenure))
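
As for why the original attempt picks up the first column: as far as I can tell, assigning a whole data frame into a single column slot makes R keep only the first variable of that data frame (with a warning that more variables were provided than replaced). Here is a minimal sketch of that behaviour. It uses base `transform()` as a stand-in for the `mutate()` call so it runs without dplyr, and `res` is just an illustrative name.

```r
df <- data.frame(num      = c(1, 2, 3, 4),
                 tenure   = c("Unknown", "a", "Unknown", "b"),
                 tenure_2 = c("t", "s", "u", "v"),
                 stringsAsFactors = FALSE)

# Like mutate(), transform() returns a whole data frame (num, tenure, tenure_2)
res <- transform(df, tenure = ifelse(tenure == "Unknown", tenure_2, tenure))

# Three variables offered for one slot: R warns and keeps only the first (num)
df["new_col"] <- res
df$new_col  # 1 2 3 4

# Assigning just the column you want gives the intended result
df["new_col"] <- res$tenure
df$new_col  # "t" "a" "u" "b"
```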
Gregor Thomas

Using data.table (note that this updates tenure in place, by reference, rather than adding a new column):

library(data.table)
setDT(df)[tenure == "Unknown", tenure := tenure_2]
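
If you want a separate new_col and an untouched tenure, as the question asks, one possible variant is sketched below. It assumes a data.table version that provides `fifelse()`, and the table name `dt` is only for illustration.

```r
library(data.table)

dt <- data.table(num      = c(1, 2, 3, 4),
                 tenure   = c("Unknown", "a", "Unknown", "b"),
                 tenure_2 = c("t", "s", "u", "v"))

# Add new_col by reference; tenure itself is not modified
dt[, new_col := fifelse(tenure == "Unknown", tenure_2, tenure)]

dt$new_col  # "t" "a" "u" "b"
```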
akrun