1

I have a simplified dataframe:

test <- data.frame(
        x = c(1,2,3,NA,NA,NA),
        y = c(NA, NA, NA, 3, 2, NA),
        a = c(NA, NA, NA, NA, NA, TRUE)
        )

I want to create a new column rating that has the value of the number in either column x or column y. The dataset is such a way that whenever there's a numeric value in x, there's a NA in y. If both columns are NAs, then the value in rating should be NA.

In this case, the expected output is: 1,2,3,3,2,NA

zephryl
  • 14,633
  • 3
  • 11
  • 30
jo_
  • 677
  • 2
  • 11

4 Answers4

4

With coalesce:

library(dplyr)
test %>% 
  mutate(rating = coalesce(x, y))

   x  y    a rating
1  1 NA   NA      1
2  2 NA   NA      2
3  3 NA   NA      3
4 NA  3   NA      3
5 NA  2   NA      2
6 NA NA TRUE     NA
Maël
  • 45,206
  • 3
  • 29
  • 67
1
library(dplyr)
test %>%
  mutate(rating = if_else(is.na(x), 
                          y, x))
   x  y    a rating
1  1 NA   NA      1
2  2 NA   NA      2
3  3 NA   NA      3
4 NA  3   NA      3
5 NA  2   NA      2
6 NA NA TRUE     NA
Chris Ruehlemann
  • 20,321
  • 4
  • 12
  • 34
0
test <- data.frame(
 x = c(1,2,3,NA,NA,NA),
  y = c(NA, NA, NA, 3, 2, NA),
  a = c(NA, NA, NA, NA, NA, TRUE)
)

test$rating <- dplyr::coalesce(test$x, test$y)
Dan Adams
  • 4,971
  • 9
  • 28
User 123732
  • 124
  • 9
  • 2
    Remember that Stack Overflow isn't just intended to solve the immediate problem, but also to help future readers find solutions to similar problems, which requires understanding the underlying code. This is especially important for members of our community who are beginners, and not familiar with the syntax. Given that, **can you [edit] your answer to include an explanation of what you're doing** and why you believe it is the best approach? – Jeremy Caney Dec 24 '22 at 00:27
  • 1
    Ok sorry I'll make note – User 123732 Dec 24 '22 at 01:24
0

Here several solutions.

# Input
test <- data.frame(
  x = c(1,2,3,NA,NA,NA),
  y = c(NA, NA, NA, 3, 2, NA),
  a = c(NA, NA, NA, NA, NA, TRUE)
)

# Base R solution
test$rating <- ifelse(!is.na(test$x), test$x,
                      ifelse(!is.na(test$y), test$y, NA))

# dplyr solution
library(dplyr)
test <- test %>%
  mutate(rating = case_when(!is.na(x) ~ x,
                            !is.na(y) ~ y,
                            TRUE ~ NA_real_))

# data.table solution
library(data.table)
setDT(test)
test[, rating := ifelse(!is.na(x), x, ifelse(!is.na(y), y, NA))]

Created on 2022-12-23 with reprex v2.0.2

Ruam Pimentel
  • 1,288
  • 4
  • 16