0

I have a dataframe with three columns. I would like to populate the NAs that are in one column with values in another column, but I do not want to overwrite any data. How can I get the following results?

# Starting Dataframe:
DF$ST_1 <- c(100, NA, 100, 100, 200, 200, NA, NA, NA, NA, 200)
DF$ST_2 <- c(50,  NA,  50,  50,  12,  NA, NA, 50, 50, NA, 12)
DF$ST_3 <- c(5,   NA,   5,   2,   3,   1,  1,  3,  4,  2, 11)


Results I want:
DF$ST <- c(100, NA,  100, 100, 200, 200, 1, 50, 50, 2, 200)

As you can see, I want to keep all the values in ST_1, and when there is an NA, fill it in with ST_2. Then, I want to keep all of the values from that merge, and fill in the remaining NAs with ST_3. There will still be some leftover NAs after all these merges.

user438383
  • 5,716
  • 8
  • 28
  • 43
  • This isn't anything like a `left_join`. A join matches up two data frames by comparing values in some columns and bringing in new columns. This is called a "coalesce" operation, where multiple columns are combined into 1 column by tacking the first non-missing value. With `dplyr`, `mutate(DF, ST = coalesce(ST_1, ST_2, ST_3))`. – Gregor Thomas Jun 29 '23 at 15:58

2 Answers2

1
library(dplyr)
DF %>%
  mutate(ST=coalesce(ST_1,ST_2,ST_3))

   ST_1 ST_2 ST_3  ST
1   100   50    5 100
2    NA   NA   NA  NA
3   100   50    5 100
4   100   50    2 100
5   200   12    3 200
6   200   NA    1 200
7    NA   NA    1   1
8    NA   50    3  50
9    NA   50    4  50
10   NA   NA    2   2
11  200   12   11 200
one
  • 3,121
  • 1
  • 4
  • 24
0

So you're wanting the max value from each row?

base R:

DF$ST <- apply(DF,1,max,na.rm=TRUE)
stefan_aus_hannover
  • 1,777
  • 12
  • 13