Merge multiple columns into one and rename values based on column it came from

Question

Input data.frame looks like this:

col1 col2 col3
          1
     1   
1
     1
1

Which can be generated with this: (thanks to @Sotos)

d2 <- data.frame(col1 = c('', 1, '', 1, '', 1), 
                 col2 = c('', '', 1, '', 1, ''), 
                 col3 = c(1, '', '', '', '', ''), stringsAsFactors = FALSE)

And desired output:

col1
3
2
1
2
1

How to merge columns like that? Empty cells are just empty, but solution with NA could be helpful as i can easily fill them.

Sorry for my previous wrong answer- What is the logic you want to implement? Should `col2` always be changed to 2? Are there no other values besides `NA` and `1`? — dario, Mar 10 '20 at 13:09
Exactly...empty. Not `NA`, so empty like `data.frame(col1 = c(1, '', 1, '', 1), col2 = c('', 1, '', 1, ''), stringsAsFactors = FALSE)` — Sotos, Mar 10 '20 at 13:13
Sorry for missing out details. The specific problem i have is five columns that can contain any kind of value. These values are in different columns, however, won't appera in the same row - e.g. first row will have only one value, second row also will have only one value etc. — juststuck, Mar 10 '20 at 13:13
Due to lack of a MRE I'm assuming they are numeric... What do you see? — dario, Mar 10 '20 at 13:14
@juststuck Please give a proper example that represents your full data frame. It is very different with only 2 columns and NAs from more than 2 columns with empty cells instead of NAs — Sotos, Mar 10 '20 at 13:14
Ok. I see. I give up. Sorry for not understanding your question. Was just trying to help you out. My bad :) — dario, Mar 10 '20 at 13:16
@dario no, don't give up. I am also just trying to put 1 + 1 together. It is more unclear from OP side — Sotos, Mar 10 '20 at 13:16
Almost. Just clarify whether you have NAs or empty cells. And post a reproducible example. If they are empty then a reproducible example is: `d2 <- data.frame(col1 = c('', 1, '', 1, '', 1), col2 = c('', '', 1, '', 1, ''), col3 = c(1, '', '', '', '', ''), stringsAsFactors = FALSE)` — Sotos, Mar 10 '20 at 13:19
Is the goal to just sum across rows? This post has been edited so many times it is hard to follow. — Matt, Mar 10 '20 at 13:34

Wimpel · Accepted Answer · 2020-03-10T13:38:02.950

An approach using data.table

#library( data.table )
DT <- data.table( col1 = c(1,NA, 1, NA, 1),
                  col2 = c(NA, 1, NA, 1, NA) )
#    col1 col2
# 1:    1   NA
# 2:   NA    1
# 3:    1   NA
# 4:   NA    1
# 5:    1   NA

#update non-NA values to colnumbers
DT[, c("col1", "col2") := as.data.table( ifelse( is.na(DT), NA, col(DT) ) )]
#final output
DT[, .(col1 = fcoalesce( col1, col2 ) ) ][]
#    col1
# 1:    1
# 2:    2
# 3:    1
# 4:    2
# 5:    1

update with provided sample data

d2 <- data.frame(col1 = c('', 1, '', 1, '', 1), 
                 col2 = c('', '', 1, '', 1, ''), 
                 col3 = c(1, '', '', '', '', ''), stringsAsFactors = FALSE)

setDT(d2)
cols <- names(d2)
#update values to colunumbers
d2[, (cols) := as.data.table( ifelse( d2 == '', NA, col(d2) ) )]
#final output
d2[, .(col1 = fcoalesce( d2 ) ) ][]
#    col1
# 1:    3
# 2:    1
# 3:    2
# 4:    1
# 5:    2
# 6:    1

score 2 · Answer 2 · answered Mar 10 '20 at 13:22

You can use apply and which like:

apply(d2==1, 1, which)
#[1] 3 1 2 1 2 1

or if needet with unlist

unlist(apply(d2==1, 1, which))

Using the data given by @Sotos.

d2 <- data.frame(col1 = c('', 1, '', 1, '', 1)
 , col2 = c('', '', 1, '', 1, '')
 , col3 = c(1, '', '', '', '', ''), stringsAsFactors = FALSE)

Matt · Answer 3 · 2020-03-10T13:31:02.960

1

Here is a tidyverse solution:

df <- df %>%
  replace(is.na(.), 0) %>%
  mutate(sum = rowSums(.[1:3]))

edited Mar 10 '20 at 13:31

answered Mar 10 '20 at 13:15

Matt

7,255
2
12
34

score 0 · Answer 4 · answered Mar 10 '20 at 13:33

A vectorized idea is to stack, replace with name index and update the original df, i.e.

d2_a <- subset(stack(d2[-1]), values == 1)
d2_a$values <- gsub('\\D+', '', d2_a$ind)
d2$col1[d2$col1 == ''] <- rev(d2_a$values)

which gives,

      col1 col2 col3
    1    3         1
    2    1          
    3    2    1     
    4    1          
    5    2    1     
    6    1

Merge multiple columns into one and rename values based on column it came from

4 Answers4

update with provided sample data