How can I compress multiple rows with NA into 3 complete rows?

Question

Could anybody help me writing an R script, melting 9 rows with several NA into 3 complete rows?

My question is similar to: combine rows in data frame containing NA to make complete row

Let's say, I have this data frame (with many more columns):

data <- data.frame('a' = c('1','2','3','4','5','6','7','8','9'),
                   'b' = c(1,2,NA,NA,NA,2,NA,NA,NA),
                   'c' = c(NA,1,3,NA,NA,NA,NA,NA,3),
                   'd' = c(NA,NA,NA,4,5,2,NA,NA,NA),
                   'e' = c(2,NA,NA,NA,NA,3,1,NA,NA),
                   'f' = c(NA,3,NA,NA,NA,NA,NA,5,3))

Edit thanks to your comments: There are always three observations per column except of column 'a' and later on I want to calculate ICCs. Column 'a' can be left out.

Desired Output would look like that:

  b c d e f
1 1 1 4 2 3
2 2 3 5 3 5
3 2 3 2 1 3

I tried several codes recommended for similar problems and I already succesfully used coalesce() for melting two columns into one. However, I couldn't manage to make them work for my data yet. The group_by() function as suggested in the link above does not work in my script, because even if I recode the first vector as c(1,1,1,2,2,2,3,3,3) I did an 'artificial' selection and I still get rows with NAs (if I understood it right).

As I am still new at working with R, please explain it with two or three extra words :)

`a` has more than three observations. Please provide your literal expected output, as it is unclear with the presence of `a`. If you didn't have that column, then you might try `data[] <- lapply(data, na.omit)`, which blindly assumes all columns have the same number of non-`NA` values. — r2evans, Mar 10 '23 at 20:19
@r2evans Thanks a lot for your comment. I actually do have a column like that (a) standing for Rater 1, 2, 3 etc. Are you suggesting to remove that column for the analysis? — Anna Seyffert, Mar 10 '23 at 20:23
No. You said that *"always three observations per column"*, which is not true in the presence of your `a` column. Further, how do you propose one reduce the length of _some_ columns while preserving the number of rows in others? This is where you need to come back with what you think the result should be. — r2evans, Mar 10 '23 at 20:28

score 0 · Answer 1 · answered Mar 10 '23 at 20:29

If we could remove column a, then we could do it this way:

library(dplyr)
library(tidyr)

data %>% 
  select(-a) %>% 
  pivot_longer(everything()) %>% 
  group_by(name) %>% 
  arrange(value, .by_group = TRUE) %>% 
  na.omit() %>% 
  mutate(id = row_number()) %>% 
  pivot_wider(names_from=name, values_from = value) %>% 
  ungroup() %>% 
  select(-id)

    b     c     d     e     f
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     1     2     1     3
2     2     3     4     2     3
3     2     3     5     3     5

Thanks for your quick help. Would you have a solution that keeps the original order (see my edited question)? — Anna Seyffert, Mar 10 '23 at 22:02

Andre Wildberg · Answer 2 · 2023-03-10T22:12:04.597

0

A base R approach using order. Not sure about the desired output though.

data.frame(sapply(data[,-1], function(x) x[order(x)][1:3]))
  b c d e f
1 1 1 2 1 3
2 2 3 4 2 3
3 2 3 5 3 5

If you want to keep the order try

data.frame(sapply(data[,-1], function(x) x[!is.na(x)]))
  b c d e f
1 1 1 4 2 3
2 2 3 5 3 5
3 2 3 2 1 3

edited Mar 10 '23 at 22:12

answered Mar 10 '23 at 20:41

Andre Wildberg

12,344
3
12
29

Thank you so much. Would you have a solution for the edited desired output above as well? – Anna Seyffert Mar 10 '23 at 22:04
I added a different approach that might be what you need. – Andre Wildberg Mar 10 '23 at 22:06
Would you have a hint why in my real data frame it only adds X to the labels of the columns? (like Xb_1, Xb_2, Xb_3 etc.) – Anna Seyffert Mar 10 '23 at 22:28
Maybe you're not starting with a data.frame. Check with class(data) or str(data). – Andre Wildberg Mar 10 '23 at 22:33
It says: [1] "tbl_df" "tbl" "data.frame" – Anna Seyffert Mar 10 '23 at 22:39
Yep, that's a `tibble`. They're mostly compatible but not 100%. You can convert to data.frame e.g with as.data.frame(data). – Andre Wildberg Mar 10 '23 at 22:41

How can I compress multiple rows with NA into 3 complete rows?

2 Answers2