0

Could anybody help me writing an R script, melting 9 rows with several NA into 3 complete rows?

My question is similar to: combine rows in data frame containing NA to make complete row

Let's say, I have this data frame (with many more columns):

data <- data.frame('a' = c('1','2','3','4','5','6','7','8','9'),
                   'b' = c(1,2,NA,NA,NA,2,NA,NA,NA),
                   'c' = c(NA,1,3,NA,NA,NA,NA,NA,3),
                   'd' = c(NA,NA,NA,4,5,2,NA,NA,NA),
                   'e' = c(2,NA,NA,NA,NA,3,1,NA,NA),
                   'f' = c(NA,3,NA,NA,NA,NA,NA,5,3))

Edit thanks to your comments: There are always three observations per column except of column 'a' and later on I want to calculate ICCs. Column 'a' can be left out.

Desired Output would look like that:

  b c d e f
1 1 1 4 2 3
2 2 3 5 3 5
3 2 3 2 1 3

I tried several codes recommended for similar problems and I already succesfully used coalesce() for melting two columns into one. However, I couldn't manage to make them work for my data yet. The group_by() function as suggested in the link above does not work in my script, because even if I recode the first vector as c(1,1,1,2,2,2,3,3,3) I did an 'artificial' selection and I still get rows with NAs (if I understood it right).

As I am still new at working with R, please explain it with two or three extra words :)

  • 3
    `a` has more than three observations. Please provide your literal expected output, as it is unclear with the presence of `a`. If you didn't have that column, then you might try `data[] <- lapply(data, na.omit)`, which blindly assumes all columns have the same number of non-`NA` values. – r2evans Mar 10 '23 at 20:19
  • @r2evans Thanks a lot for your comment. I actually do have a column like that (a) standing for Rater 1, 2, 3 etc. Are you suggesting to remove that column for the analysis? – Anna Seyffert Mar 10 '23 at 20:23
  • No. You said that *"always three observations per column"*, which is not true in the presence of your `a` column. Further, how do you propose one reduce the length of _some_ columns while preserving the number of rows in others? This is where you need to come back with what you think the result should be. – r2evans Mar 10 '23 at 20:28

2 Answers2

0

If we could remove column a, then we could do it this way:

library(dplyr)
library(tidyr)

data %>% 
  select(-a) %>% 
  pivot_longer(everything()) %>% 
  group_by(name) %>% 
  arrange(value, .by_group = TRUE) %>% 
  na.omit() %>% 
  mutate(id = row_number()) %>% 
  pivot_wider(names_from=name, values_from = value) %>% 
  ungroup() %>% 
  select(-id)
    b     c     d     e     f
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     1     2     1     3
2     2     3     4     2     3
3     2     3     5     3     5
TarJae
  • 72,363
  • 6
  • 19
  • 66
0

A base R approach using order. Not sure about the desired output though.

data.frame(sapply(data[,-1], function(x) x[order(x)][1:3]))
  b c d e f
1 1 1 2 1 3
2 2 3 4 2 3
3 2 3 5 3 5

If you want to keep the order try

data.frame(sapply(data[,-1], function(x) x[!is.na(x)]))
  b c d e f
1 1 1 4 2 3
2 2 3 5 3 5
3 2 3 2 1 3
Andre Wildberg
  • 12,344
  • 3
  • 12
  • 29