I have a subset of my data as below:

data = matrix(nrow = 9, ncol = 6, 
          data = c(75.5,NA,NA,NA,NA,NA,76,NA,NA,1.78,NA,NA,
                   76.5,NA,1.55,2.11,NA,NA, 
                   77,1.2,1.22,3.10,1.34, NA,
                   77.5,  1.3,  1.48,  1.45,  3.67,  1.35, 
                    80,    2.66,  1.35,  NA,    2.47,  2.89, 
                    80.5,  3.36,  NA ,   NA ,   NA ,   3.44 ,
                    90,    NA  ,  NA ,   NA  ,  NA ,   NA,   
                    90.5,  NA ,   NA ,   NA,    NA ,   NA), byrow = TRUE)

data = as.data.frame(data)
rownames(data) <- data$V1
data <- data[,-1]
colnames(data) <- c("11001","11002","11003","11004","11005")

I would like to reshape the data into the form shown below, and then plot the expected value curve using loess in ggplot2 with the whole sample. I would appreciate any R code.

[Image: table of the expected output]

Lisa
  • Could you first convert those text "NA" values to real NA in R? That's annoying. Try `type.convert(df, as.is = TRUE)` and post the data again. – Darren Tsai May 20 '22 at 14:53
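
For illustration, here is a minimal sketch of what that comment suggests, using base R only. The small data frame `df_text` and its column names are made up for the example; they are not the asker's data.

```r
# Hypothetical data frame in which missing values were read in as the text "NA"
df_text <- data.frame(
  Age = c("75.5", "76", "76.5"),
  id1 = c("NA", "1.78", "1.55"),
  stringsAsFactors = FALSE
)

# type.convert() re-guesses each column's type; by default the string "NA"
# is treated as a missing value, so both columns become numeric
df_num <- type.convert(df_text, as.is = TRUE)

str(df_num)
```

After the conversion, `is.na()` and arguments such as `values_drop_na` treat those cells as genuinely missing.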

1 Answer

An option is this:

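library(dplyr)
library(tidyr)

# df is assumed to hold the ages in a first column named "...1"
df %>%
  # rename() replaces the superseded rename_at()
  rename(Age = "...1") %>%
  # one row per (Age, id) pair; cells that are NA are dropped
  pivot_longer(-Age, names_to = "id", values_to = "measure", values_drop_na = TRUE)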

Output:

# A tibble: 18 × 3
     Age id    measure
   <dbl> <chr>   <dbl>
 1  76   id3      1.78
 2  76.5 id2      1.55
 3  76.5 id3      2.11
 4  77   id1      1.2 
 5  77   id2      1.22
 6  77   id3      3.1 
 7  77   id4      1.34
 8  77.5 id1      1.3 
 9  77.5 id2      1.48
10  77.5 id3      1.45
11  77.5 id4      3.67
12  77.5 id5      1.35
13  80   id1      2.66
14  80   id2      1.35
15  80   id4      2.47
16  80   id5      2.89
17  80.5 id1      3.36
18  80.5 id5      3.44
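
The question also asks for a loess curve over the whole sample. A minimal sketch of that step, assuming the long-format data above is stored in a data frame called `long` (a name chosen here for illustration) and rebuilt by hand so the example is self-contained:

```r
library(ggplot2)

# The 18 rows of the long-format output, typed in by hand
long <- data.frame(
  Age = c(76, 76.5, 76.5, 77, 77, 77, 77, 77.5, 77.5, 77.5, 77.5, 77.5,
          80, 80, 80, 80, 80.5, 80.5),
  id = c("id3", "id2", "id3", "id1", "id2", "id3", "id4",
         "id1", "id2", "id3", "id4", "id5",
         "id1", "id2", "id4", "id5", "id1", "id5"),
  measure = c(1.78, 1.55, 2.11, 1.2, 1.22, 3.1, 1.34,
              1.3, 1.48, 1.45, 3.67, 1.35,
              2.66, 1.35, 2.47, 2.89, 3.36, 3.44)
)

# Pool all subjects: the points show the raw measurements and
# geom_smooth() fits a single loess curve through the whole sample
p <- ggplot(long, aes(x = Age, y = measure)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x) +
  labs(x = "Age", y = "Measure")

p
```

With only six distinct ages in this subset, loess may emit warnings; the full data set should behave better. Mapping `colour = id` inside `aes()` for `geom_point()` only would colour the points by subject while keeping one pooled curve.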
Quinten
  • But this gives a different output. My output should look like the table in the image above. – Lisa May 20 '22 at 15:08
  • Include `values_drop_na = TRUE` – Onyambu May 20 '22 at 15:10
  • @Nima, I changed the code. Thanks to @onyambu! – Quinten May 20 '22 at 15:12
  • @onyambu sorry, how can I change the code if the first column, which contains the ages, is stored as the row names of my data? – Lisa May 20 '22 at 15:45
  • @Nima what do you mean? The code gives exactly what you want. The result of the code and your expected output are exactly the same – Onyambu May 20 '22 at 15:50
  • @onyambu could you please look at the edited data above? – Lisa May 20 '22 at 16:10
  • @Nima just add the row names to your data as the Age column and run the same code provided here. Note that in that case you do not need the `rename...` part. – Onyambu May 20 '22 at 16:14
  • @onyambu could you please write out the code you mean for the data above? I am confused and I get an error. I appreciate it. – Lisa May 20 '22 at 16:22
  • @Nima Well, this question is closed, so it doesn't accept any more answers. Try `na.omit(cbind(Age = rownames(data), stack(data)))`. Also note that the information in this solution is enough for you to work out how to reshape your data: you need to add the row names as a column, i.e. use the `rownames_to_column` function and then `pivot_longer`. That is enough, unless you do not know R, in which case you need to start learning what functions R has and how to use them. – Onyambu May 20 '22 at 16:27
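
The last comment can be sketched as follows for the edited data in the question, where the ages are stored as row names. This is one possible reading of that advice, not code posted in the thread; the name `long` and the conversion of `Age` to numeric are assumptions added here.

```r
library(tibble)
library(tidyr)

# Rebuild the question's data: ages as row names, one column per subject
m <- matrix(c(75.5, NA,   NA,   NA,   NA,   NA,
              76,   NA,   NA,   1.78, NA,   NA,
              76.5, NA,   1.55, 2.11, NA,   NA,
              77,   1.2,  1.22, 3.10, 1.34, NA,
              77.5, 1.3,  1.48, 1.45, 3.67, 1.35,
              80,   2.66, 1.35, NA,   2.47, 2.89,
              80.5, 3.36, NA,   NA,   NA,   3.44,
              90,   NA,   NA,   NA,   NA,   NA,
              90.5, NA,   NA,   NA,   NA,   NA),
            nrow = 9, ncol = 6, byrow = TRUE)
data <- as.data.frame(m)
rownames(data) <- data$V1
data <- data[, -1]
colnames(data) <- c("11001", "11002", "11003", "11004", "11005")

# Step 1: move the row names into a regular column called Age
long <- rownames_to_column(data, var = "Age")
long$Age <- as.numeric(long$Age)  # row names are stored as character

# Step 2: reshape to one row per (Age, id) pair, dropping the NA cells
long <- pivot_longer(long, cols = -Age, names_to = "id",
                     values_to = "measure", values_drop_na = TRUE)

long
```

The result has the same shape as the answer's output, with the original column names (`"11001"` and so on) in the `id` column. No `rename` step is needed because `Age` is created directly by `rownames_to_column`.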