2

I have a dataframe like the following

Sample ID Parameter Value Unit
1 apple 30 g
1 pear 15 lb
1 organge 20 kg
2 apple 2 g
2 pear 3 lb
2 orange 10 kg
3 apple 15 g
3 pear 23 lb
3 orange 12 kg

I would like to transpose it based on the sample ID and put the corresponding value in the column

Sample ID apple_value_unit(g) pear_value_unit(lb) orange_value_unit(kg)
1 30 15 20
2 2 3 10
3 15 23 12

Is there any ways that I can transpose and match the value? I tried the cast(), but it does not work on mapping the value to the corresponding parameter.

Ronak Shah
  • 377,200
  • 20
  • 156
  • 213

1 Answers1

0

We can concatenate the 'Parameter' and 'Unit' columns to a single column before doing the reshape to 'wide' format with pivot_wider

library(dplyr)
library(stringr)
library(tidyr)
df1 %>% 
   mutate(Parameter = sprintf('%s_value_unit(%s)', Parameter, Unit),
        .keep = "unused") %>%
   pivot_wider(names_from = Parameter, values_from = Value)

-output

# A tibble: 3 × 4
  SampleID `apple_value_unit(g)` `pear_value_unit(lb)` `orange_value_unit(kg)`
     <int>                 <int>                 <int>                   <int>
1        1                    30                    15                      20
2        2                     2                     3                      10
3        3                    15                    23                      12

data

df1 <- structure(list(SampleID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L
), Parameter = c("apple", "pear", "orange", "apple", "pear", 
"orange", "apple", "pear", "orange"), Value = c(30L, 15L, 20L, 
2L, 3L, 10L, 15L, 23L, 12L), Unit = c("g", "lb", "kg", "g", "lb", 
"kg", "g", "lb", "kg")), class = "data.frame", row.names = c(NA, 
-9L))
akrun
  • 874,273
  • 37
  • 540
  • 662
  • Thank you so much! When I run through using my large data set, now it does transpose and show the value+unit, but it does not show the unique sample ID. Is that because i have other columns in my dataset too? Now it shows like | Sample ID| apple_value_unit(g)| pear_value_unit(lb)|orange_value_unit(kg)| | -------- | -------------- | -------- | -------| | 1 | 30 | | | | 1 | | 15 |10 | | 1 | | |20 | – alexissunsmile Oct 14 '21 at 23:54
  • @alexissunsmile It could be because the combinations will be different when other columns are present. You may need to remove the other columns – akrun Oct 14 '21 at 23:55
  • Many thanks! Is there any ways to combine all rows with the same ID together then? Now it seems that I have multiple rows for a single ID. I guess I just need to combine all the rows with same single ID together then. One step away. Any function to recommend? – alexissunsmile Oct 15 '21 at 00:02
  • @alexissunsmile instead of that, you may select onnly the columns of interest i.e. `df1 %>% select(SampleID, Parameter, Value, Unit)%>%mutate(Parameter = sprintf('%s_value_unit(%s)', Parameter, Unit), .keep = "unused") %>% pivot_wider(names_from = Parameter, values_from = Value)` – akrun Oct 15 '21 at 00:25
  • Just one more question, after I got the pivot table, I try to export it by using write.csv, but then I found that it gives me an error message 'Error in utils::write.table(w, "pivot.csv", col.names = NA, sep = ",", : unimplemented type 'list' in 'EncodeElement'' – alexissunsmile Oct 15 '21 at 20:25
  • @alexissunsmile ok bye – akrun Oct 15 '21 at 20:25
  • Sorry for the vague description. But the pivot really works. Thank you so much! I try to cast vote for your answer, but I guess because my reputation is less than 15, so I cannot upvoke – alexissunsmile Oct 15 '21 at 20:32
  • Got it! Thanks! – alexissunsmile Oct 15 '21 at 20:35
  • why not use `write.csv` i.e. `write.csv(out, "pivot.csv", quote = FALSE)` – akrun Oct 15 '21 at 20:35
  • It still gives me an error for "unimplemented type 'list' in 'EncodeElement'. It may because when I pivot the table, there are values that null. For some Sample ID, it does not contain all the parameters. So there are a lot of null values after pivot. When I try to export it, the file I got is just one line with all the row names, and there is no data in it. – alexissunsmile Oct 15 '21 at 20:39
  • 1
    @alexissunsmile that means you have duplicate elements. Try `out <- df1 %>% mutate(Parameter = sprintf('%s_value_unit(%s)', Parameter, Unit), .keep = "unused") %>% mutate(rn = data.table::rowid(Parameter)) %>% pivot_wider(names_from = Parameter, values_from = Value) %>% select(-rn)` – akrun Oct 15 '21 at 20:40