transpose data in one column based on unique values in another column

Question

I basically would like to do what is described here, in R.

example data

names<-c("k127_60234", "k127_60234","k127_60234","k127_60234","k127_50234","k127_50234")
values<-c("ko235", "ko123", "ko543", "ko623", "ko443", "ko123")
df <- data.frame (names,values)

Here is what I would like the output to look like. For the record, the actual files will be much bigger (up to 200k), so I cannot define the number of columns beforehand.

names<-c("k127_60234", "k127_50234")
values1<-c("ko235", "ko443")
values2<-c("ko123", "ko123")
values3<-c("ko543",NA)
values4<-c("ko623",NA)
df.out <- data.frame (names,values1,values2,values3,values4)

akrun · Answer 1 · 2021-05-12T21:52:18.943

We can use dcast in a single line

library(data.table)
dcast(setDT(df), names ~ paste0('values', rowid(names)))

-output

#       names values1 values2 values3 values4
#1: k127_50234   ko443   ko123    <NA>    <NA>
#2: k127_60234   ko235   ko123   ko543   ko623
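
The reshaping relies on a running index within each group of names, which is what rowid(names) supplies. As a rough illustration only, here is a base R sketch of that index built with ave() and seq_along(), assuming the df from the question; the variable name idx is just for this example.

```r
# Example data from the question
names  <- c("k127_60234", "k127_60234", "k127_60234", "k127_60234",
            "k127_50234", "k127_50234")
values <- c("ko235", "ko123", "ko543", "ko623", "ko443", "ko123")
df <- data.frame(names, values)

# Running index within each group of `names` (restarts at 1 for every new name)
idx <- ave(seq_along(df$names), df$names, FUN = seq_along)
idx
# idx is 1 2 3 4 1 2

# These labels become the column names of the wide table
labels <- paste0("values", idx)
labels
```

Because the labels are derived from the data, the number of output columns does not have to be known in advance.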

Or using tidyverse

library(dplyr)
library(tidyr)
library(stringr)
library(data.table)  # rowid() comes from data.table
df %>%
   mutate(nm1 = str_c('values', rowid(names))) %>%
   pivot_wider(names_from = nm1, values_from = values)

-output

# A tibble: 2 x 5
#  names      values1 values2 values3 values4
#  <chr>      <chr>   <chr>   <chr>   <chr>  
#1 k127_60234 ko235   ko123   ko543   ko623  
#2 k127_50234 ko443   ko123   <NA>    <NA>

Or using base R

do.call(rbind, lapply(unstack(df[2:1]), `length<-`, 4))
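
The 4 in the line above is the size of the largest group in the example data. Since the question says the number of columns is not known beforehand, one possible variation is to compute that width from the data. This is only a sketch assuming the question's df; groups, width and wide are names made up for the illustration, and split() is swapped in for unstack() because it always returns a list.

```r
# Example data from the question
names  <- c("k127_60234", "k127_60234", "k127_60234", "k127_60234",
            "k127_50234", "k127_50234")
values <- c("ko235", "ko123", "ko543", "ko623", "ko443", "ko123")
df <- data.frame(names, values)

# One character vector of values per name
groups <- split(df$values, df$names)

# The largest group determines how many value columns are needed
width <- max(lengths(groups))

# Pad every group with NA up to `width`, then stack the groups as rows
wide <- do.call(rbind, lapply(groups, `length<-`, width))
colnames(wide) <- paste0("values", seq_len(width))

# Optional: turn the character matrix into a data frame with a names column
out <- data.frame(names = rownames(wide), wide, row.names = NULL)
out
```

Note that split() orders the groups by sorted name, so the rows may appear in a different order than in the input.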

score 2 · Accepted Answer · answered May 12 '21 at 21:41

library(tidyverse)
df %>%
  group_by(names) %>%
  mutate(variable = str_c("values", row_number())) %>%
  pivot_wider(names_from = variable, values_from = values)

  names      values1 values2 values3 values4
  <chr>      <chr>   <chr>   <chr>   <chr>  
1 k127_60234 ko235   ko123   ko543   ko623  
2 k127_50234 ko443   ko123   NA      NA

In base R you could do:

df1 <- transform(df, time = ave(values, names, FUN = seq))
reshape(df1, idvar = "names", dir="wide", sep="")

       names values1 values2 values3 values4
1 k127_60234   ko235   ko123   ko543   ko623
5 k127_50234   ko443   ko123    <NA>    <NA>
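
The row labels 1 and 5 in that output are carried over from the first row of each name in the original data. If that matters, a possible variation (a sketch assuming the question's df, with wide as an illustrative name) builds the counter from row positions so it stays an integer, spells out the reshape() arguments, and resets the row labels afterwards.

```r
# Example data from the question
names  <- c("k127_60234", "k127_60234", "k127_60234", "k127_60234",
            "k127_50234", "k127_50234")
values <- c("ko235", "ko123", "ko543", "ko623", "ko443", "ko123")
df <- data.frame(names, values)

# Integer counter within each name; reshape() uses it as the time variable
df1 <- transform(df, time = ave(seq_along(names), names, FUN = seq_along))

# One row per name, one values<k> column per counter value
wide <- reshape(df1, idvar = "names", timevar = "time",
                direction = "wide", sep = "")

# Drop the row labels inherited from the long data
rownames(wide) <- NULL
wide
```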

score 0 · Answer 3 · answered May 12 '21 at 21:44

This might be helpful:

df %>% tidyr::spread(values, value = values)

ocramest