3

I have a table, shown as below. I need to pivot it so that data is spread horizontally and store it in a txt file with sep = space. But since the number of items per category differs, I am not sure what data structure is the best to use.

I have been stuck with this for 3 days. Could anyone please kindly help me?

Many thanks

data.frame(
 name=c(9073050,9073050,9073050,9073050,9073050,9073050,9073050,9073050,9079078,9079078,9085376,9085376,9085376,9085376,9085376),
 sequence=c(1,15, 1,10,1, 4, 16,7,18,19,22,14,23,14,24)
)
    name sequence
1  9073050        1
2  9073050       15
3  9073050        1
4  9073050       10
5  9073050        1
6  9073050        4
7  9073050       16
8  9073050        7
9  9079078       18
10 9079078       19
11 9085376       22
12 9085376       14
13 9085376       23
14 9085376       14
15 9085376       24
9073050       1      15       1      10       1       4      16       7
9079078      18      19      
9085376      22      14      23      14      24
Waldi
  • 39,242
  • 6
  • 30
  • 78
RBG
  • 39
  • 2

3 Answers3

4

You can directly store it in a .txt file like this

sink('Output.txt')
for (id in unique(df$name)){
  seq <- df[which(df$name == id), 'sequence']
  cat(id, seq, '\n')
}
sink()

Output.txt then looks like this:

9073050 1 15 1 10 1 4 16 7 
9079078 18 19 
9085376 22 14 23 14 24 

And for completeness, if you also want to store it within R a list is probably the most convenient data structure:

lst <- list()
for (id in unique(df$name)){
  seq <- df[which(df$name == id), 'sequence']
  lst[[as.character(id)]] = seq
}

which yields

> lst
$`9073050`
[1]  1 15  1 10  1  4 16  7

$`9079078`
[1] 18 19

$`9085376`
[1] 22 14 23 14 24
tivd
  • 750
  • 3
  • 17
  • Wow. Thank you soooo much for your help! It works!!! – RBG Feb 06 '22 at 13:38
  • Actually, it's not working when I substiute my own data. It returns an error message "argument 2 (type 'list') cannot be handled by 'cat'". But the argument 2 is a data frame though. I don't understand what the problem is here. – RBG Feb 06 '22 at 13:51
  • A data frame has type 'list'. However `seq` should be a vector if your data.frame has the same structure as in this example. Maybe you can include (a subset of) your dataset using `dput(your_dataset)`. – tivd Feb 06 '22 at 14:01
  • Hi tivd, thanks for your response. I unsed unlist(seq) and it worked perfected. Many thanks! – RBG Feb 06 '22 at 14:31
2

Consider reshape to transform long to wide format after creating a counter variable by group:

long_df <- data.frame(
 name = c(rep(9073050, 8), rep(9079078,2), rep(9085376,5)),
 sequence = c(1,15,1,10,1,4,16,7,18,19,22,14,23,14,24)
)

wide_df <- transform(
    long_df,
    counter = ave(sequence, name, FUN=seq_along)
) |> reshape(
    idvar = "name",
    v.names = "sequence",
    timevar = "counter",
    sep = "_",
    direction = "wide"
)

# EXPORT TAB DELIMITED FILE WITH NO HEADERS
write.table(
    wide_df, file="output.txt", na="", sep="\t",
    row.names=FALSE, col.names=FALSE
)

# 9073050   1   15  1   10  1   4   16  7
# 9079078   18  19                      
# 9085376   22  14  23  14  24          
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • Hi Parfait, thanks for your answer but when I tried it in R, it shows an error message. The ' |>' seems to be incorrect there? – RBG Feb 06 '22 at 14:51
  • The `|>` operator was introduced relatively recently, in R 4.1.0. If your version of R is older, you won't have it. It is similar to the extremely popular `%>%` operator from the magrittr package. – jdobres Feb 06 '22 at 15:55
  • If your R version does not support the new pipe, `|>`, nest `transform` inside `reshape`: `wide_df <- reshape(transform(...), ...)` – Parfait Feb 06 '22 at 18:12
0

Here is an alternative approach using pivot_wider:

library(dplyr)
library(tidyr)

df1 <- df %>% 
  group_by(name) %>% 
  mutate(id = row_number()) %>% 
  pivot_wider(
    names_from = id,
    values_from = sequence
  ) 

write.table(df1, file = "df1.txt", sep = "\t",
            row.names = FALSE, col.names = FALSE)

9073050 1   15  1   10  1   4   16  7
9079078 18  19  NA  NA  NA  NA  NA  NA
9085376 22  14  23  14  24  NA  NA  NA
TarJae
  • 72,363
  • 6
  • 19
  • 66