1

I have the data like

enter image description here

How can I reshape the data by merge the rows with same rowname and columname like this: enter image description here

r2evans
  • 141,215
  • 6
  • 77
  • 149
Jing
  • 21
  • 1
  • 2
    Please do not post an image of code/data/errors: it cannot be copied or searched (SEO), it breaks screen-readers, and it may not fit well on some mobile devices. Ref: https://meta.stackoverflow.com/a/285557 (and https://xkcd.com/2116/). Please just include the code, console output, or data (e.g., `dput(head(x))` or `data.frame(...)`) directly. – r2evans Jul 26 '20 at 04:53
  • Where do the `.a1` and `.a2` label suffixes come from? What searching have you done? The key words that give results are `reshape long to wide`, and depending on your preferred R dialect, will include `tidyr::pivot_wider` or `data.table::dcast` or `reshape2::...` (I don't know reshape2). – r2evans Jul 26 '20 at 04:55
  • Examples, searching on SO for [`[r] long to wide`](https://stackoverflow.com/search?q=%5Br%5D+long+to+wide) returns: https://stackoverflow.com/q/5890584/3358272 (among many other Q/As), which also references `aggregate` and `tidyr::spread`, though spread is nearing soft-deprecation in favor of `tidyr::pivot_wider`. If you aren't already using `data.table`, then I suggest you look at `tidyr::pivot_wider` first. – r2evans Jul 26 '20 at 05:02
  • 1
    Does this answer your question? [Transpose / reshape dataframe without "timevar" from long to wide format](https://stackoverflow.com/questions/11322801/transpose-reshape-dataframe-without-timevar-from-long-to-wide-format). And this: https://stackoverflow.com/questions/62978811/r-pivot-wider-to-keep-one-id-per-row/62978897#62978897 – Edward Jul 26 '20 at 05:21

3 Answers3

0

Trust you have allele information missing. If added as following to the data:

data['allele']=c('a1','a2','a1','a2')

then following will solve the problem easily:

Basically wide to long, followed by joining columns of SNP and allele and then wide again.

library(tidyr)

long=data %>% gather(snp, value, -c(Pedigree,allele))
long_joined=unite(long, snp, c(snp, allele), remove=TRUE)
spread(long_joined, key = snp, value = value)
0

Maybe you can try aggregate with unlist:

> aggregate(.~P,df,unlist)
  P S1.1 S1.2 S2.1 S2.2
1 a    C    C    G    G
2 b    C    C    T    T

Data

> dput(df)
structure(list(P = c("a", "a", "b", "b"), S1 = c("C", "C", "C", 
"C"), S2 = c("G", "G", "T", "T")), class = "data.frame", row.names = c(NA, 
-4L))
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81
0

Solution using dplyr which is part of the tidyverse collection of R packages.

library(dplyr)

Data:

    bar <-     "Pedigree       SNP1 SNP2
               'Individual 1'  C    G
               'Individual 1'  C    G
               'Individual 2'  C    T
               'Individual 2'  C    T"

    foo <- read.table(text=bar, header = TRUE)

Code:

foo %>% 
  group_by(Pedigree) %>% 
  mutate(id = row_number()) %>% 
  pivot_wider(names_from = id, values_from = SNP1:SNP2, names_prefix = ".a")

Output:

#> # A tibble: 2 x 5

#> # Groups:   Pedigree [2]

#>   Pedigree     SNP1_.a1 SNP1_.a2 SNP2_.a1 SNP2_.a2
#>   <fct>        <fct>    <fct>    <fct>    <fct>  
#> 1 Individual 1 C        C        G        G       
#> 2 Individual 2 C        C        T        T
```

Created on 2020-07-26 by the reprex package (v0.3.0)

Eric
  • 2,699
  • 5
  • 17