0

I have a data frame of live trapping. There are two systems, each with different status. The week and the number of individuals in each status is given.

df <- data.frame (sys = rep(c("a","b"), each=3), 
              sta = rep(c("L","T","P"), times=2), 
              w01 = c("0","2","5","2","2","5"),
              w02 = c("3","2","2","4","6","1"),
              w03 = c("4","1","0","5","3","0"))
df
sys  sta  w01  w02  w03
a    L    0    3    4
a    T    2    2    1
a    P    5    2    0
b    L    2    4    5
b    T    2    6    3
b    P    5    1    0

I want a data frame showing for each number the system and status. The Result would look like this:

sys sta Num
a   L   0
a   L   3
a   L   4
a   T   2
a   T   2
a   T   1
a   P   5
a   P   2
a   P   0
b   L   2
b   L   4
b   L   5
b   T   2
b   T   6
b   T   3
b   P   5
b   P   1
b   P   0

I have absolutely no idea how doing this. I found this two questions/answers Reorganise 2x36 dataframe to a 6x6 dataframe. Dice throw visualisation Reshaping data.frame from wide to long format but cannot find a way to fit it for my problem. Also i tried reshape but direction = "wide" does not give the output I need and I am sucked with the varying part of direction = "long". Here is what I tried, but did not work...

a <- as.vector(colnames(df[-c(1,2)]))
df2 <- reshape(df, idvar = "sys", timevar = "sta", varying = a , direction = "long")

Any suggestions how I can solve this problem? Thank you all and kind regards!

P.S.: I need the "new format" of my dataset for a friedman test (friedman.test). Will it work out in this way?

Konrad Rudolph
  • 530,221
  • 131
  • 937
  • 1,214
Leips
  • 3
  • 1
  • https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format is exactly what you need, have you tried the other solutions there? –  Mar 29 '17 at 15:00
  • 1
    `df2 <- reshape(df, idvar = c("sys", "sta"), varying = c('w01', 'w02', 'w03'), direction = "long", sep = '', new.row.names = 1:18)` gets you what you need. `reshape` is horrible, though; try to learn the `reshape2` or `tidyr` packages. –  Mar 29 '17 at 15:07
  • Thank you for finding a solution without additional packages! If I need reorganizations like this more often, I will learn to use this packages. But if I just need it once, I am always happier with a solution, that does not require extras. – Leips Mar 29 '17 at 15:22

3 Answers3

0

With tidyr and dplyr:

library(tidyr)
library(dplyr)

df %>% 
    gather(key, value, -sys, -sta) %>% 
    select(-key) %>% 
    arrange(sys, sta)

#>    sys sta value
#> 1    a   L     0
#> 2    a   L     3
#> 3    a   L     4
#> 4    a   P     5
#> 5    a   P     2
#> 6    a   P     0
#> 7    a   T     2
#> 8    a   T     2
#> 9    a   T     1
#> 10   b   L     2
#> 11   b   L     4
#> 12   b   L     5
#> 13   b   P     5
#> 14   b   P     1
#> 15   b   P     0
#> 16   b   T     2
#> 17   b   T     6
#> 18   b   T     3
GGamba
  • 13,140
  • 3
  • 38
  • 47
0

You can use melt from reshape2

melt(df, id.vars = c("sys","sta"), value.name = "num")

#     sys sta variable num
# 1    a   L      w01   0
# 2    a   T      w01   2
# 3    a   P      w01   5
# 4    b   L      w01   2
# 5    b   T      w01   2
# 6    b   P      w01   5
# 7    a   L      w02   3
# 8    a   T      w02   2
# 9    a   P      w02   2
# 10   b   L      w02   4
# 11   b   T      w02   6
# 12   b   P      w02   1
# 13   a   L      w03   4
# 14   a   T      w03   1
# 15   a   P      w03   0
# 16   b   L      w03   5
# 17   b   T      w03   3
# 18   b   P      w03   0

If you want to remove the variable column use

melt(df, id.vars = c("sys","sta"), value.name = "num")[,c(1,2,4)]

Fadwa
  • 1,717
  • 5
  • 26
  • 43
0

Here is a tidyverse approach:

It is part of the tidyr package, included in the tidyverse

library(tidyverse)
out <- gather(df, key = 'week', value = 'number', -sys, -sta)

The output:

   sys sta week number
1    a   L  w01      0
2    a   T  w01      2
3    a   P  w01      5
4    b   L  w01      2
5    b   T  w01      2
6    b   P  w01      5
7    a   L  w02      3
8    a   T  w02      2
9    a   P  w02      2
10   b   L  w02      4
11   b   T  w02      6
12   b   P  w02      1
13   a   L  w03      4
14   a   T  w03      1
15   a   P  w03      0
16   b   L  w03      5
17   b   T  w03      3
18   b   P  w03      0

How it works:

  • gather takes all columns that are not excluded. Exclusion is handled by passing on select statements to the ... argument. In this case, I am excluding sys and sta. As a result, the only column we are gathering our values from is all_w.
  • Furthermore we give gather a name for the key and a name for the value field. The key field will contain the column name that each element (row) of the output was taken from. The value will have the content of the row in that column.
  • I would recommend to keep week as you might later on want to know the exposure time that generated one of your numbers. If not, you can drop it with df %>% select(-week).
  • You can undo this transformation doing spread(out, key = week, value = number). Voila, your old df!
JanLauGe
  • 2,297
  • 2
  • 16
  • 40
  • Thank you for your detailed explanation! This realy helps. I often fail in adapting code, because I have no idea what it is doing. Thank you so much! – Leips Mar 29 '17 at 15:18
  • No worries, my pleasure! I know `gather` and spread can be confusing at first. Just play around with it a little bit to get comfortable. Same for the other `tidyr` functions. And really do use the `tidyverse`, it's the best thing that has happened in R in a decade! You can get started here: https://cran.r-project.org/web/packages/dplyr/vignettes/introduction.html – JanLauGe Mar 29 '17 at 15:21