0

I have survey data for a group of species that are in a format similar to this:

## Source data     Latitude    Longitude    Date      Trap_type   Hab_type  Sp1  Sp2  Sp3  Sp4  Sp5
#      source        lat1         long1     ??/??/??     Trap_1    Hab_1     Y     N    Y    N    N  
#      source        lat2         long2     ??/??/??     Trap_1    Hab_1     N     N    N    Y    N  

But I would like to have long format data more like:

## Source data     Latitude    Longitude     Date            Trap_type   Hab_type  Species Present
#      source      lat1        long1         ??/??/??        Trap_1      Hab_1      Sp1       Y           
#      source      lat1        long1         ??/??/??        Trap_1      Hab_1      Sp2       N  
#      source      lat1        long1         ??/??/??        Trap_1      Hab_1      Sp3       Y  
#      source      lat1        long1         ??/??/??        Trap_1      Hab_1      Sp4       N  
#      source      lat1        long1         ??/??/??        Trap_1      Hab_1      Sp5       N
#      source      lat2        long2         ??/??/??        Trap_1      Hab_1      Sp1       N           
#      source      lat2        long2         ??/??/??        Trap_1      Hab_1      Sp2       N  
#      source      lat2        long2         ??/??/??        Trap_1      Hab_1      Sp3       N  
#      source      lat2        long2         ??/??/??        Trap_1      Hab_1      Sp4       Y  
#      source      lat2        long2         ??/??/??        Trap_1      Hab_1      Sp5       N

The actual data covers ~ 14,000 collection points and ~25 species so I need a more automated fix than converting by hand - I feel like this should be simple but I can't work out how to do it. I've been trying to use an excel pivot table without much success and was hoping for a solution in either R or Excel in order to reformat this data if anybody is able to help! Thank you.

westpier
  • 45
  • 3

1 Answers1

0

I think you are looking for tidyr::pivot_longer, with the columns pivoted if their name contains Sp:

library(tidyr)
library(dplyr)

df %>% pivot_longer(cols = contains("Sp"))

#> # A tibble: 10 x 8
#>    `Source data` Latitude Longitude Date       Trap_type Hab_type name  value
#>    <chr>         <chr>    <chr>     <chr>      <chr>     <chr>    <chr> <chr>
#>  1 source        lat1     long1     2020-07-01 Trap_1    Hab_1    Sp1   Y    
#>  2 source        lat1     long1     2020-07-01 Trap_1    Hab_1    Sp2   N    
#>  3 source        lat1     long1     2020-07-01 Trap_1    Hab_1    Sp3   Y    
#>  4 source        lat1     long1     2020-07-01 Trap_1    Hab_1    Sp4   N    
#>  5 source        lat1     long1     2020-07-01 Trap_1    Hab_1    Sp5   N    
#>  6 source        lat2     long2     2020-6-01  Trap_1    Hab_1    Sp1   N    
#>  7 source        lat2     long2     2020-6-01  Trap_1    Hab_1    Sp2   N    
#>  8 source        lat2     long2     2020-6-01  Trap_1    Hab_1    Sp3   N    
#>  9 source        lat2     long2     2020-6-01  Trap_1    Hab_1    Sp4   Y    
#> 10 source        lat2     long2     2020-6-01  Trap_1    Hab_1    Sp5   N

Data

df <- structure(list(`Source data` = c("source", "source"), Latitude = c("lat1", 
"lat2"), Longitude = c("long1", "long2"), Date = c("2020-07-01", 
 "2020-6-01"), Trap_type = c("Trap_1", "Trap_1"), Hab_type = c("Hab_1", 
  "Hab_1"), Sp1 = c("Y", "N"), Sp2 = c("N", "N"), Sp3 = c("Y", 
  "N"), Sp4 = c("N", "Y"), Sp5 = c("N", "N")), class = "data.frame",
row.names = c(NA,  -2L))

Created on 2020-07-15 by the reprex package (v0.3.0)

Allan Cameron
  • 147,086
  • 7
  • 49
  • 87